Category: Containerisation

Kubernetes Questions – Please answer them

What is the Difference between a Persistent Volume and a Storage Class?
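A rough way to see the difference (resource names below are made up): a PersistentVolume is an actual piece of storage that exists in the cluster, while a StorageClass is a template that tells a provisioner how to create PersistentVolumes on demand when a PersistentVolumeClaim asks for one.

```yaml
# A StorageClass describes HOW volumes get provisioned (dynamic provisioning)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                 # hypothetical name
provisioner: kubernetes.io/no-provisioner   # or a cloud provisioner
reclaimPolicy: Delete
---
# A PersistentVolume IS a concrete piece of storage (static provisioning)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data              # node-local path, for demo only
---
# A PVC requests storage; naming a storageClassName gets one provisioned dynamically
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  storageClassName: fast-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```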

What happens when pods are killed, is the data persisted - How do you test this?
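One way to test it (a sketch; the deployment name, label and mount path are assumptions): write a marker file onto the PVC-backed mount, delete the pod, and check the file is still there on the replacement pod.

```shell
# Assumes a Deployment "web" mounting a PersistentVolumeClaim at /data
kubectl exec deploy/web -- sh -c 'echo survived > /data/marker'

# Kill the pod; the Deployment creates a replacement
kubectl delete pod -l app=web

# Wait for the new pod, then check the file survived
kubectl rollout status deploy/web
kubectl exec deploy/web -- cat /data/marker
```

Note that data written to the container filesystem or an emptyDir would not survive this test; only data on a PersistentVolume does.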

What is the difference between a Service and an Ingress?

By default, Docker uses host-private networking, so containers can talk to other containers only if they are on the same machine.

If you check the pod IP and the container has an open containerPort, you should be able to access it from the node with curl.

What happens when a node dies? The pods die with it, and the Deployment will create new ones, with different IPs. This is the problem a Service solves.

A Kubernetes Service is an abstraction which defines a logical set of Pods running somewhere in your cluster, that all provide the same functionality

When created, each Service is assigned a unique IP address (also called clusterIP)

This address is tied to the lifespan of the Service, and will not change while the Service is alive

communication to the Service will be automatically load-balanced

  • targetPort: is the port the container accepts traffic on
  • port: is the abstracted Service port, which can be any port other pods use to access the Service
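As a sketch (names made up), those two fields in a Service manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80          # what other pods use: http://my-app:80
      targetPort: 8000  # what the container actually listens on
```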

Note that the Service IP is completely virtual, it never hits the wire

Kubernetes supports 2 primary modes of finding a Service - environment variables and DNS. DNS requires the CoreDNS addon.
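For a Service named my-app in the default namespace (hypothetical names), the two discovery modes look roughly like this:

```shell
# 1. Environment variables - only injected into pods started AFTER the
#    Service existed, e.g. MY_APP_SERVICE_HOST / MY_APP_SERVICE_PORT
kubectl exec deploy/web -- env | grep MY_APP

# 2. DNS (via the CoreDNS addon) - works regardless of creation order
kubectl exec deploy/web -- nslookup my-app.default.svc.cluster.local
```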

Ingress is...

An API object that manages external access to the services in a cluster, typically HTTP
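A minimal Ingress sketch (hostname and service name are made up), which also shows the difference from a Service: an Ingress does host/path-based HTTP routing at the cluster edge, while a Service just gives a stable IP to a set of pods.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app   # the Service routes on to the pods
                port:
                  number: 80
```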

How do you know the size of the PV's to create for the PVC's of a helm chart?

Are helm charts declarative or imperative?

What is a kubernetes operator?

How do you start a new mysql docker container with an existing data directory?

Usage against an existing database: if you start your mysql container instance with a data directory that already contains a database (specifically, a mysql subdirectory), the $MYSQL_ROOT_PASSWORD variable should be omitted from the run command line; it will be ignored in any case, and the pre-existing database will not be changed in any way.
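Per those docs, the run command would be something like this (the volume name is an assumption); note there is no MYSQL_ROOT_PASSWORD:

```shell
# Reuse an existing data directory from a named volume
docker run -d \
  --name trademate-db \
  -v trademate_dbdata:/var/lib/mysql \
  mysql:5.7
```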

The above did not work for me:

trademate-db_1   | Initializing database
trademate-db_1   | 2020-01-16T05:59:38.689547Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
trademate-db_1   | 2020-01-16T05:59:38.690778Z 0 [ERROR] --initialize specified but the data directory has files in it. Aborting.
trademate-db_1   | 2020-01-16T05:59:38.690818Z 0 [ERROR] Aborting
trademate-db_1   | 
trademate_trademate-db_1 exited with code 1

How do you view the contents of a docker volume?

https://docs.docker.com/engine/reference/commandline/volume_inspect/

You can't do this without a container for named volumes (the ones docker manages). So kak... https://stackoverflow.com/questions/34803466/how-to-list-the-content-of-a-named-volume-in-docker-1-9
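The workaround from that Stack Overflow thread is to mount the volume into a throwaway container (the volume name is an assumption):

```shell
# List the contents of a named volume via a temporary container
docker run --rm -v trademate_dbdata:/vol alpine ls -la /vol

# Or browse it interactively
docker run --rm -it -v trademate_dbdata:/vol alpine sh
```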

How do you run python debugger and attach inside a docker container?

It's a mess...just develop locally.
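If you do want to try anyway, a common workaround (a sketch for docker-compose; the service name is an assumption) is to give the service a TTY and keep stdin open, so pdb has somewhere to read from:

```yaml
# docker-compose.yml fragment
services:
  web:
    stdin_open: true   # equivalent of docker run -i
    tty: true          # equivalent of docker run -t
```

Then `docker attach <container>` when the breakpoint fires. Still a worse experience than debugging locally.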

If you were mounting a conf file into an nginx image from docker-compose - how do you do that in production? Do you bake it into the image?

Yes you should.

Do something like this:

FROM nginx:1.17

COPY ./config/nginx/conf.d /etc/nginx/conf.d

# Remove default config
RUN rm /etc/nginx/conf.d/default.conf 

Can you deploy all the k8s spec files in a folder at once? If not, is there a specific order to deploy them in?

The Service should exist before the replicas, as Kubernetes adds environment variables to containers in the ReplicaSet's pods based on the Services that already exist.
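You can point kubectl at the whole folder; it applies the files in filename order, so one common convention (filenames here are made up) is to prefix them with numbers to force the Services to be created first:

```shell
# Apply every manifest in the folder at once
kubectl apply -f ./k8s/

# If ordering matters (Service before Deployment), prefix the files:
#   k8s/00-service.yaml
#   k8s/10-deployment.yaml
```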

Should gunicorn and nginx containers be in the same pod?

Docker: Saving 361MB on an Image File with .dockerignore

Initially I created a Dockerfile to run my django app. I chose python alpine to save on image size.

There were a few issues with that, but those are fixed in the Dockerfile below:

FROM python:3.8-alpine

RUN mkdir -p /code/requirements
WORKDIR /code

RUN pip install --upgrade pip --no-cache-dir

# Installing requirements.txt from project
COPY ./requirements/*.txt /code/requirements/.
RUN apk add --no-cache --virtual .build-deps gcc libffi-dev openssl-dev musl-dev mariadb-dev \
    && pip install --no-cache-dir -r /code/requirements/production.txt \
    && apk del .build-deps gcc libffi-dev musl-dev openssl-dev mariadb-dev

COPY . /code/

# Give access to non root user
RUN rm -rf /code/.git* && \
    chown -R 1001 /code && \
    chgrp -R 0 /code && \
    chmod -R g+w /code

USER 1001

EXPOSE 8000

CMD ["sh", "-c", "python manage.py collectstatic --no-input; python manage.py migrate; python manage.py runserver 0.0.0.0:8000"]

The image size is a whopping 536MB

    REPOSITORY                                                                   TAG                 IMAGE ID            CREATED             SIZE
    trademate-app                                                                latest              c68e2e23464f        40 seconds ago      536MB

After checking the history:

$ docker history trademate-app 
IMAGE               CREATED              CREATED BY                                      SIZE                COMMENT
6f2ed6c9d629        About a minute ago   /bin/sh -c #(nop)  CMD ["sh" "-c" "python ma…   0B                  
da5c27376179        About a minute ago   /bin/sh -c #(nop)  EXPOSE 8000                  0B                  
49e374f1bd68        About a minute ago   /bin/sh -c #(nop)  USER 1001                    0B                  
f705cd2c2415        About a minute ago   /bin/sh -c rm -rf /code/.git* &&     chown -…   189MB               
d48349f175fa        2 minutes ago        /bin/sh -c #(nop) COPY dir:47dc9677cabb09e63…   193MB               
848177bf54b2        2 minutes ago        /bin/sh -c apk add --no-cache --virtual .bui…   37.9MB              
b5ab3803fe68        8 minutes ago        /bin/sh -c #(nop) COPY file:7b8d1a2c6c47119d…   215B                
ffe7a4e57a38        8 minutes ago        /bin/sh -c pip install --upgrade pip --no-ca…   5.03MB              
d7e45c46def5        8 minutes ago        /bin/sh -c #(nop) WORKDIR /code                 0B                  
8278e34b58a3        8 minutes ago        /bin/sh -c mkdir -p /code/requirements          0B                  
204216b3821e        2 months ago         /bin/sh -c #(nop)  CMD ["python3"]              0B                  
<missing>           2 months ago         /bin/sh -c set -ex;   wget -O get-pip.py "$P…   6.24MB              
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_SHA256…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_URL=ht…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_PIP_VERSION=19…   0B                  
<missing>           2 months ago         /bin/sh -c cd /usr/local/bin  && ln -s idle3…   32B                 
<missing>           2 months ago         /bin/sh -c set -ex  && apk add --no-cache --…   98.6MB              
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_VERSION=3.8.0     0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV GPG_KEY=E3FF2839C048B…   0B                  
<missing>           2 months ago         /bin/sh -c apk add --no-cache ca-certificates   551kB               
<missing>           2 months ago         /bin/sh -c #(nop)  ENV LANG=C.UTF-8             0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PATH=/usr/local/bin:/…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B                  
<missing>           2 months ago         /bin/sh -c #(nop) ADD file:fe1f09249227e2da2…   5.55MB

So you can see the COPY of the .git and env folders was screwing me, and the rm/chown layer added another 189MB on top (removing or changing files in a later layer doesn't shrink the earlier layers).

So I added a .dockerignore file containing:

.git
.cache
.geckodriver.log
.vscode
env/

and removed the rm -rf /code/.git* command.

The image is now 175MB:

REPOSITORY                                                                   TAG                 IMAGE ID            CREATED             SIZE
trademate-app                                                                latest              0040fef600ad        8 minutes ago       175MB

Saving a whopping 361MB

Any other improvements would be welcomed...send me an email.

Modernising Applications – my opinion

Over the past year I've taken a step back and evaluated how we build and deploy web applications.
When we talk about modernising web applications people immediately jump to the cloud-native catch phrase.

It ramps up pretty quickly, spiralling into Kubernetes, Docker, PaaS, Ingress, load balancing, service meshes and site reliability engineering. For most people there is very little meaning and value in these terms, just hype and complexity.

For me it comes down to a few things:

  • minimising costs - utilising resources more effectively - better isolation
  • authenticating and authorizing easier and in a more future-proof way
  • making our web apps more reliable
  • making our web apps easier to build and maintain
  • making our web apps easier to deploy
  • are our lives easier?
  • are our outcomes achieved faster?

I am not so sure that the cloud native approach, and the microservices approach that is part and parcel of it, has achieved any of the above points.

Minimising Costs, Efficient resource usage and Isolation

I like the ability to run multiple applications on a cluster or node and have the scheduling of this done for you.
I also like that everything is isolated...you don't need to deploy multiple applications on a single VM and then use virtual hosts or server blocks.

That also adds to the reliability and deployment part.

Although learning about containers and Kubernetes, deploying your own cluster and managing it, requires many hours of learning (like 500+ hours).
It also needs many hours of application. The subject matter also moves fast and deprecates things, so there is no single way of doing something or kata that you can follow.

Also, at the end of the day your Kubernetes cluster runs on nodes, and the nodes are most likely VMs (unless hidden from you by your cloud provider).
So you are still on computers, and computers cost money...utilisation may also remain low based on your app usage.

Most applications never need to scale and can therefore avoid the complexity.

[Image: stackoverflow-load-balancer-centos]

I doubt that any website in South Africa gets more than 300M hits a day.
So a lot of this shit is overkill.

Authenticating and authorizing easier

This isn't part of the cloud native foundation's scope and there isn't much clarity here.
There isn't a clear winner, but this is a very important thing to get right, especially when organising a suite of applications or microservices.

You should delegate or outsource authentication and authorisation from your application.
It should be done at one single (HA) source.

Certainly LDAP is much better than authenticating with your specific framework's authentication method, as there is a single source.
Even better than that is OAuth, and when you add an identity layer to OAuth you get OpenID Connect. You delegate authentication in your application to something like Keycloak and let it handle all authentication and authorisation related stuff.

You are not a hero...let someone else do this hard and ubiquitous job for you.

Searching online it looks like istio and linkerd (the service mesh frontrunners) are trying to muscle in on auth for your services.

Making our web apps more reliable

From my experience so far, using k8s to deploy a project initially and scale it up and down hasn't added to reliability.
It has taken away from it.

On one of my django projects the vm has been running for a very long time:

14:39:38 up 610 days, 4:58, 1 user, load average: 0.00, 0.01, 0.05

Availability has been 100%, except for like a second during deploys.

Building your application in a lightweight, performance-focused way without excessive bloat and crap is still the number 1 thing you can do.

Making our web apps easier to build and maintain

I think apps have become harder to build and maintain.

When you build a python based django or flask site you can write a few lines going through their tutorial and be up and running with a website in no time at all.

When doing this with a container it takes longer and you need to make tough choices, like the base image, environment variables, and externalising the db from sqlite.

On top of that it is hard to debug. Standard debuggers like ipdb and the debug toolbar are hard to configure to work in containers...also, depending on the base image, you can't just access the container and run commands.
You need to read the logs and then make changes.
A kak developer experience, and the tooling is not up to scratch.

Also it is a lot to learn (as I mentioned before) and apply. Just getting your application into an image is tough; you then need to deploy that image to a private registry.

Then you need to write the manifest or spec for k8s to access the registry and then create the deployment/replicaset or pods for the app.

It is messed up.

Perhaps in the long run it is easier but doesn't feel like that now.

Making our web apps easier to deploy

It is a total mind shift to deploy containerised apps.
You need a registry and need to set environment variables, so that, per the 12 factor app principles, your test, staging and production images are very similar.

Easier said than done.

I still have a lot to do here, like look at Jenkins X or GitLab for how this is to be done (cause old Jenkins has got a bad rep).

I will wait until I've taken a few more steps to comment on this.

Are our lives easier

Fuck no.

We aren't Google...but apparently we need to be Google, cause our small 10-hits-a-day website is now Google-level priority for site reliability and scale.

We've also had to learn so much stuff for so long and we aren't closer to answering some basic questions.

are our outcomes achieved faster

Don't know, don't think so.