
Alpine Python Docker Base image problem with gcc

An important practice with process-based containers, or any container really, is to keep the image slim and ensure that only what is necessary is packaged into it.

For that reason I went with the python:3.8-alpine base image. After all was said and done, the resulting image was 208MB.
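For reference, a quick way to check the resulting size yourself (the image tag here matches the one used later in this post):

docker build -t company/project .
docker images company/project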

No GCC in that Base Image

However, I then needed another Python package, and this package needed gcc to build, as shown by this error message during the build:


    running build_ext
    building 'Cryptodome.Hash._MD2' extension
    creating build/temp.linux-x86_64-3.8
    creating build/temp.linux-x86_64-3.8/src
    gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC -DPYCRYPTO_LITTLE_ENDIAN -DSYS_BITS=64 -DLTC_NO_ASM -Isrc/ -I/usr/local/include/python3.8 -c src/MD2.c -o build/temp.linux-x86_64-3.8/src/MD2.o
    unable to execute 'gcc': No such file or directory
    error: command 'gcc' failed with exit status 1

The Wasteful Solution

There is an easy way to fix this problem: use a base image that has gcc preinstalled. I used python:3.8 and it just worked.
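As a sketch, the only change needed was the base image line at the top of the Dockerfile:

# Debian-based image that already ships with gcc and friends
FROM python:3.8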

However, that came at a price: the image was now 1.12GB in size.

So I looked around, and it seemed fine to just install gcc with apk instead.


The Ideal Solution

I reverted back to python:3.8-alpine and installed gcc and my dependencies in a single RUN instruction, using apk's --virtual flag so the build dependencies can be removed again in the same layer:


# install the build deps under a virtual package name, install the Python
# requirements, then remove the build deps again in the same layer
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && pip install --no-cache-dir -r /code/requirements.txt \
    && apk del .build-deps

Now the image built correctly, and the size was 301MB.


This might not even be the ideal solution, though, as multi-stage builds have been suggested: one Docker image just to build the project, and a separate image just to run it.
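A minimal sketch of what that could look like, assuming the same requirements.txt as before: a builder stage that has gcc and produces wheels, and a runtime stage that only installs them.

# build stage: has the compiler toolchain, produces wheels
FROM python:3.8-alpine AS builder
RUN apk add --no-cache gcc musl-dev
COPY requirements.txt /tmp/
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r /tmp/requirements.txt

# runtime stage: no gcc, just the prebuilt wheels
FROM python:3.8-alpine
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels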

Batteries Included

I think it comes down to batteries included or not.

I'm also not a fan of having too many commands in your Dockerfile; it feels too sysadminy.

But use your discretion - horses for courses.

Containerising your Django Application into Docker and eventually Kubernetes

The shift to containers is happening, in some places faster than others...

People underestimate the complexity and all the parts involved in making your application work.

The Django Example

In the case of Django, we would traditionally have deployed it on a single server running:

  • a web server (nginx)
  • a Python WSGI (Web Server Gateway Interface) server (gunicorn or uWSGI)
  • a database (SQLite, MySQL or Postgres)
  • sendmail
  • maybe some other stuff: redis for caching and user sessions

So the server becomes a snowflake very quickly, as it needs to do multiple things and must be configured to communicate with multiple other things.

It violates the single responsibility principle.

But that is the way we understood it. Bringing in Docker requires a bit of a mind shift.

The key principle is:

Be stateless, kill your servers almost every day

Taken from Node Best Practices

So what does that mean for our Django application?

Well, we have to think differently. Now for each process we are running we need to decide if it is stateless or stateful.

If it is stateful (not ephemeral) then it should be set aside and run in a traditional manner (or run by a cloud provider). In our case the stateful part is luckily only the database. When I say stateful, I mean the state needs to persist...forever. User sessions, cache and emails do need to work, but only need to persist for shorter periods - it won't be a total disaster if they fail. Users will just need to reauthenticate.
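For example, here is a sketch of what that looks like in settings.py, with the connection details coming from the environment (the variable names here are assumptions):

import os

# the stateful database lives outside the containers and persists forever;
# the containers only carry the configuration needed to reach it
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': os.environ.get('DB_NAME', 'project'),
        'USER': os.environ.get('DB_USER', 'project'),
        'PASSWORD': os.environ.get('DB_PASSWORD', ''),
        'HOST': os.environ.get('DB_HOST', 'db.example.com'),
        'PORT': os.environ.get('DB_PORT', '5432'),
    }
}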

So the other parts, which can all run in containers, are:

  • Nginx
  • Gunicorn
  • Sendmail

For simplicity's sake I'm going to gloss over redis for caching and user sessions. I'm also not that keen to include sendmail, because it introduces more complexity and another component - namely message queues.

Let's Start Containerising our Django Application

Alright, so I'm assuming that you know Python and Django pretty well and have at least deployed a Django app to production (the traditional way).

So we have all the code; we just need to get it running in a container - locally.

A good resource to use is ruddra's docker-django repo. You can use some of his Dockerfile examples.

First, install Docker Engine.
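If you don't have it yet, one quick route is Docker's convenience script (inspect the script before running it):

curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh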

Let's get it running in Docker using just a Dockerfile. Create a file called Dockerfile in the root of the project.


# pull official base image - set the exact version of python
FROM python:3.8.0

LABEL maintainer="Your Name <your@email.com>"

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# Install dependencies
RUN pip install --no-cache-dir -U pip

# Set the user to run the project as, do not run as root
RUN useradd --create-home code
WORKDIR /home/code
USER code

COPY path/to/requirements.txt /tmp/
RUN pip install --user --no-cache-dir -r /tmp/requirements.txt

# Copy Project
COPY . /home/code/

# EXPOSE is documentation from the image author to whoever runs the container
EXPOSE 8000

CMD python manage.py runserver 0.0.0.0:8000
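One more thing worth adding: since COPY . pulls the entire build context into the image, a .dockerignore file in the project root helps keep the image slim (a sketch; adjust the entries to your project):

# .dockerignore
.git
__pycache__/
*.pyc
.env
db.sqlite3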

A reference on the Dockerfile commands

Remember to update the project settings (settings.py) so that:

ALLOWED_HOSTS = ['127.0.0.1', '0.0.0.0']

Now let us build the image and run it:


docker build . -t company/project
docker run -it -p 8000:8000 --name project company/project

Now everything should just work! Go to: http://0.0.0.0:8000
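Or, as a quick sanity check from another terminal:

# expect an HTTP response from the Django dev server
curl -I http://127.0.0.1:8000/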


Pros and Cons of OKD and Portainer (with Kubernetes and Docker Swarm Mode Underlying)

If you are about to choose a container orchestrator, scheduler or platform, you have a tough decision ahead. It is difficult to understand the nuances, features and limitations of each platform without having tried them out for a while. I am focusing on running your own cluster - i.e. not using a public cloud provider like Amazon EKS or Azure Container Service.

So in this short post, I'm going to tell you the main issues and features to look out for.

Openshift OKD

Kubernetes optimized for continuous application development and multi-tenant deployment.

Issues / Cons

  • Bad developer experience, as containers don't just work: they cannot run as root and instead run as a random user, while some containers can only run as root (see the sketch after this list)
  • Your containers (sometimes) need to be specifically built for Openshift
  • No support for docker-compose, as Kubernetes has its own way of doing things - you have to convert your docker-compose files (e.g. with a tool like Kompose)
  • Frontend can be a bit tricky
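A common workaround, taken from OpenShift's image guidelines, is to build images that tolerate being run as an arbitrary non-root user id, along these lines (the directory matches the Dockerfile from the previous post):

# let an arbitrary uid in the root group read and write the app directory
RUN chgrp -R 0 /home/code && chmod -R g=u /home/code
USER 1001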

Features / Pros

  • Like most Red Hat things - great docs and a well designed, stable product
  • Good defaults - it sets up a Docker registry for you, itself running as a container on Kubernetes
  • Everything is managed within OKD - routes are set up automatically, so there is no need to open ports and point stuff all around the place
  • More secure and less prone to vulnerabilities, as containers never run as the root user
  • Frontend can be nice and simple

Portainer

Making Docker management easy - it uses Docker Swarm, not Kubernetes.

Issues / Cons

  • No routes (via domain names) - you have to manually configure and manage ports and routing to the containers yourself (see the sketch below)
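For example, publishing a service port by hand with Docker Swarm (the image and port are just placeholders):

# no automatic hostname-based routing; you map the port yourself
docker service create --name project --publish 8000:8000 company/project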

Features / Pros

  • Docker developer friendly - containers just work
  • Containers run as root