Category: Docker

Docker: Saving 361MB on an Image with .dockerignore

Initially I created a Dockerfile to run my Django app. I chose the Python Alpine image to save on image size.

There were a few issues with that, but they are fixed in the Dockerfile below:

FROM python:3.8-alpine

RUN mkdir -p /code/requirements
WORKDIR /code

RUN pip install --upgrade pip --no-cache-dir

# Installing requirements.txt from project
COPY ./requirements/*.txt /code/requirements/.
RUN apk add --no-cache --virtual .build-deps gcc libffi-dev openssl-dev musl-dev mariadb-dev \
    && pip install --no-cache-dir -r /code/requirements/production.txt \
    && apk del .build-deps gcc libffi-dev musl-dev openssl-dev mariadb-dev

COPY . /code/

# Give access to non root user
RUN rm -rf /code/.git* && \
    chown -R 1001 /code && \
    chgrp -R 0 /code && \
    chmod -R g+w /code

USER 1001

EXPOSE 8000

CMD ["sh", "-c", "python manage.py collectstatic --no-input; python manage.py migrate; python manage.py runserver 0.0.0.0:8000"]

The image size is a whopping 536MB

    REPOSITORY                                                                   TAG                 IMAGE ID            CREATED             SIZE
    trademate-app                                                                latest              c68e2e23464f        40 seconds ago      536MB

After checking the history:

$ docker history trademate-app 
IMAGE               CREATED              CREATED BY                                      SIZE                COMMENT
6f2ed6c9d629        About a minute ago   /bin/sh -c #(nop)  CMD ["sh" "-c" "python ma…   0B                  
da5c27376179        About a minute ago   /bin/sh -c #(nop)  EXPOSE 8000                  0B                  
49e374f1bd68        About a minute ago   /bin/sh -c #(nop)  USER 1001                    0B                  
f705cd2c2415        About a minute ago   /bin/sh -c rm -rf /code/.git* &&     chown -…   189MB               
d48349f175fa        2 minutes ago        /bin/sh -c #(nop) COPY dir:47dc9677cabb09e63…   193MB               
848177bf54b2        2 minutes ago        /bin/sh -c apk add --no-cache --virtual .bui…   37.9MB              
b5ab3803fe68        8 minutes ago        /bin/sh -c #(nop) COPY file:7b8d1a2c6c47119d…   215B                
ffe7a4e57a38        8 minutes ago        /bin/sh -c pip install --upgrade pip --no-ca…   5.03MB              
d7e45c46def5        8 minutes ago        /bin/sh -c #(nop) WORKDIR /code                 0B                  
8278e34b58a3        8 minutes ago        /bin/sh -c mkdir -p /code/requirements          0B                  
204216b3821e        2 months ago         /bin/sh -c #(nop)  CMD ["python3"]              0B                  
<missing>           2 months ago         /bin/sh -c set -ex;   wget -O get-pip.py "$P…   6.24MB              
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_SHA256…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_URL=ht…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_PIP_VERSION=19…   0B                  
<missing>           2 months ago         /bin/sh -c cd /usr/local/bin  && ln -s idle3…   32B                 
<missing>           2 months ago         /bin/sh -c set -ex  && apk add --no-cache --…   98.6MB              
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PYTHON_VERSION=3.8.0     0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV GPG_KEY=E3FF2839C048B…   0B                  
<missing>           2 months ago         /bin/sh -c apk add --no-cache ca-certificates   551kB               
<missing>           2 months ago         /bin/sh -c #(nop)  ENV LANG=C.UTF-8             0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  ENV PATH=/usr/local/bin:/…   0B                  
<missing>           2 months ago         /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B                  
<missing>           2 months ago         /bin/sh -c #(nop) ADD file:fe1f09249227e2da2…   5.55MB

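So you can see that the .git and env folders were being copied into the image (the 193MB COPY layer), and the follow-up RUN layer added another 189MB. Deleting files in a later layer does not shrink the image, because the earlier COPY layer still contains them, and the recursive chown writes a second copy of every file into a new layer.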
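To put numbers on that, here is a small sketch that adds up the layer sizes from the docker history output above and works out how much of the image comes from the COPY layer and the rm/chown layer. The sizes are copied by hand from that output; the dictionary keys are just labels I made up for readability.

```python
# Layer sizes in MB, copied by hand from the `docker history` output above.
# The 215B and 32B layers are converted to MB; 0B layers are left out.
LAYERS_MB = {
    "RUN rm -rf /code/.git* && chown ...": 189,
    "COPY . /code/": 193,
    "RUN apk add ... && pip install ...": 37.9,
    "COPY requirements": 0.000215,
    "RUN pip install --upgrade pip": 5.03,
    "base: get-pip": 6.24,
    "base: symlinks": 0.000032,
    "base: build python": 98.6,
    "base: ca-certificates": 0.551,
    "base: alpine rootfs": 5.55,
}


def layer_share(layers, names):
    """Return the fraction of the total image size taken by the named layers."""
    total = sum(layers.values())
    return sum(layers[name] for name in names) / total


total_mb = sum(LAYERS_MB.values())
suspects = ["RUN rm -rf /code/.git* && chown ...", "COPY . /code/"]
share = layer_share(LAYERS_MB, suspects)

print(f"total: {total_mb:.0f}MB")
print(f"COPY + rm/chown layers: {share:.0%} of the image")
```

The total should land close to the 536MB that docker images reports, with the two suspect layers making up well over half of it.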

So I added a .dockerignore file containing:

.git
.cache
.geckodriver.log
.vscode
env/

and removed the rm -rf /code/.git* command from the Dockerfile.
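Before building, it can help to see how much each ignored entry weighs on disk. The script below is a rough sketch: it treats every .dockerignore entry as a literal path relative to the project root and does not implement the real .dockerignore pattern rules (wildcards, ! exceptions). The function names path_size and ignored_sizes are my own.

```python
import os
from pathlib import Path
from typing import Dict, Iterable


def path_size(path: Path) -> int:
    """Return the total size in bytes of a file or a directory tree."""
    if path.is_file():
        return path.stat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so link targets are not counted
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def ignored_sizes(project_root: str, entries: Iterable[str]) -> Dict[str, int]:
    """Map each literal .dockerignore entry to its size on disk (0 if absent)."""
    root = Path(project_root)
    sizes = {}
    for entry in entries:
        target = root / entry.rstrip("/")
        sizes[entry] = path_size(target) if target.exists() else 0
    return sizes


if __name__ == "__main__":
    # Entries taken from the .dockerignore file above
    entries = [".git", ".cache", ".geckodriver.log", ".vscode", "env/"]
    for entry, size in ignored_sizes(".", entries).items():
        print(f"{entry:20} {size / 1_000_000:8.1f} MB")
```

Run it from the project root. Large entries here are the ones that were inflating the COPY layer.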

The image is now 175 MB

REPOSITORY                                                                   TAG                 IMAGE ID            CREATED             SIZE
trademate-app                                                                latest              0040fef600ad        8 minutes ago       175MB

Saving a whopping 361MB

Any other improvements would be welcomed...send me an email.

Containerising your Django Application into Docker and eventually Kubernetes

The shift to containers is happening, in some places faster than others...

People underestimate the complexity and all the parts involved in making your application work.

The Django Example

In the case of Django, we would traditionally have deployed it on a single server running:

  • a Webserver (nginx)
  • a python wsgi - web server gateway interface (gunicorn or uwsgi)
  • a Database (SQLite, MySQL or Postgres)
  • Sendmail
  • Maybe some other stuff: redis for cache and user session

So the server would become a snowflake very quickly as it needs to do multiple things and must be configured to communicate with multiple things.

It violates the single responsibility principle.

But, we did understand it that way. Now there is a bit of a mind shift when docker is brought in.

The key principle is:

Be stateless, kill your servers almost every day

Taken from Node Best Practices

So what does that mean for our Django Application?

Well, we have to think differently. Now for each process we are running we need to decide if it is stateless or stateful.

If it is stateful (not ephemeral) then it should be set aside and run in a traditional manner (or run by a cloud provider). In our case the stateful part is luckily only the database. When I say stateful I mean the state needs to persist...forever. User sessions, cache and emails do need to work and persist for shorter time periods, but it won't be a total disaster if they fail. Users will just need to reauth.

So all the other parts that can all run on containers are:

  • Nginx
  • Gunicorn
  • Sendmail

For simplicity's sake I'm going to gloss over redis as cache and user session. I'm also not that keen to include sendmail because it introduces more complexity and another component - namely message queues.

Let's start Containerising our Django Application

Alright so I'm assuming that you know python and django pretty well and have at least deployed a django app into production (the traditional way).

So we have all the code, we just need to get it running in a container - locally.

A good resource to use is ruddra's docker-django repo. You can use some of his Dockerfile examples.

First, install Docker Engine.

Let's get it running in docker using just a Dockerfile. Create a file called Dockerfile in the root of the project.


# pull official base image - set the exact version of python
FROM python:3.8.0

LABEL maintainer="Your Name <your@email.com>"

# set environment variables: no .pyc files, unbuffered stdout/stderr so logs show up immediately
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Install dependencies
RUN pip install --no-cache-dir -U pip

# Set the user to run the project as, do not run as root
RUN useradd --create-home code
WORKDIR /home/code
USER code

COPY path/to/requirements.txt /tmp/
RUN pip install --user --no-cache-dir -r /tmp/requirements.txt

# Copy Project, owned by the non-root user so the app can write to it
COPY --chown=code:code . /home/code/

# Documentation from person who built the image to person running the container
EXPOSE 8000

# Exec form, so python receives signals directly instead of via a shell
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

I found a cool thing that can audit your Dockerfile - https://www.fromlatest.io

A reference on the Dockerfile commands

Remember to update the settings of the project so that:

ALLOWED_HOSTS = ['127.0.0.1', '0.0.0.0']
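Hard-coding hosts means rebuilding the image for every environment. One option is to read them from an environment variable in settings.py. This is only a sketch: DJANGO_ALLOWED_HOSTS is a name I picked, not something Django defines, and parse_allowed_hosts is a hypothetical helper.

```python
import os


def parse_allowed_hosts(raw):
    """Split a comma-separated host list, dropping blanks and whitespace."""
    return [host.strip() for host in raw.split(",") if host.strip()]


# DJANGO_ALLOWED_HOSTS is a hypothetical variable name, not a Django built-in.
# It falls back to the two local addresses used above when unset.
ALLOWED_HOSTS = parse_allowed_hosts(
    os.environ.get("DJANGO_ALLOWED_HOSTS", "127.0.0.1,0.0.0.0")
)
```

The value can then be set when starting the container, for example with docker run -e DJANGO_ALLOWED_HOSTS=example.com, without touching the image.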

Now let us build the image and run it:


docker build -t company/project .
docker run -it -p 8000:8000 --name project company/project

Now everything should just work! Go to: http://0.0.0.0:8000

 

Pros and Cons of OKD, Portainer and Kubernetes and Docker Swarm Mode Underlying

If you are about to choose a container orchestrator, scheduler or platform, you have a tough decision ahead. It is difficult to understand the nuances, features and limitations of each platform without having tried them out for a while. I am focusing on running your own cluster - i.e. not using a public cloud provider like Amazon EKS or Azure Container Service.

So in this short post, I'm going to tell you the main issues and features to look out for.

Openshift OKD

Kubernetes optimized for continuous application development and multi-tenant deployment.

Issues / Cons

  • Poor developer experience as containers don't just work - OKD runs them as a random non-root user, and some containers can only run as root.
  • Your containers need to be specifically built for Openshift (sometimes)
  • No support for docker-compose as Kubernetes has its own way of doing things - you have to convert your docker compose files.
  • Frontend can be a bit tricky

Features / Pros

  • Like most Red Hat things - great docs and a well designed, stable product
  • Good default - sets up a docker registry for you - using kubernetes in a container
  • Everything is managed within OKD - routes are automatically set up - no need to open ports and point stuff all around the place
  • More secure and less prone to vulnerabilities as containers will never run as root user
  • Frontend can be nice and simple

Portainer

Making Docker Management Easy - it uses Docker Swarm - no kubernetes.

Issues / Cons

  • No routes (via domain names) - you have to manually configure and manage ports and routes to the containers

Features / Pros

  • Docker developer friendly - containers just work
  • Containers run as root