
Help Me Understand Containerisation (Docker) – Part 2

This is part 2; I’d recommend checking out Part 1 of Help Me Understand Containerisation (Docker) first.

We have gone over the development workflow – in a Django context – and it was quite a hefty chunk. Now let us look at continuous integration with GitLab and containers.

The Container CI Workflow

We are going to use GitLab for continuous integration. GitLab is closely aligned with a container-based development workflow.

Q: What is the Jenkins analog in a containerised world?

A: I use gitlab for my ci .. so that part will depend on gitlab ..

A: but containers do make that easy .. cause you just need to build the container .. and then you can do everything inside the container


Let’s dive into container-based CI with GitLab.


For GitLab pipelines you should read the docs on how GitLab CI/CD works and the basic CI/CD workflow.

The New Trend in Continuous Integration

  1. Create an application image.
  2. Run tests against the created image.
  3. Push image to a remote registry.
  4. Deploy to a server from the pushed image.

 

Importantly, if you have a private GitLab instance you will need to enable the container registry; you can do that by following the admin documentation for enabling the container registry.
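For a self-hosted Omnibus install, that boils down to something like the following sketch (the registry hostname is a placeholder – see the linked docs for the TLS and storage details):

# /etc/gitlab/gitlab.rb
registry_external_url 'https://registry.example.com'

# then apply the change
sudo gitlab-ctl reconfigure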

Step 1 is to make a .gitlab-ci.yml file.
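As a starting point, here is a rough sketch of what such a file could look like for the four steps above, using GitLab’s predefined CI_REGISTRY_* variables and docker-in-docker. It pushes the image tagged with the commit SHA so the test job can pull it, and only tags it latest once tests pass; the test command, the postgres service and your runner configuration are assumptions you will need to adjust. The deploy stage is left out because it depends entirely on your infrastructure (see The Container Deployment below).

# sketch of a .gitlab-ci.yml -- build, test against the built image, then release it
stages:
  - build
  - test
  - release

variables:
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  DOCKER_HOST: tcp://docker:2375
  DOCKER_TLS_CERTDIR: ""

build:
  stage: build
  image: docker:stable
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG

test:
  stage: test
  # run the tests inside the image we just built
  image: $IMAGE_TAG
  services:
    - postgres:10.10
  variables:
    DATABASE_HOST: postgres
  script:
    - python manage.py test

release:
  stage: release
  image: docker:stable
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker pull $IMAGE_TAG
    - docker tag $IMAGE_TAG $CI_REGISTRY_IMAGE:latest
    - docker push $CI_REGISTRY_IMAGE:latest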

 

The Container Deployment

Where should your DB and session storage live?

What is the backup strategy?

Let’s do some autoscaling.

The Docker Compose docs for production give some good tips on deploying to production.
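To give a rough idea of what those tips amount to – a minimal sketch only, with a hypothetical registry path and assuming gunicorn is in your requirements – a production compose file drops the source-code bind mount and runserver in favour of the built image and a real application server:

version: '3.7'

services:
  web:
    # use the image pushed by CI rather than bind-mounting source on the server
    image: registry.example.com/myproject/web:latest
    command: gunicorn myproject.wsgi:application --bind 0.0.0.0:8000
    restart: always
    environment:
      - DJANGO_SETTINGS_MODULE=myproject.settings.production
    ports:
      - "8000:8000"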

 

Help me Understand Containerisation (Docker) – Part 1

I was looking at Node.js servers, how they work and ways to scale them. I came across a best practices repo and it said: “Be stateless, kill your Servers almost every day”.

I’ve always been skeptical of docker and containers – granted, I haven’t used them extensively, but I’ve always thought of them as needless abstractions and added complexity that make the developer’s and system administrator’s lives harder.

The tides have pushed me towards it though…so I asked a friend to Help me understand containerisation.

Help Me Understand Containerisation

Going back to the article I was reading, it said: “Many successful products treat servers like a phoenix bird – it dies and is reborn periodically without any damage. This allows adding and removing servers dynamically without any side-effects.”

Some things to avoid with this phoenix strategy are:

  • Saving uploaded files locally on the server
  • Storing authenticated sessions in local file or memory
  • Storing information in a global object

Q: Do you have to store any type of data (e.g. user sessions, cache, uploaded files) within external data stores?

Q: So the container part is only application code running…no persistence?

A: in general i find it best to keep persistence out of the container. There’s things you can do .. but generally it’s not a great idea

Q: Does it / is it supposed to make your life easier?

A: yeah. I don’t do anything without containers these days

Q: Containers and container management are 2 separate topics?

A: yeah.

I think the easiest way to think of a container is like an .exe (that packages the operating system and all requirements). So once a container is built, it can run anywhere that containers run.

Q: Except the db and session persistence is external

A: doesn’t have to be .. but it is another moving part

A: the quickest easiest benefit of containers is to use them for dev. (From a python perspective..) e.g.: I don’t use venv anymore, cause everything is in a container

so .. on dev, I have externalized my db, but you don’t really need to do that

Q: Alright but another argument is the scaling one…so when black friday comes you can simply deploy more containers and have them load balanced. but what is doing the load balancing?

A: yeah .. that’s a little more complicated though .. and you’d be looking at k8s for that. (For loadbalancing) Usually that’s k8s (swarm in my case) though .. that’s going to depend a lot on your setup. E.g.: I just have kong replicated with spread (so it goes on every machine in the swarm)

Q: If you are not using a cloud provider and want this on a cluster of vm’s – is that hard or easy?

A: Setting up and managing k8s is not easy. If you want to go this way (which is probably the right answer), I would strongly recommend using a managed solution. DO (DigitalOcean) have a nice managed solution – which is still just VMs at the end of the day.

Q: What is spread?

A: Spread is a replication technique for swarm – so it will spread the app onto a container on every node in the swarm. Swarm is much easier to get your head around – but it’s a dying tech. K8s has def won that battle… but I reckon the first thing you should do is get comfortable with containers for dev.

Q: Everything easy and simple has to die

A: In fairness it’s nowhere near as good as k8s.

Q: ah looks like https://www.okd.io/ would be the thing to use in this case

A: yeah .. there’s a bunch of things out there. Okd looks lower level than I would like to go 😉

Q (Comment): Yip, I mean someone else manage the OKD thing…and I can just use it as if it were a managed service

In my head the docker process would involve the following steps:

  1. Container dev workflow
  2. Container ci workflow
  3. Container deployment
  4. Container scaling

The Container Development Process

A: ok .. you can probably actually setup a blank project and we can get a base django setup up and running if you like?

Steps

Make a directory and add a Dockerfile in it. What is a Dockerfile? A file containing all the commands (as you would run them on the command line) to build a docker image.

The docker build command builds an image from a Dockerfile and a context.

There is more info on building your Dockerfile in the Dockerfile reference.

The contents of the Dockerfile should be as follows (there are other Dockerfile examples available from caktusgroup, dev.to and testdriven.io):


# pull official base image
FROM python:3.7

LABEL maintainer="Name <stephen@example.com>"
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
RUN mkdir -p /code
WORKDIR /code

# Install dependencies
COPY requirements.txt /code/
RUN pip install -U pip
RUN pip install --no-cache-dir -r requirements.txt

# Copy Project
COPY . /code/

EXPOSE 8000

For this to work, we need the requirements.txt to be present. So for now just put in Django.
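So requirements.txt can start out as a single line (the version pin here is just an example – use whatever Django release you are targeting):

Django>=2.2,<3.0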

While looking at different Dockerfiles I found these best practices for using docker. It seems like the base image is important and people recommend small ones. There is another article commenting that alpine is not the way to go. For now it is not worth wasting time on this, and you don’t need to worry about the different image variants.

PYTHONDONTWRITEBYTECODE: Prevents Python from writing .pyc files to disk (equivalent to the python -B option)

PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Then create a docker-compose.yaml in the same directory. Docker Compose is a tool for defining and running multi-container docker applications – a way to link docker containers together. Compose works in all environments: production, staging, development, testing, as well as CI workflows.

So we have created a docker image for our Django application code, but now we need to combine that with a database and set some other settings. Take a look at the compose file reference.

Add the following:


version: '3.7'

services:
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"

This part:


volumes:
  - .:/code

Maps your current directory on your machine (host) to /code inside the container. You do that in dev – not in production.

Mapping your local volume to a volume in the container means that changes made locally will cause the server to reload on the docker container.

In docker compose remember to list your services in the order you expect them to start.

If you don’t have an existing docker project in your directory then run:

docker-compose run --rm web django-admin.py startproject {projectname} .

You want to get in the habit of initialising projects in the container.

Next get it up and running with:

docker-compose up

It should then run in the foreground, the same way a Django project normally runs. You can stop it with Ctrl+C.

A note on making changes to the Dockerfile or requirements.txt

Docker caches the layers of the build, so the next time you run docker build it will be faster, or even skip some steps entirely. However, if you add a new dependency to requirements.txt you must manually rebuild:

docker-compose build

It is important to commit your Dockerfile and docker-compose.yaml to git.

Adding Postgres

To add a postgres db, add the following to docker-compose.yaml:


  db:
    image: postgres:10.10

Now do a docker-compose up again. This will add the postgres db docker container without persistence.

The command will run the build for you, but in this case it is just pulling an image.

Running in the Background

You can keep the containers running in the background by using detached mode:

docker-compose up -d

then you can tail the logs of both containers with:

docker-compose logs -f

if you want to tail the logs of just one container:

docker-compose logs -f db

So once that’s up, your database will be available at the hostname db (the name of the service). It is very handy, because you can keep that consistent throughout your environments.
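If you want to convince yourself the alias works, you can resolve it from inside a container on the same compose network (a quick check, assuming the stack is up):

docker-compose run --rm web python -c "import socket; print(socket.gethostbyname('db'))"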

Your containers might look like this:


$ docker container list
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
3db1218ca0ae        postgres:10.10      "docker-entrypoint.s…"   8 minutes ago       Up 4 seconds        5432/tcp                 vm-api_db_1
14e26e524ded        vm-api_web          "python manage.py ru…"   19 hours ago        Up 4 seconds        0.0.0.0:8000->8000/tcp   vm-api_web_1

If you look at the ports, the database port is not published on the host (there is no 0.0.0.0 mapping) because docker-compose manages the networking between the containers.

An important concept to understand is: everything inside your compose file is isolated, and you have to explicitly publish ports to reach services from outside. You’ll get an error if you try to publish a service on a host port that is already in use.
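For example, if you did want to reach postgres from your host machine (say with psql or a GUI client), you would have to publish the port explicitly on the db service – the web container does not need this to connect:

  db:
    image: postgres:10.10
    ports:
      - "5432:5432"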

Connecting to the DB

To connect to the db, update your settings file with:


DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': os.environ.get('DATABASE_NAME', 'postgres'),
        'USER': os.environ.get('DATABASE_USER', 'postgres'),
        'HOST': os.environ.get('DATABASE_HOST', 'db'),
        'PORT': 5432,
    },
}

You will also need to add psycopg2 to requirements.txt.

You can create a separate settings file and then set the environment variables. I think it feels better to put environment variables in docker compose:

  web:
    environment:
      - DJANGO_SETTINGS_MODULE=my_project.settings.docker

If you wanted to do it in the image:

ENV DJANGO_SETTINGS_MODULE=my_project.settings.deploy

You can run docker-compose build again.

Remember: adding a new package to requirements.txt needs a new build.

So, in short: docker-compose up picks up changes to docker-compose.yaml, and docker-compose build is for changes to the Dockerfile (or requirements.txt). docker-compose restart only restarts the containers – it does not pick up changes to your compose file.

Test that Environment Vars are Working


$ web python manage.py shell
Python 3.7.4 (default, Sep 11 2019, 08:25:59) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import os
>>> os.environ.get('DJANGO_SETTINGS_MODULE')
'window.settings.docker_local'

 

Adding Persistence to the DB

Adding persistence to a db is done by adding a volume:


  db:
    image: postgres:10.10
    volumes:
      - data-volume:/var/lib/postgresql/data

Then add a top level volume declaration:

An entry under the top-level volumes key can be empty, in which case it uses the default driver configured by the Engine (in most cases, this is the local driver)


volumes:
  data-volume:

That will do this:

Creating volume "vm-api_data-volume" with default driver

Nuclear Option

Sometimes you just need to blow things up and start again:


docker-compose stop       # stop the running containers

docker-compose rm -f -v   # remove them without asking, along with their anonymous volumes

docker-compose up -d      # recreate everything in the background

Ordering

It is advised to list your services in the order you expect them to start in docker-compose.

You can also add depends_on to the web service, but it isn’t completely reliable – it only controls start order and does not wait for the database to actually be ready to accept connections.
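In docker-compose.yaml that looks like this:

  web:
    build: .
    depends_on:
      - db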

Running Commands in your dev container

There are commands like migrating and creating a superuser that need to be run in your container.

You can do that with:

docker-compose run --rm web ./manage.py migrate

A great alias to add to your (control node) host is:

alias web='docker-compose run --rm web'

Refresh your shell: exec $SHELL

Then you can do: web ./manage.py createsuperuser

The --rm flag removes the container once the command exits, so you don’t accumulate stopped containers.

Debugging in your Development Docker Container

The trick is being able to attach a Python debugger to the process running inside the container.
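One way to do it in dev (a sketch, assuming a terminal debugger like pdb): set a breakpoint in your code, then run the web service in the foreground with its ports published, so you land in the debugger when the breakpoint is hit:

# stop the detached web container, then run it interactively with its ports published
docker-compose stop web
docker-compose run --rm --service-ports web

# in your code, at the point you want to inspect:
#   import pdb; pdb.set_trace()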

Well, this has gone on long enough… catch the containerisation of a CI workflow and the deployment of containers in Part 2 of Help Me Understand Containerisation.

 

Getting Jenkins to deploy with ansible using SSH Agent Forwarding

Your CI/CD tool needs access to your code and your servers for linting, testing and deploying.
Setting up that access on the various devices in a secure manner can be very time consuming, so it is important to make use of the available technology to make our lives easier.

Jenkins needs access

You will have created credentials for Jenkins by creating an SSH key pair for your jenkins user. Ensure that that public key has access to the code on your version control platform (as a deploy key).

Now Jenkins will be able to get your code and run various tests on it. The problem now is deployment.


Use Jenkins’s SSH credentials to Deploy

We are using Ansible to deploy (from the Jenkins box). So now Jenkins needs access to wherever you are deploying the code to. You would do an ssh-copy-id to get its public key there.

But there is another problem: when Ansible runs the git module to clone the repo, you will get an error that the user does not have access.

Correct – the user on the box you are deploying to does not have access to the code. You could add that box’s key as another deploy key, but when scaling out to many boxes you will have a hell of a lot of SSH credentials to manage.

The best thing to do is use the Jenkins user’s credentials – the ones used to log into your target box – to also fetch the code. This is done with SSH agent forwarding.

The first thing we will need is the Jenkins SSH Agent plugin.

Then enable the SSH agent for your job:

[Screenshot: enabling the SSH Agent in the Jenkins job configuration]

 

Then install the Jenkins Ansible plugin and configure it.

Finally, you need to tell Ansible to use SSH agent forwarding, otherwise it just won’t do it:


mkdir /etc/ansible
vi /etc/ansible/ansible.cfg

Add the following config there:


[defaults]
host_key_checking = False

[ssh_connection]
ssh_args = -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s

Of course, in a hardened setup it is better to leave host key checking enabled.
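With forwarding configured, the git task in your playbook needs nothing special – the clone on the target box simply reuses the key held by the Jenkins user’s agent. A sketch (the repo URL and paths are placeholders):

- name: deploy the application code
  git:
    repo: git@gitlab.example.com:myteam/myproject.git
    dest: /srv/myproject
    version: master
    accept_hostkey: yes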

Now everything should work.

Source: SSH Agent Forwarding with Ansible