Category: DevOps

Is there a speed gain when moving from Apache Mod PHP to Nginx PHP-FPM?

I had a chance to deploy one of my running websites on another virtual machine.
I wanted to improve performance as customers are paying for the product and wanted to give a faster experience.

On the old site I used Apache with mod_php to run the site. On the new site I went with Nginx and PHP-FPM.

The Server Setups

Both websites use the Yii framework on PHP with a MySQL database. There have been some performance tweaks on the old site; on the new site I left everything standard.

Old Site:

  • 2GB RAM (free 222MB)
  • CPU(s): 2
  • Site shared - vhosts with a few other sites
  • HTTPS enabled (Let's Encrypt)
  • PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
  • Server hosted in the Netherlands (testing from South Africa)

New Site:

  • 2GB RAM (Free 1161MB)
  • CPU(s): 2
  • Site dedicated, no other sites on the server
  • No HTTPS
  • PRETTY_NAME="Ubuntu 18.04.4 LTS"
  • Server hosted in South Africa (Testing from South Africa)

Method

The method for the performance test is as follows.

  1. Enable response-time logging in the access logs of both Apache and Nginx - I wrote a post on this for Apache and there are docs online for Nginx (a config sketch follows this list)
  2. Browsing Test - I will browse both sites in isolation, as a non-logged-in and a logged-in user. Response-time statistics will be recorded from the user's perspective in the browser and from the access logs.
  3. WebPage Test - I will use Web Page Test to compare both sites for a few pages.
  4. Load Test - I will test concurrent load with Locust (locustio)
  5. Sitespeed.io Test - Test using sitespeed.io, an open-source site-speed testing utility
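For step 1, the configuration looks roughly like this (a sketch - the log format names timed and timed_combined are just examples; nginx's $request_time is in seconds, Apache's %D is in microseconds):

# nginx (http block): add the request processing time to the access log
log_format timed '$remote_addr [$time_local] "$request" $status $request_time';
access_log /var/log/nginx/access.log timed;

# apache: %D logs the time taken to serve the request
LogFormat "%h %l %u %t \"%r\" %>s %b %D" timed_combined
CustomLog ${APACHE_LOG_DIR}/access.log timed_combined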

This will not be a scientific comparison - it is purely anecdotal.

Browsing Test

Page                        Nginx + PHP-FPM (ms)    Apache + ModPHP (ms)    Difference
Home Page                   1380                    1660                    20%
Contact Us                  1060                    1310                    24%
About Us                    997                     1280                    28%
Login (POST)                1410                    7550                    435%
Portfolio (DB intensive)    1920                    6960                    263%
Calculator                  946                     1310                    38%
Chart (TTFB)                105                     348                     231%

From the table above it is safe to say, without a shadow of a doubt, that the new site is faster.

Naturally, the server being much closer helps. Instead of 9354 km away, the new server is about 50 km away. The average ping latency is 187 ms to the old server and about 12 ms to the new one.

WebPage Test

I tested both sites from South Africa; the screenshots and relevant info are below:

[Screenshot: Speed test of the new Nginx + PHP-FPM website]

[Screenshot: Speed test of the old Apache + ModPHP website]
WebPageTest Metric            Nginx + PHP-FPM    Apache + ModPHP
First Byte                    102                895
Speed Index                   769                1660
Document Complete Time        3850               3353
Document Complete Requests    36                 33
Fully Loaded Time             4746               4087
Fully Loaded Requests         48                 46

Surprisingly, the new website performed worse overall: it was faster to first byte, but the fully loaded time was worse. Furthermore, there was no caching configured, and WebPageTest does not like that.

WebPageTest Metric            Nginx + PHP-FPM    Apache + ModPHP
First Byte                    126                913
Speed Index                   800                1681
Document Complete Time        6825               2989
Document Complete Requests    18                 16
Fully Loaded Time             6869               3215
Fully Loaded Requests         19                 17

The results of this were also pretty annoying. It seems WebPageTest wants me to cache static content, gzip assets and use a CDN - then it will be happy.

Let me add gzip and static-asset caching to Nginx and see. The gzip part is mostly a matter of uncommenting the gzip section in the default nginx.conf.
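For reference, the relevant bits look roughly like this (a sketch - the exact MIME types and cache lifetime are up to you):

# in the http block: compress text-based responses
gzip on;
gzip_vary on;
gzip_min_length 256;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml image/svg+xml;

# in the server block: let browsers cache static assets
location ~* \.(css|js|png|jpg|jpeg|gif|svg|ico|woff2?)$ {
    expires 30d;
    add_header Cache-Control "public";
}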

After updating, it is looking a bit better:

[Screenshot: the new site after enabling gzip compression and browser caching]

I then removed the twitter feed and things were better:

Old Site:

[Screenshot: WebPageTest results for the old site]

New Site:

[Screenshot: WebPageTest results for the new site - all A grades]

Load Test

I created a test that makes some GET requests against the server as a non-logged-in user. The test spawns users at a rate of 1 per second.
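For reference, a minimal Locust test file for this kind of run could look like the following (a sketch using the current Locust API - the older locustio package used HttpLocust/TaskSet - and the page paths are just examples):

from locust import HttpUser, task, between

class AnonymousVisitor(HttpUser):
    # wait 1-3 seconds between requests to behave like a browsing user
    wait_time = between(1, 3)

    @task(3)
    def home(self):
        self.client.get("/")

    @task
    def contact(self):
        self.client.get("/contact-us")

You then run it with something like locust -f locustfile.py --host=<site url> and set the number of users and spawn rate in the web UI.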

The new site performed as follows

[Chart: Number of users - Nginx + PHP-FPM]

[Chart: Response times (ms) - Nginx + PHP-FPM]

[Chart: Total requests per second - Nginx + PHP-FPM]

So it can run stably at 80 to 100 requests per second (RPS).

The old site performed terribly. When I got up to 2 RPS, the monitoring for the other sites on the shared server reported them as down. It was odd that RPS didn't grow with the number of users as quickly on the old site - perhaps Locust realised it couldn't handle that spawn rate.

[Chart: Total requests per second - Apache ModPHP]

[Chart: Response times - Apache ModPHP]

[Chart: User growth - Apache ModPHP (old site)]

Sitespeed.io

To do a more comprehensive test I employed sitespeed.io. I ran the test against both sites and here are the results...

The Old Mod-PHP and Apache site

[Screenshot: sitespeed.io results for the old Apache ModPHP site]

The New PHP-FPM and Nginx site

[Screenshot: sitespeed.io results for the new Nginx PHP-FPM site]

Conclusion

Some tests were conclusive - others were still in the balance.
From a load-testing and initial-response point of view, the new site clearly wins. The biggest gain comes from handling concurrent users and load; another significant factor was moving the server closer to the users.

The PHP-FPM with Nginx site can handle 40 or more times the load of the other site and responds faster even allowing for the roughly 200 ms network-latency handicap.

Next Steps

The next step would be to look at how to maximise performance with Nginx and PHP-FPM.

Containerising your Django Application into Docker and eventually Kubernetes

The shift to containers is happening, in some places faster than others...

People underestimate the complexity and all the parts involved in making your application work.

The Django Example

In the case of Django, we would traditionally have deployed it on a single server running:

  • a web server (Nginx)
  • a Python WSGI server - web server gateway interface (Gunicorn or uWSGI)
  • a database (SQLite, MySQL or Postgres)
  • sendmail
  • maybe some other things: Redis for cache and user sessions

So the server would very quickly become a snowflake, as it does multiple things and must be configured to communicate with multiple other pieces.

It violates the single responsibility principle.

But we understood it that way. Now there is a bit of a mind shift when Docker is brought in.

The key principle is:

Be stateless, kill your servers almost every day

Taken from Node Best Practices

So what does that mean for our Django application?

Well, we have to think differently. Now for each process we are running we need to decide if it is stateless or stateful.

If it is stateful (not ephemeral) then it should be set aside and run in a traditional manner (or run by a cloud provider). In our case the stateful part is luckily only the database. When I say stateful I mean the state needs to persist...forever. User sessions, cache and emails do need to work and persist for shorter periods, but it won't be a total disaster if they fail - users will just need to re-authenticate.

So the other parts, which can all run in containers, are:

  • Nginx
  • Gunicorn
  • Sendmail

For simplicity's sake I'm going to gloss over Redis for cache and user sessions. I'm also not keen to include sendmail, because it introduces more complexity and another component - namely message queues.

Let's start Containerising our Django Application

Alright so I'm assuming that you know python and django pretty well and have at least deployed a django app into production (the traditional way).

So we have all the code, we just need to get it running in a container - locally.

A good resource to use is ruddra's docker-django repo. You can use some of his Dockerfile examples.

First, install Docker Engine.

Let's get it running in docker using just a docker file. Create a file called Dockerfile in the root of the project.


# pull official base image - set the exact version of python
FROM python:3.8.0

LABEL maintainer="Your Name <your@email.com>"

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# Install dependencies
RUN pip install --no-cache-dir -U pip

# Set the user to run the project as, do not run as root
RUN useradd --create-home code
WORKDIR /home/code
USER code

COPY path/to/requirements.txt /tmp/
RUN pip install --user --no-cache-dir -r /tmp/requirements.txt

# Copy Project
COPY . /home/code/

# Documentation from person who built the image to person running the container
EXPOSE 8000

CMD python manage.py runserver 0.0.0.0:8000

I found a cool thing that can audit your Dockerfile - https://www.fromlatest.io

A reference on the Dockerfile commands

Remember to update the settings of the project so that:

ALLOWED_HOSTS = ['127.0.0.1', '0.0.0.0']

Now let us build the image and run it:


docker build . -t company/project
docker run -p 8000:8000 -it --name project company/project

Now everything should just work! Go to: http://0.0.0.0:8000


Help me Understand Containerisation (Docker) – Part 1

I was looking at nodeJS servers, how they work and ways to scale them. I came across a best practices repo and it said: "Be stateless, kill your Servers almost every day".

I've always been skeptical of docker and containers - granted I haven't used them extensively but I've always thought of them as needless abstractions and added complexity that make the developer's and system administrator's lives harder.

The tides have pushed me towards it though...so I asked a friend to Help me understand containerisation.

Help Me Understand Containerisation

Going back to the article I was reading, it said: "Many successful products treat servers like a phoenix bird - it dies and is reborn periodically without any damage. This allows adding and removing servers dynamically without any side-effects."

Some things to avoid with this phoenix strategy are:

  • Saving uploaded files locally on the server
  • Storing authenticated sessions in local file or memory
  • Storing information in a global object

Q: Do you have to store any type of data (e.g. user sessions, cache, uploaded files) within external data stores?

Q: So the container part is only application code running…no persistance?

A: in general i find it best to keep persistence out of the container. There's things you can do .. but generally it's not a great idea

Q: Does it / is it supposed to make your life easier?

A: yeah. I don't do anything without containers these days

Q: Containers and container management are 2 seperate topics?

A: yeah.

I think the easiest way to think of a container is like an .exe (that packages the operating system and all requirements). So once a container is built, it can run anywhere that containers run.

Q: Except the db and session persistance is external

A: doesn't have to be .. but it is another moving part

A: the quickest easiest benefit of containers is to use them for dev. (From a python perspective..) e.g.: I don't use venv anymore, cause everything is in a container

so .. on dev, I have externalized my db, but you don't really need to do that

Q: Alright but another argument is the scaling one…so when black friday comes you can simply deploy more containers and have them load balanced. but what is doing the load balancing?

A: yeah .. that's a little more complicated though .. and you'd be looking at k8s for that (for load balancing). Usually that's k8s (swarm in my case) though .. that's going to depend a lot on your setup. E.g.: I just have kong replicated with spread (so it goes on every machine in the swarm)

Q: If you are not using a cloud provider and want this on a cluster of vm’s - is that hard or easy?

A: Setting up and managing k8s is not easy. If you want to go this way (which is probably the right answer), I would strongly recommend using a managed solution. DO (DigitalOcean) have a nice managed solution - which is still just VMs at the end of the day.

Q: What is spread?

A: Spread is a replication technique for swarm - so it will spread the app onto a container on every node in the swarm. Swarm is much easier to get your head around - but it's a dying tech. K8s has def won that battle... but I reckon the first thing you should do is get comfortable with containers for dev.

Q: Everything easy and simple has to die

A: In fairness it's nowhere near as good as k8s.

Q: ah looks like https://www.okd.io/ would be the thing to use in this case

A: yeah .. there's a bunch of things out there. Okd looks lower level than I would like to go 😉

Q (Comment): Yip, I mean someone else manages the OKD thing…and I can just use it as if it were a managed service

In my head the docker process would involve the following steps:

  1. Container dev workflow
  2. Container CI workflow
  3. Container deployment
  4. Container scaling

A note on the Container Service Model

What containers really offer is somewhere in between Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). You are abstracted away from the VM but still manage the programming languages and libraries on the container.

It has given rise to Container as a Service.

[Image: the Container as a Service model. Credit: O'Reilly, OpenShift for Developers]

The Container Development Process

A: ok .. you can probably actually setup a blank project and we can get a base django setup up and running if you like?

Steps

Make a directory and add a Dockerfile in it. What is a Dockerfile? A file containing all the commands (as you would run them on the command line) to build a Docker image.

The docker build command builds an image from a Dockerfile and a context.

More info on building your Dockerfile can be found in the Dockerfile reference.

The contents of the Dockerfile should be as follows (there are other Dockerfile examples available from caktusgroup, dev.to and testdriven.io):


# pull official base image
FROM python:3.7

LABEL maintainer="Name <stephen@example.com>"
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
RUN mkdir -p /code
WORKDIR /code

# Install dependencies
COPY requirements.txt /code/
RUN pip install -U pip
RUN pip install --no-cache-dir -r requirements.txt

# Copy Project
COPY . /code/

EXPOSE 8000

For this to work, we need requirements.txt to be present. For now, just put django in it.
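For example, a bare requirements.txt could be just the following (the version pin is only an example):

Django>=3.0,<4.0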

While looking at different Dockerfiles I found these best practices for using Docker. It seems the choice of base image is important and people recommend small ones, though there is another article arguing that Alpine is not the way to go. For now it is not worth wasting time worrying about the different image variants.

PYTHONDONTWRITEBYTECODE: Prevents Python from writing pyc files to disc (equivalent to python -B option)

PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Then create a docker-compose.yaml in the same directory. Docker Compose is a tool for defining and running multi-container Docker applications - a way to link Docker containers together. Compose works in all environments: production, staging, development, testing, as well as CI workflows.

So we have created a docker image for our django application code, but now we need to combine that with a database or just set some other settings. Take a look at the compose file reference.

Add the following:


version: '3.7'

services:
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"

This part:


volumes:
  - .:/code

Maps your current directory on your machine (host) to /code inside the container. You do that in dev - not in production.

Mapping your local volume to a volume in the container means that changes made locally will cause the server to reload on the docker container.

In docker compose remember to list your services in the order you expect them to start.

If you don't have an existing docker project in your directory then run:

docker-compose run --rm web django-admin.py startproject {projectname} .

You want to get in the habit of initialising projects in the container

Next get it up and running with:

docker-compose up

It should then run in the foreground, the same way a Django development server runs. You can stop it with Ctrl+C.

A note on making changes to the Dockerfile or requirements.txt

[Image: the Docker build process]

Docker caches build layers, so the next time you run docker build it will be faster or even skip some steps. However, if you add a new dependency to requirements.txt you must manually run:

docker-compose build

It is important to commit your Dockerfile and docker-compose.yaml to git.

[Screenshots: changing the Python version of the Docker base image]

Adding Postgres

To add a postgres db, add the following to docker-compose.yaml:


  db:
    image: postgres:10.10

Now run docker-compose up again. This will add the Postgres database container (without persistence for now).

The command will run the build for you, but in this case it is just pulling an image.
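Depending on the version of the official postgres image, you may also need to pass credentials to the container. A sketch using the environment variables from the official image documentation (the values are placeholders):

  db:
    image: postgres:10.10
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=insecure-dev-password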

Running in the Background

You can keep the containers running in the background using detached mode:

docker-compose up -d

then you can tail the logs of both containers with:

docker-compose logs -f

if you want to tail the logs of just one container:

docker-compose logs -f db

So once that's up, your database will be available under the alias db (the name of the service), similar to a hostname. That is very handy, because you can keep it consistent across your environments.

Your containers might look like this:


$ docker container list
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
3db1218ca0ae        postgres:10.10      "docker-entrypoint.s…"   8 minutes ago       Up 4 seconds        5432/tcp                 vm-api_db_1
14e26e524ded        vm-api_web          "python manage.py ru…"   19 hours ago        Up 4 seconds        0.0.0.0:8000->8000/tcp   vm-api_web_1

If you look at the ports, the database's port is not published to the host, because docker-compose manages the networking between the containers.

An important concept to understand: everything inside your compose file is isolated, and you have to explicitly publish ports to reach a service from the host. You'll get an error if you try to run a service on a host port that is already in use.
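To make that concrete, here is a sketch of the difference: ports: publishes a container port on the host, while services without a ports: entry are still reachable from other services on the same Compose network by their service name:

services:
  web:
    build: .
    ports:
      - "8000:8000"   # published to the host
  db:
    image: postgres:10.10
    # no ports entry: reachable from other services as db:5432,
    # but not from the host machine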

Connecting to the DB

To connect to the db, update your settings file with:


DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': os.environ.get('DATABASE_NAME', 'postgres'),
        'USER': os.environ.get('DATABASE_USER', 'postgres'),
        'HOST': os.environ.get('DATABASE_HOST', 'db'),
        'PORT': 5432,
    },
}

You will also need to add psycopg2 to requirements.txt.

You can create a separate settings file and then set the environment variables. I think it feels better to put environment variables in docker-compose:

  web:
    environment:
      - DJANGO_SETTINGS_MODULE=my_project.settings.docker

If you wanted to do it in the image:

ENV DJANGO_SETTINGS_MODULE=my_project.settings.deploy

You can run docker-compose build again.

Remember: adding a new package to requirements.txt needs a new build.

So, in short: run docker-compose up again for changes to docker-compose.yaml, and docker-compose build for changes to the Dockerfile (or requirements.txt). Note that docker-compose restart only restarts the containers - it does not pick up changes to your compose file.

Test that Environment Vars are Working


$ web python manage.py shell
Python 3.7.4 (default, Sep 11 2019, 08:25:59) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import os
>>> os.environ.get('DJANGO_SETTINGS_MODULE')
'window.settings.docker_local'


Adding Persistence to the DB

Adding persistence to the db is done by mounting a named volume over the Postgres data directory:


  db:
    image: postgres:10.10
    volumes:
      - data-volume:/var/lib/postgresql/data

Then add a top level volume declaration:

An entry under the top-level volumes key can be empty, in which case it uses the default driver configured by the Engine (in most cases, this is the local driver)


volumes:
  data-volume:

That will do this:

Creating volume "vm-api_data-volume" with default driver

Nuclear Option

Sometimes you just need to blow things up and start again:


docker-compose stop

docker-compose rm -f -v

docker-compose up -d

Ordering

It is advised to list your services in the order you expect them to start in docker-compose.

You can also add depends_on, but it isn't completely reliable - it controls start order, not whether the database is actually ready.
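For reference, it looks like this (it only affects the order in which containers start, it does not wait for the database to accept connections):

  web:
    build: .
    depends_on:
      - db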

Running Commands in your dev container

There are commands like migrating and creating a superuser that need to be run in your container.

You can do that with:

docker-compose run --rm web ./manage.py migrate

A great alias to add to your (control node) host is:

alias web='docker-compose run --rm web'

Refresh your shell: exec $SHELL

Then you can do: web ./manage.py createsuperuser

[Screenshot: removing the container when it is not being used]

Debugging in your Development Docker Container

[Screenshot: attaching a Python debugger inside the Docker container]
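One common approach (a sketch assuming the compose setup above, not necessarily the exact setup in the screenshot): make the web service interactive, set a breakpoint in the code, and attach to the running container.

# docker-compose.yaml (dev only): make the web service interactive
#   web:
#     stdin_open: true
#     tty: true

# drop a breakpoint anywhere in your Django code:
import pdb; pdb.set_trace()

# then attach to the running container to get the pdb prompt
# (Ctrl+P then Ctrl+Q detaches without stopping the container):
#   docker attach vm-api_web_1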

Well, this has gone on long enough... Catch the containerisation of a CI workflow and the deployment of containers in Part 2 of Help Me Understand Containerisation.