Month: August 2018

How to get data from tables in mysql database into a postgres schema?

I've had to do a bit of learning and tinkering to figure out how to do this. I now understand why ETL tools and people that do data migrations and translations are needed. I even will allow use of proprietary GUI based ETL tools.

Anyway the backstory is I created a django project initially for MySQL but needed to change to Postgres later on. I had been using a mysql instance and wanted to migrate the data to the new postgres db.

Converting to Postgres from MySQL

The best tool to use, in my opinion, for a striaght conversion of a source db to postgres is pg_loader.

On the source server I had to install postgres for ease of not having to setup a remote connection and then it is a case of:

sudo apt install pg_loader

Then create the target db in postgres and migrate

$ createdb f1db
$ pgloader mysql://root@localhost/f1db pgsql:///f1db

So we now have a postgres DB which would be suitable for most people.

But I already had a new db instance running and needed to import the data.

Dumping the Postgres DB

An important thing to remember here is that the db already running has a publicschema. This db also has a public schema. Another thing is moving data between databases is  difficult because with postgres you can only connect to one database at a time.

So we should rename the schema in the db we have just converted to something else.

alter schema public rename to original_public;
create schema public;

Then we can import / restore it into the same database we have already running and can copy data between the schemas easily.

Dumping the DB

Use pg_dump to  export just the new schema:

pg_dump -F c -b -v -f --schema=original_public mydb.backup mydb

Restoring the DB

We need to transport the dump to the new server with scp (not in this tutorial).

Once it is accessible, restore the table into the existing db:

pg_restore --dbname=instance_db mydb.backup

Now we have both schemas in the same database.

Moving data between tables

If the tables have the same structure we can easily copy data from one schema to another with a query.

insert into public.the_table select * from original_public.the_table;

and that is that.

Serial Sequencing

There was however an issue when it came to the auto increment fields in my case just primary keys. They were not set at the correct value and the pp broke because of this.

So I manually went and check which number it should be on and updated the relevant fields because I did not have the knowledge.

The knowledge I am referring to is a management command that does this for you called sqlsequencereset

This would have saved me 10 minutes.

Deploying a django app with dedicated web and db servers

One of the many architectural decisions that will start to impact you when you get to a level where you need to scale is splitting you db and app. Typically we start on a budget and have to share resources but ideally you want to start out separate. The reasons is that the db server will know exactly how much RAM is available to it at all times and will hence improve the consistency and reliability.

Provision 2 Servers

To start off provision 2 (ubuntu) servers, to label things give each a fully qualified domain name like and

Then do a basic security and authentication setup on both servers.

The App Server

To setup the app server you can use this guide which uses python 3.6, Nginx, gunicorn and mysql. Just skip the database setup part.

The Database Server

Install postgres.

We need a role (user) for the database and because this role will be adding extensions it needs to be a superuser.


Importantly we need to look at django's optimal postgres config

ALTER ROLE dbuser SET client_encoding TO 'utf8';
ALTER ROLE dbuser SET default_transaction_isolation TO 'read committed';
ALTER ROLE dbuser SET timezone TO 'UTC';

Then create the database:

CREATE DATABASE myproject; now fill out the DATABASES setting in your application and make sure the HOST is the internal ip as the servers are within the same data-center hopefully.

But we will need to configure postgres to allow and listen for connections from the internal network. We don't want public ip's to have access to it only our other app server within the same datacentre. I've done this with MySQL but forgot how to it, so I'm searching how to do it now.

First thing is setup the uncomplicated firewall with:

sudo ufw enable
sudo ufw allow OpenSSH
sudo ufw status

Now we want to enable connections from our app server:

sudo ufw allow from app_server_internal_ip_address to any port 5432

Log into psql and set it to listen on all ip's:

ALTER SYSTEM SET listen_addresses = '*';

then reload the server:

SELECT pg_reload_conf();

Check where your pg_hba.conf is with:

SELECT name, setting FROM pg_settings WHERE category = 'File Locations';

then add the following line:

# IPv4 local connections:
host    all             all               md5


sudo systemctl restart postgresql

Test with the postgres client on the app server:

sudo apt install postgresql-client

There are a few performance tweaks you can do, but I'm always inclined to leave it standard before doing that.

Allow remote connections to PostgreSQL

Functional Tests with Django Tenant Schemas and Selenium

On my journey of adding multi-tenancy to my app, I have used django-tenant-schemas and the testing utilities it provides to make my unit tests work with the tenants or main site.

However the functional tests is another kettle of fish as there is not utility to use for them.

Functional Tests

From the django docs on the topic:

Use in-browser frameworks like Selenium to test rendered HTML and the behavior of Web pages, namely JavaScript functionality. Django also provides special support for those frameworks; see the section on LiveServerTestCase for more details.

LiveServerTestCase is the same as a TransactionTestCase with one added feature in that it launches a live django server in the background for browser based tests and simulations with selenium.

Good to know that the TransactionTestCase differs from a normal TestCase in the way the db is reset after tests. TransactionTestCase will truncate the tables, whereas TestCase will just rollback the transaction.

The live server listens on localhost and binds to port 0 which uses a free port assigned by the operating system. The server’s URL can be accessed with self.live_server_url during the tests.

If you are using the staticfiles app then you need to use StaticLiveServerTestCase to ensure the static files are loaded.

Now what are the issues that I can foresee:

  1. There is a public facing site (with PUBLIC_URLS) and tenant sites that run on different domains. I would need to test both sites with selenium.
  2. The test tenant is not created anymore and I will have to subclass the StaticLiveServerTestsCase to create the test tenant (and public tenant that uses the public schema)

Other than that all is good.

The Public Schema

Functional tests should simulate reality as much as possible. So I want there to be a public schema and at least 2 tenants (Update: nope I don't need to do this, as this should be tested by django_tenant_schemas itself). So in creating the base functional test case, these 3 things need to be created.

The first thing to do is get the setUp of the functional test right. Now regular setup of the app needs to be linked with a tenant db and not public as the public schema will only have the SHARED_APPS and not the TENANT_APPS so we  need to separate it out.

So I started on the public test case, I took everything to its base bones and made a simple request:

self.browser.get(f'{ self.live_server_url }')

Which gave a bad request error. Usually when you get a 400 it is because your public schema is not there. So you will need to create your TenantModel Record and ensure the domain is localhost.


class PublicFunctionalTest(BaseFunctionalTest):
    '''Functional test just for the public facing site'''
    def setUp(self):
        '''Create the public schema'''

This worked well and I was sent to the public website, made easier due to the default of localhost by the TestCase. Now you can add all your public facing functional tests.

Creating the Tenant Schema

Creating the public schema in the public facing functional test was straightforward because the server in the background goes to localhost and that is the domain_url of the schema we setup.

Now to makes things as simple as possible, I just realised you can set the tenant schema to use localhost and everything should work similarly. There are however a few other things that seem to need to be done based on the provided TenantTestCase which I've pasted here:


from django.conf import settings
from import call_command
from django.db import connection
from django.test import TestCase
from tenant_schemas.utils import get_public_schema_name, get_tenant_model


class TenantTestCase(TestCase):
    def add_allowed_test_domain(cls):
        # ALLOWED_HOSTS is a special setting of Django setup_test_environment so we can't modify it with helpers
        if ALLOWED_TEST_DOMAIN not in settings.ALLOWED_HOSTS:
            settings.ALLOWED_HOSTS += [ALLOWED_TEST_DOMAIN]

    def remove_allowed_test_domain(cls):

    def setUpClass(cls):
        tenant_domain = ''
        cls.tenant = get_tenant_model()(domain_url=tenant_domain, schema_name='test')  # todo: is there any way to get the verbosity from the test command here?


    def tearDownClass(cls):

        cursor = connection.cursor()
        cursor.execute('DROP SCHEMA IF EXISTS test CASCADE')

    def sync_shared(cls):

class FastTenantTestCase(TenantTestCase):
    def setUpClass(cls):
        tenant_domain = ''

        TenantModel = get_tenant_model()
            cls.tenant = TenantModel.objects.get(domain_url=tenant_domain, schema_name='test')
            cls.tenant = TenantModel(domain_url=tenant_domain, schema_name='test')


    def tearDownClass(cls):

There are quite a few things being done here, one thing that I don't think will work is setting the domain_url of the tenant to which would require a modification to the host file.

We would need things like set_tenant  . So we need to decide if it is better to inherit from TenantTestCase and add the functionality of the StaticLiveServer or the other way around. Or perhaps use Multiple Inheritance.

Wait I think I've found an example where a Mixin would be the appropriate choice. So I pretty much copied the TenantTestCase and extended from StaticLiveServerTestCase instead of TestCase.

class StaticLiveServerTenantTestCase(StaticLiveServerTestCase):

However I have a wrapper of the StaticLiveServerTestCase that adds functionality from harry percival's obey the testing goat book that takes screenshots, waits for elements to appear etc.

Now I don't want to duplicate these methods just applied to different classes because they extend from different classes.