Category: Database

How to get data from tables in mysql database into a postgres schema?

I’ve had to do a bit of learning and tinkering to figure out how to do this. I now understand why ETL tools and people that do data migrations and translations are needed. I even will allow use of proprietary GUI based ETL tools.

Anyway the backstory is I created a django project initially for MySQL but needed to change to Postgres later on. I had been using a mysql instance and wanted to migrate the data to the new postgres db.

Converting to Postgres from MySQL

The best tool to use, in my opinion, for a striaght conversion of a source db to postgres is pg_loader.

On the source server I had to install postgres for ease of not having to setup a remote connection and then it is a case of:

sudo apt install pg_loader

Then create the target db in postgres and migrate


$ createdb f1db
$ pgloader mysql://root@localhost/f1db pgsql:///f1db

So we now have a postgres DB which would be suitable for most people.

But I already had a new db instance running and needed to import the data.

Dumping the Postgres DB

An important thing to remember here is that the db already running has a publicschema. This db also has a public schema. Another thing is moving data between databases is  difficult because with postgres you can only connect to one database at a time.

So we should rename the schema in the db we have just converted to something else.


alter schema public rename to original_public;
create schema public;

Then we can import / restore it into the same database we have already running and can copy data between the schemas easily.

Dumping the DB

Use pg_dump to  export just the new schema:


pg_dump -F c -b -v -f --schema=original_public mydb.backup mydb

Restoring the DB

We need to transport the dump to the new server with scp (not in this tutorial).

Once it is accessible, restore the table into the existing db:

pg_restore --dbname=instance_db mydb.backup

Now we have both schemas in the same database.

Moving data between tables

If the tables have the same structure we can easily copy data from one schema to another with a query.


psql
insert into public.the_table select * from original_public.the_table;

and that is that.

Serial Sequencing

There was however an issue when it came to the auto increment fields in my case just primary keys. They were not set at the correct value and the pp broke because of this.

So I manually went and check which number it should be on and updated the relevant fields because I did not have the knowledge.

The knowledge I am referring to is a management command that does this for you called sqlsequencereset

This would have saved me 10 minutes.

Deploying a django app with dedicated web and db servers

One of the many architectural decisions that will start to impact you when you get to a level where you need to scale is splitting you db and app. Typically we start on a budget and have to share resources but ideally you want to start out separate. The reasons is that the db server will know exactly how much RAM is available to it at all times and will hence improve the consistency and reliability.

Provision 2 Servers

To start off provision 2 (ubuntu) servers, to label things give each a fully qualified domain name like web.myserver.com and db.myserver.com

Then do a basic security and authentication setup on both servers.

The App Server

To setup the app server you can use this guide which uses python 3.6, Nginx, gunicorn and mysql. Just skip the database setup part.

The Database Server

Install postgres.

We need a role (user) for the database and because this role will be adding extensions it needs to be a superuser.

CREATE ROLE dbuser LOGIN PASSWORD 'mydbpass' SUPERUSER;

Importantly we need to look at django’s optimal postgres config

ALTER ROLE dbuser SET client_encoding TO 'utf8';
ALTER ROLE dbuser SET default_transaction_isolation TO 'read committed';
ALTER ROLE dbuser SET timezone TO 'UTC';

Then create the database:

CREATE DATABASE myproject;

Ok…so now fill out the DATABASES setting in your application and make sure the HOST is the internal ip as the servers are within the same data-center hopefully.

But we will need to configure postgres to allow and listen for connections from the internal network. We don’t want public ip’s to have access to it only our other app server within the same datacentre. I’ve done this with MySQL but forgot how to it, so I’m searching how to do it now.

First thing is setup the uncomplicated firewall with:


sudo ufw enable
sudo ufw allow OpenSSH
sudo ufw status

Now we want to enable connections from our app server:

sudo ufw allow from app_server_internal_ip_address to any port 5432

Log into psql and set it to listen on all ip’s:

ALTER SYSTEM SET listen_addresses = '*';

then reload the server:

SELECT pg_reload_conf();

Check where your pg_hba.conf is with:

SELECT name, setting FROM pg_settings WHERE category = 'File Locations';

then add the following line:


# IPv4 local connections:
host    all             all             10.0.0.4/32            md5

Restart


sudo systemctl restart postgresql

Test with the postgres client on the app server:

sudo apt install postgresql-client

There are a few performance tweaks you can do, but I’m always inclined to leave it standard before doing that.

https://www.digitalocean.com/community/tutorials/how-to-secure-postgresql-against-automated-attacks

Allow remote connections to PostgreSQL

https://stackoverflow.com/questions/22080307/access-postgresql-server-from-lan

Going Live with Django, MySQL on Ubuntu and server going down

Ever got to deploy day only to have errors streaming in about the site not being able to be accesses. Most likely in your logs or error emails / alerts you get something like this:

(2003, "Can't connect to MySQL server on '127.0.0.1' (111)")

What the hell does this mean, it means MySQL is down and if it was running it means it has crashed. Whoah but if you check the site now it is working…that means nothing the website is unstable as certain actions make it go down. We need to achieve system stability.

The most common reason for MySQL going down is that it is using more memory than physical memory can allocate it then uses virtual memory and crashes soon after that.

What can I do?

First double check the error logs and see if there is a configuration error causing this issue. If there is no clear warning you want to check memory usage.

Check if you are using virtual memory with vmstat


procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 499228  72040 358540    0    0    16    27  230  280  2  1 97  0  0

Check the amount of memory being used:


ps aux | grep mysqld

A few good commands to use during the process:

In mySQL:

show global status;

Source:

Best Practices for Configuring Optimal MySQL Memory Usage