Ever got to deploy day only to have errors streaming in about the site not being able to be accesses. Most likely in your logs or error emails / alerts you get something like this:
(2003, "Can't connect to MySQL server on '127.0.0.1' (111)")
What the hell does this mean, it means MySQL is down and if it was running it means it has crashed. Whoah but if you check the site now it is working...that means nothing the website is unstable as certain actions make it go down. We need to achieve system stability.
The most common reason for MySQL going down is that it is using more memory than physical memory can allocate it then uses virtual memory and crashes soon after that.
What can I do?
First double check the error logs and see if there is a configuration error causing this issue. If there is no clear warning you want to check memory usage.
Check if you are using virtual memory with
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 0 499228 72040 358540 0 0 16 27 230 280 2 1 97 0 0
Check the amount of memory being used:
ps aux | grep mysqld
A few good commands to use during the process:
show global status;