Moving a Production MySQL Database to Amazon RDS with Minimal Downtime

Untitled
I recently had to move our production database at Listia.com from a custom MySQL installation over to Amazon RDS.  The big benefits of RDS for us were automated backups, automatic failover to another availability zone, quick instance size scaling with no downtime (edit: up to 3 minutes of downtime, can be more if there are issues) , and super easy slave deployment.  We had been managing all of it ourselves over the past year with just 2-3 people (btw, we are well-funded and hiring now!) and it was taking up a lot of our time to maintain everything while scaling from 0 to over 20 million pageviews a month.

Everyone hates downtime, but for us it was even more critical to keep it to the absolute minimum.  Listia is an auction-style marketplace with hundreds of items ending every hour.  It's really unfair to users if we interrupt auctions because people love bidding at the last minute and if we are down while an auction is ending, that could affect the outcome by 100% or more.  So, we found a window late at night where no auctions were ending within several minutes and set out to find a way to migrate the entire database within that window.

One solution would have been to run our new RDS instance as a slave for a while and then switch it over to be the new master as we disconnect the old one, but unfortunately RDS does not let you run those instances as slaves, so we were out of luck there.  The other solution I thought of was to simply stop the production database, copy over the data files using a utility or script and then launch RDS, switch over IPs, etc.  RDS made this impossible again because you have no access to the underlying file system.

After a lot of different ideas including simply taking the site offline for an hours or so.. I finally figured out what I needed to do.  Simply dump the production database using mysqldump and the following options:

mysqldump --single-transaction --master-data=2 -C -q <my_production_database_name> -u root -p > backup.sql

If you are using InnoDB tables, this will dump your database to a file without locking the tables (--single-transaction) AND save the bin log position in that file as a comment (--master-data=2).  This is important because the dump itself takes a while to complete and loading it into RDS using the command below takes another 30 minutes to an hour in our case, and we didn't want the site down for that long as I mentioned before.

mysql -u root -p -h <rds_hostname> <my_production_database_name> < backup.sql

The cool thing is that I could now open up the backup.sql file to find the last bin log position in the comments at the top.  With this knowledge, I could use the following command to have RDS catch up to the production database that was still running:

mysqlbinlog mysql-bin.002971 --start-position=73319797 | mysql -u root -p -h <rds_hostname> <my_production_database_name>
This example assumes we left off at bin log file 002971 and position 73319797

On top of that, I could keep doing this throughout the day or just right before our migration as long as I remembered what position I was at to make sure our new RDS instance was only a few moments behind the old production database right before we made the switch.

Then, when everything was ready.. all I had to do was put up our maintenance page, sever all connections to our production database, run mysqlbinlog one last time to make sure RDS is fully caught up (which only takes a second since we made sure it was very close already), switch the IPs, and re-deploy our app.  After that, the maintenance page is removed and all traffic starts flowing to RDS.  Thinking back, it was a pretty simple procedure, but it definitely took us a while to figure out what to do... hopefuilly this will be useful to anyone else thinking about moving to RDS.

If you love working on problems like this, here's one more shamelss plug for our open positions at Listia :-)  Oh, and if you have a better solution or find something wrong with ours, I'd absolutely love to hear it as well.

#sus09

(download)

Road trip

(download)

Visiting Posterous

Visited the Posterous crew at their SF office today.

RC Car Drift 2

(download)

RC car drifting

(download)

Video test

(download)

Anybots Robot

(download)