Script for Easy MySQL Migrations

This is a bit of code I wrote to handle MySQL migration scripts for my projects. This was inspired by the South project for Django, which was not meeting my needs (especially on non-Django projects), but had some good ideas. To use the migrate.py script, create a directory and put all your migration scripts in it, then execute the migrate.py script.

How to do it…

Create a migration directory and fill it with files starting with a 4 digit number and ending with .sql:

migration directory

An example migration script might be 0001_adding_two_new_tables.sql:

CREATE TABLE `zipcode` (
	`id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
	`value` integer NOT NULL
);

CREATE TABLE `state` (
	`id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
	`value` integer NOT NULL
);

On the same level as the directory, put the migrate.py script and execute it:

python migrate.py

How it works…

The script needs database access to record executed scripts, so you will need to define:

HOST_NAME = mysql_host_name
USER_NAME = mysql_user_name
USER_PASS = mysql_password
DATABASE  = mysql_db_name

TABLE_NAME= db_migrations

Additionally, you can configure the path to the migration directory and the extension used for the migration files; below are the defaults:

PATH_TO_MIGRATIONS = migrations
MIGRATION_EXTENSION = sql

The script, first connects to the database and evaluates if the TABLE_NAME exists, creating the table when doesnt exist:

CREATE_TABLE_SQL = """
 CREATE TABLE `%s`.`%s` (
`id` MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`execute_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`script_name` VARCHAR( 79 ) NOT NULL
) ENGINE = InnoDB
"""

# find the last run script in the DB
conn = MySQLdb.connect (host = HOST_NAME,
						user = USER_NAME,
						passwd = USER_PASS,
						db = DATABASE)

cursor = conn.cursor()

cursor.execute("SHOW TABLES LIKE %s" % TABLE_NAME)

if not cursor.fetchone():
	cursor.execute(CREATE_TABLE_SQL % (DATABASE, TABLE_NAME))


Next the script reads the last executed script from the database and all the files in the PATH_TO_MIGRATIONS ending with MIGRATION_EXTENSION. It orders the scripts that have not been executed numerically using the 4 digit prefix (0009 before 0010, before 0011, before 0100, etc.). All text after the number is ignored by the system; use it for your own understanding (must be under 60 characters in length):

# find the migration # from the script name
def _get_number(filename):
	m = re.search(\d+, filename)
	return m.group(0)
	
cursor.execute(SELECT `script_name` FROM `%s` ORDER BY `id` DESC % TABLE_NAME)
rs = cursor.fetchone()

if rs:
	last_script_num = int(_get_number(rs[0]))
	process_scripts = False
else:
	last_script_num = 0
	process_scripts = True

# order the files by their migration #
fileset = []
sortset = []
filemap = {}

for filename in glob.glob( os.path.join(PATH_TO_MIGRATIONS, *.%s % MIGRATION_EXTENSION) ):
	key = _get_number(filename)

	if int(key) > last_script_num:
		sortset.append(key)
		filemap[key] = filename

sortset.sort()
print sortset

Lastly, the script iterates over the sorted files and executes each one of them, adding a record to the TABLE_NAME so that migrations are not run twice. If there is an SQL error, the script will stop and write the exception to the standard error output:

try:
#	iterate over the files after the last migration script run
	for key in sortset:
		filename = filemap[key]
		# execute queries from file
		file = open(filename,"r")
		print filename
		sql = file.read()
#		print sql
		file.close()
		cursor.execute(sql)
		cursor.close()

		# notify database that migration was run
		cursor = conn.cursor()
#		print "INSERT INTO `%s` (`script_name`) VALUES (%s)" % (TABLE_NAME, filename)
		cursor.execute( "INSERT INTO `%s`.`%s` (`script_name`) VALUES (%s);" % (DATABASE, TABLE_NAME, filename) )
		cursor.execute("commit")
except Exception as e:
	sys.stderr.write(e)
	exit()

cursor.close()
conn.close()

Theres more…

This tool works really well for most SQL operations, but can have difficultly with foreign key constraints on INNODB typed databases. If you remove a constraint from a column and then change a column name, or change a column name and then add a constraint, an errno 150 will occur. I believe this is because the each migration script is executed as a single statement, and so the transaction hasn&rsquot;t completed before you are referencing the new value. I have worked around this by never removing or adding foreign key constraints in the same migrations that I change column names. However, a future improvement to this script might be to read complete SQL statements from the migrations and execute them one at a time.