John Pignata's blog



John PignataJohn Pignata
Converting Rails application data from MySQL to PostgreSQL
edit Posted by John Pignata on Monday November 23, 2009 at 06:07PM

One of our projects had a pending chore in Tracker to move its backend to PostgreSQL from MySQL. This project has about a quarter of a million rows of production data and around a hundred tables in its schema which needed to be exactly migrated into PostgreSQL.

Forklifting the data proved more complicated than expected due to incompatibilities in the two DBMS' syntax such as in the way string escaping worked, how booleans were represented and a bunch of other small but painful differences. Despite MySQL's mysqldump utility including a command-line option to write statements in PostgeSQL format, it became clear that it wasn't going to be simple to create a repeatable procedure to do this work across our environments.

There's a bunch of information out there about how to approach this problem but none felt right. Most are multi-step manual procedures that require altering a dump file using sed or perl and others require the data to be loaded into an intermediary database and massaged prior to import. After testing some of these approaches, Todd and I decided to timebox ourselves to an hour to test the viability of a Ruby script using the DBI gem to move the data. We came up with:

require 'dbi'
require 'dbd/mysql'
require 'dbd/pg'

begin
  mysql = DBI.connect("DBI:Mysql:source:localhost", "username", "password")
  postgres = DBI.connect("DBI:Pg:destination:localhost", "username", "password")

  mysql.select_all("SHOW TABLES") do |table|
    next if ['schema_migrations', 'sessions'].include?(table.to_s)
    select = mysql.execute("SELECT * FROM #{table}")
    columns = select.column_names.map { |key| "\"#{key}\"" }.join(', ')
    placeholders = (['?'] * select.column_names.size).join(', ')
    insert = postgres.prepare("INSERT INTO #{table} (#{columns}) VALUES(#{placeholders})")
    select.each { |row| insert.execute(*row) }
    insert.finish
  end
rescue DBI::DatabaseError => e
  puts "Error #{e.err}: #{e.errstr}"
ensure
  mysql.disconnect if mysql
  postgres.disconnect if postgres
end

Our antiquely Perl-like script worked better than we expected — our application started right up with all of its data intact.

Has anybody out there encountered this need before? What kinds of solutions did you come up with?