Pivotal Labs

Main menu

Skip to primary content
Skip to secondary content
  • About
  • Case Studies
  • Team
    • Executives
    • Locations
      • San Francisco (HQ)
      • Boston
      • Boulder
      • Denver
      • London
      • Los Angeles
      • New York
  • Community
    • Blogs
    • Tech Talks
    • Events
  • Careers
    • Lifestyle
    • Principles & Practices
    • Benefits
    • FAQ
    • Apply
  • Tools
  • Contact
    • Press Room
    • Press Releases
    • In The News
    • Press Kit
  • All
  • Labs
  • Standup
  • Tracker

Collapsing Migrations

Alex Chaffee
Wednesday, December 12, 2007

(6:30 pm: updated to use mysqldump)
(12/14/07: updated to remove db:reset since the Rails 2.0 version now does something different.)
(12/15/07: updated to not set ENV['RAILS_ENV'] since that gets passed down to child processes)

There was an old hacker who lived in a shoe; she had so many migrations she didn’t know what to do. Every time her build ran clean, she spent a whole minute staring at the screen.

Fortunately, she read this blog post and now her db:setup task is so fast she’s started building multiple test environments so she can run tests in parallel!

  • Figure out what migration to collapse to. This number should be less than or equal to the oldest deployed version of your app. E.g. if most of your deployments are on version 348 but there’s one client running a branch that’s only up to version 298, then pick 298 (or 297 if you’re afraid of off-by-one errors). For this example we will use 100.

  • Install lib/tasks/db.rake and lib/db_tasks.rb (source below)

  • Clear the development database by running

    rake db:clear

  • Dump the development structure by running

    rake db:dump

  • Delete all the migrations up to and including your target version. Here’s a sneaky awk script that deletes everything up to and including 100. (Go ahead and run it, it won’t bite, and you can always revert.)

    ls db/migrate/ | awk ‘{split($0, a, “_”); if(a[1]<=100) print $0}’ | xargs svn rm

  • Create a new migration called “100_collapsed_migrations.rb” using the following template.

100_collapsed_migrations.rb:

class CollapsedMigrations < ActiveRecord::Migration
  def self.up
    sql = <<-SQL
  # development_structure.sql goes here
    SQL

    execute("SET FOREIGN_KEY_CHECKS=0")
    sql.split(";").each do |statement|
      execute(statement)
    end
  ensure
    execute("SET FOREIGN_KEY_CHECKS=1")
  end

  def self.down
    raise IrreversibleMigration
  end
end
  • Open up db/development_dump.sql and copy its entire contents into your clipboard, then paste it above the “SQL” line in your new migration 100.

  • Search for the statement that creates the schema_info table and remove it.

Mine looks like this:

CREATE TABLE `schema_info` (
  `version` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
  • Set up your databases and run your tests.

    rake db:setup test

  • Congratulations! Your migrations are now blazingly fast, just like back in the (scaff)old days. You can run “rake db:setup” any time you get a svn update that looks like it may have done something funky to your schema, rather than shying away from that minute-long migration and just hoping your tests still pass.

Why do we need to use db:dump rather than db:schema:dump? Well, unfortunately, db:schema:dump doesn’t dump everything. It misses CONSTRAINT statements and also seems to get the charset wrong (although that may have been a function of how I constructed the db in my test). And db:structure:dump misses any data that may have been added by your migrations.

Here’s my current db.rake. Unfortunately, it only works with MySQL, but if you want to make it support your favorite DB (or even your least favorite) then please go right ahead.

Oh, and that part about multiple test environments and parallellized tests? Stay tuned… :-)

db.rake:

require "db_tasks"

namespace :db do
  def tasks
    (@db_tasks ||= DbTasks.new(self))
  end

  desc "Drop and recreate database"
  task :clear => :environment do
    tasks.clear
  end

  desc "Clear and migrate dev and test databases, and load fixtures into development db"
  task :setup => :environment do
    tasks.setup
  end

  desc "Dump the current environment's database schema and data to, e.g., db/development_dump.sql (optional param: FILE=foo.sql)"
  task :dump => :environment do
    if ENV['FILE']
      tasks.dump ENV['FILE']
    else
      tasks.dump
    end
  end

  desc "Load an sql file (by default db/development_dump.sql). (Optional param: FILE=foo.sql)"
  task :load => :environment do
    if ENV['FILE']
      tasks.load ENV['FILE']
    else
      tasks.load
    end
  end
end

db_tasks.rb:

# This creates a duplicate of the database config for a db config as defined in database.yml.
# For example, if the "test" database is named "myapp_test",
# for clone number 0, the new environment is named "test0", and the database is "myapp_test0".
# All other settings are preserved (esp. username and password).
module ActiveRecord
  class Base
    def self.clone_config(original_config, worker_number)
      original = configurations[original_config.to_s]
      raise "Could not find conguration '#{original_config}' to clone" if original.nil?
      worker_config = original.dup
      worker_config["database"] += worker_number.to_s
      configurations["#{original_config}#{worker_number}"] = worker_config
    end
  end
end

class DbTasks
  def initialize(rake)
    @rake = rake
  end

  def init
    connect_to('development')
    clear_database
    migrate_database
    dump
    test_environments.each do |test_db|
      if test_db =~ /([0-9]+)$/
        clone_test_config($1.to_i)
      end
      connect_to(test_db)
      clear_database
      load
    end
  end

  # db:clear -> drop and create db for RAILS_ENV
  def clear
    clear_database
  end

  # db:setup -> drop, create, and migrate dbs for test and development environments, and import fixtures into development
  def setup
    init
    connect_to 'development'
    load_fixtures
  end

  def dump(file = "#{RAILS_ROOT}/db/#{environment}_dump.sql")
    puts "Dumping #{database} into #{file}"
    system "mysqldump #{database} -u#{username} #{password_parameter} --default-character-set=utf8 > #{file}"
  end

  def load(sql_file = "#{RAILS_ROOT}/db/development_dump.sql")
    puts "Loading #{sql_file} into #{database}"
    query('SET foreign_key_checks = 0')
    sql_file = File.expand_path(sql_file)
    IO.readlines(sql_file).join.split(";").each do |statement|
      query(statement.strip) unless statement.strip == ""
    end
    query('SET foreign_key_checks = 1')
  end

  protected

  def clone_test_config(worker_num)
    ActiveRecord::Base.clone_config("test", worker_num)
  end

  def connect_to(environment)
    ActiveRecord::Base.establish_connection(environment)
    @environment = environment
    Object.const_set(:RAILS_ENV, environment)
    # Note: don't set ENV['RAILS_ENV'] since that gets passed down to invoked tasks (including 'rake test')
  end

  def environment
    (@environment ||= RAILS_ENV)
  end

  def test_environments
    environments = ['test']
    if Object.const_defined?(:TEST_WORKERS)
      TEST_WORKERS.times do |worker_num|
        environments << "test#{worker_num}"
      end
    end
    environments
  end

  def load_fixtures
    puts "Loading fixtures into #{environment}"
    Rake::Task["db:fixtures:load"].invoke
  end

  def clear_database
    puts "Clearing #{environment} database"
    sql = "drop database if exists #{database}; create database #{database} character set utf8;"
    cmd = %Q|mysql -u#{username} #{password_parameter} -e "#{sql}"|
    # puts "executing #{cmd.inspect}"
    system(cmd)
  end

  def migrate_database
    puts "Migrating #{environment} database"
    ActiveRecord::Migration.verbose = false
    Rake::Task["db:migrate"].invoke
  end

  def config(env = environment)
    ActiveRecord::Base.configurations[env]
  end

  def query(sql)
    ActiveRecord::Base.connection.execute(sql)
  end

  def database
    config["database"]
  end

  def username
    config["username"]
  end

  def password
    config["password"]
  end

  def password_parameter
    if password.nil? || password.empty?
      ""
    else
      "-p#{password}"
    end
  end

  def execute(cmd)
    puts "t#{cmd}"
    unless system(cmd)
      puts "tFailed with status #{$?.exitstatus}"
    end
  end

  def system(cmd)
    @rake.send(:system, cmd)
  end
end
  • 0 Shares
  • Share on Facebook
  • Share on Twitter

6 Comments

  1. JB says:

    A lot of people use migrations to seed data as well as structure. Regardless of whether this is a good practice, can you tell us whether the structure dump includes data, or just schema?

    December 12, 2007 at 11:50 pm

  2. Tim Connor says:

    Sweet, I was just thinking about collapsing our hefty migrations at work. They’re getting to be a bit of a pain with lots of seed data migrations for really outdated schemas. This is a good excuse to do it and then fixturize our dev seed data.

    December 12, 2007 at 11:50 pm

  3. Alex C says:

    The structure dump does not include data, which makes a certain amount of sense because loading fixtures wipes out all data regardless of whether it was added by migrations or later. But it is a flaw with the current code, since some tables may not be fixturized, and I’m working right now on making it use mysqldump rather than db:structure:dump.

    Too bad, I was hoping to use the built-in Rails rake tasks as much as possible to make it more shareable (and easier to port to other db engines).

    December 12, 2007 at 11:50 pm

  4. Alex Chaffee says:

    There have been some recent threads on the rails-core list about this topic (db setup and seed data).

    * http://groups.google.com/group/rubyonrails-core/browse_thread/thread/92e6bdccf4a05feb
    * http://groups.google.com/group/rubyonrails-core/browse_thread/thread/d871469cb2a6589a
    * http://groups.google.com/group/rubyonrails-core/browse_thread/thread/eddc46fd661d6be9

    We’ve been using db:setup — which drops, creates, and migrates test and dev dbs, and loads fixtures into dev — for about 1-1/2 years now and it works well for us. Some of our use cases are:

    * A fresh checkout of the project on a new workstation. (This happens at least once a week, between branches, laptops, new people, etc.)
    * An update which altered migrations. Note that at least some of the time, we will alter migrations which have already been checked in. This is because we do continuous integration, so it’s better for us to check in something that’s imperfect and then improve it later. Once we push code to a live site, of course, the migrations are fixed. But during development, it’s not unusual for a developer to send an email around saying “please run db:setup on the next update — I changed 024_create_universe.rb”.
    * Our Continuous Build box needs to run db:setup on every build.

    Basically, there are so many little things involved with database setup that it’s nice to have a single, simple command you can run to make sure you’re up to date. “Did you run db:setup?” becomes a common refrain when someone’s complaining of a broken build.

    Also, we use (abuse?) migrations by doing two things that are apparently not done by the rails core team: seed data (mostly for things like enumerations, but sometimes things like the admin user or the default data set), and extra SQL like referential integrity constraints or fancy indexes (which aren’t supported by the db:schema output).

    December 13, 2007 at 3:09 pm

  5. Alex C says:

    More thread:
    http://groups.google.com/group/rubyonrails-core/browse_thread/thread/fb324a9f722cdf7e?hl=en

    February 22, 2008 at 12:45 am

  6. Alex Chaffee says:

    The new Rails task “db:migrate:reset” does *almost* what db:setup does. It drops, creates and migrates — but it only does it for one database (by default, development). db:setup does it for test and development, which means it can serve the purpose of “make it work” after a questionable update or during a continuous build.

    April 6, 2008 at 5:12 pm

Add New Comment Cancel reply

Your email address will not be published.

Alex Chaffee

Recent Posts

  • Code Monkey
  • RubyGems Warningitis Outbreak
  • Upgrading your Rakefile from RSpec 1.3 to RSpec 2
Subscribe to Alex's Feed

Author Topics

ruby (14)
gem (5)
ruby on rails (24)
erector (2)
rspec (4)
activerecord (4)
database (3)
sinatra (1)
postgresql (1)
css (1)
html (1)
git (2)
mac (5)
java (3)
agile (12)
iphone (1)
migrations (1)
fun (5)
dot (1)
graphing (1)
subversion (1)
test (1)
demeter (1)
puzzler (1)
  • About
  • Case Studies
  • Team
  • Community
  • Careers
  • Tools
  • Contact
  • Labs
  • Events

Contact Us

contact@pivotallabs.com
+1 415-77-PIVOT
TwitterLinkedInFacebook

Pivotal Tracker

Tracker is the award-winning agile project management tool that enables real-time collaboration around a shared, prioritized backlog.
Visit pivotaltracker.com >