Pivotal Labs

Monthly Archives: December 2007

Collapsing Migrations

Alex Chaffee
Wednesday, December 12, 2007

(6:30 pm: updated to use mysqldump)
(12/14/07: updated to remove db:reset since the Rails 2.0 version now does something different.)
(12/15/07: updated to not set ENV['RAILS_ENV'] since that gets passed down to child processes)

There was an old hacker who lived in a shoe; she had so many migrations she didn’t know what to do. Every time her build ran clean, she spent a whole minute staring at the screen.

Fortunately, she read this blog post and now her db:setup task is so fast she’s started building multiple test environments so she can run tests in parallel!

  • Figure out what migration to collapse to. This number should be less than or equal to the oldest deployed version of your app. E.g. if most of your deployments are on version 348 but there’s one client running a branch that’s only up to version 298, then pick 298 (or 297 if you’re afraid of off-by-one errors). For this example we will use 100.

  • Install lib/tasks/db.rake and lib/db_tasks.rb (source below)

  • Clear the development database by running

    rake db:clear

  • Dump the development structure by running

    rake db:dump

  • Delete all the migrations up to and including your target version. Here’s a sneaky awk script that deletes everything up to and including 100. (Go ahead and run it, it won’t bite, and you can always revert.)

    ls db/migrate/ | awk '{split($0, a, "_"); if (a[1] <= 100) print "db/migrate/" $0}' | xargs svn rm

  • Create a new migration called “100_collapsed_migrations.rb” using the following template.

100_collapsed_migrations.rb:

class CollapsedMigrations < ActiveRecord::Migration
  def self.up
    sql = <<-SQL
  # contents of db/development_dump.sql go here
    SQL

    execute("SET FOREIGN_KEY_CHECKS=0")
    sql.split(";").each do |statement|
      # skip the blank fragment left after the final semicolon
      execute(statement) unless statement.strip.empty?
    end
  ensure
    execute("SET FOREIGN_KEY_CHECKS=1")
  end

  def self.down
    raise ActiveRecord::IrreversibleMigration
  end
end

  • Open up db/development_dump.sql and copy its entire contents into your clipboard, then paste it above the “SQL” line in your new migration 100.

  • Search for the statement that creates the schema_info table and remove it.

Mine looks like this:

CREATE TABLE `schema_info` (
  `version` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

  • Set up your databases and run your tests.

    rake db:setup test

  • Congratulations! Your migrations are now blazingly fast, just like back in the (scaff)old days. You can run “rake db:setup” any time you get an svn update that looks like it may have done something funky to your schema, rather than shying away from that minute-long migration and just hoping your tests still pass.
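
If you'd rather not trust awk, the file-selection part of the deletion step can be sketched in plain Ruby. This is only an illustration: the helper name migrations_to_collapse and the sample file names are made up for this example, and it only picks the files, leaving the actual svn rm to you.

```ruby
# Hypothetical helper (not part of db.rake): select the migration files
# whose numeric prefix is at or below the target version.
def migrations_to_collapse(filenames, target_version)
  filenames.select do |name|
    prefix = name[/\A(\d+)_/, 1]   # leading digits before the first "_", or nil
    prefix && prefix.to_i <= target_version
  end
end

# Made-up file names, standing in for Dir.entries("db/migrate")
files = ["099_add_users.rb", "100_add_items.rb", "101_add_orders.rb", "README"]
selected = migrations_to_collapse(files, 100)
puts selected
```

Files without a numeric prefix are simply ignored, so stray files in db/migrate are left alone.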

Why do we need to use db:dump rather than db:schema:dump? Well, unfortunately, db:schema:dump doesn’t dump everything. It misses CONSTRAINT statements and also seems to get the charset wrong (although that may have been a function of how I constructed the db in my test). And db:structure:dump misses any data that may have been added by your migrations.

Here’s my current db.rake. Unfortunately, it only works with MySQL, but if you want to make it support your favorite DB (or even your least favorite) then please go right ahead.

Oh, and that part about multiple test environments and parallelized tests? Stay tuned… :-)

db.rake:

require "db_tasks"

namespace :db do
  def tasks
    (@db_tasks ||= DbTasks.new(self))
  end

  desc "Drop and recreate database"
  task :clear => :environment do
    tasks.clear
  end

  desc "Clear and migrate dev and test databases, and load fixtures into development db"
  task :setup => :environment do
    tasks.setup
  end

  desc "Dump the current environment's database schema and data to, e.g., db/development_dump.sql (optional param: FILE=foo.sql)"
  task :dump => :environment do
    if ENV['FILE']
      tasks.dump ENV['FILE']
    else
      tasks.dump
    end
  end

  desc "Load an sql file (by default db/development_dump.sql). (Optional param: FILE=foo.sql)"
  task :load => :environment do
    if ENV['FILE']
      tasks.load ENV['FILE']
    else
      tasks.load
    end
  end
end

db_tasks.rb:

# Creates a duplicate of a database config defined in database.yml.
# For example, if the "test" database is named "myapp_test",
# for clone number 0, the new environment is named "test0", and the database is "myapp_test0".
# All other settings are preserved (esp. username and password).
module ActiveRecord
  class Base
    def self.clone_config(original_config, worker_number)
      original = configurations[original_config.to_s]
      raise "Could not find configuration '#{original_config}' to clone" if original.nil?
      worker_config = original.dup
      worker_config["database"] += worker_number.to_s
      configurations["#{original_config}#{worker_number}"] = worker_config
    end
  end
end

class DbTasks
  def initialize(rake)
    @rake = rake
  end

  def init
    connect_to('development')
    clear_database
    migrate_database
    dump
    test_environments.each do |test_db|
      if test_db =~ /([0-9]+)$/
        clone_test_config($1.to_i)
      end
      connect_to(test_db)
      clear_database
      load
    end
  end

  # db:clear -> drop and create db for RAILS_ENV
  def clear
    clear_database
  end

  # db:setup -> drop, create, and migrate dbs for test and development environments, and import fixtures into development
  def setup
    init
    connect_to 'development'
    load_fixtures
  end

  def dump(file = "#{RAILS_ROOT}/db/#{environment}_dump.sql")
    puts "Dumping #{database} into #{file}"
    system "mysqldump #{database} -u#{username} #{password_parameter} --default-character-set=utf8 > #{file}"
  end

  def load(sql_file = "#{RAILS_ROOT}/db/development_dump.sql")
    puts "Loading #{sql_file} into #{database}"
    query('SET foreign_key_checks = 0')
    sql_file = File.expand_path(sql_file)
    IO.readlines(sql_file).join.split(";").each do |statement|
      query(statement.strip) unless statement.strip == ""
    end
    query('SET foreign_key_checks = 1')
  end

  protected

  def clone_test_config(worker_num)
    ActiveRecord::Base.clone_config("test", worker_num)
  end

  def connect_to(environment)
    ActiveRecord::Base.establish_connection(environment)
    @environment = environment
    Object.const_set(:RAILS_ENV, environment)
    # Note: don't set ENV['RAILS_ENV'] since that gets passed down to invoked tasks (including 'rake test')
  end

  def environment
    (@environment ||= RAILS_ENV)
  end

  def test_environments
    environments = ['test']
    if Object.const_defined?(:TEST_WORKERS)
      TEST_WORKERS.times do |worker_num|
        environments << "test#{worker_num}"
      end
    end
    environments
  end

  def load_fixtures
    puts "Loading fixtures into #{environment}"
    Rake::Task["db:fixtures:load"].invoke
  end

  def clear_database
    puts "Clearing #{environment} database"
    sql = "drop database if exists #{database}; create database #{database} character set utf8;"
    cmd = %Q|mysql -u#{username} #{password_parameter} -e "#{sql}"|
    # puts "executing #{cmd.inspect}"
    system(cmd)
  end

  def migrate_database
    puts "Migrating #{environment} database"
    ActiveRecord::Migration.verbose = false
    Rake::Task["db:migrate"].invoke
  end

  def config(env = environment)
    ActiveRecord::Base.configurations[env]
  end

  def query(sql)
    ActiveRecord::Base.connection.execute(sql)
  end

  def database
    config["database"]
  end

  def username
    config["username"]
  end

  def password
    config["password"]
  end

  def password_parameter
    if password.nil? || password.empty?
      ""
    else
      "-p#{password}"
    end
  end

  def execute(cmd)
    puts "\t#{cmd}"
    unless system(cmd)
      puts "\tFailed with status #{$?.exitstatus}"
    end
  end

  def system(cmd)
    @rake.send(:system, cmd)
  end
end
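
One thing worth knowing about the load method above: it splits the dump on semicolons and skips blank fragments. Here is a minimal standalone sketch of that splitting, with no database involved. The helper name split_statements and the sample SQL are assumptions made up for illustration; the approach quietly assumes no statement contains a literal ";" inside a string value.

```ruby
# Naive statement splitter, mirroring the split(";") approach used in load.
# Assumption: no statement contains a literal ";" inside a string or comment.
def split_statements(sql)
  sql.split(";").map { |s| s.strip }.reject { |s| s.empty? }
end

# Made-up dump contents for illustration
dump = <<-SQL
  CREATE TABLE `users` (`id` int(11) NOT NULL);
  INSERT INTO `users` VALUES (1);

SQL

statements = split_statements(dump)
statements.each { |s| puts s }

# The limitation: a semicolon inside a string value is split too.
broken = split_statements("INSERT INTO t VALUES ('a;b');")
puts broken.length
```

If your migrations insert data containing semicolons, the dump will be cut in the wrong place, so check the collapsed migration before trusting it.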

Thoughts on Linus Torvalds's Git Talk

Joe Moore
Wednesday, December 12, 2007

At Pivotal Labs last week we watched Linus Torvalds’s Google talk about Git, the Source Code Management (SCM) system he wrote and uses to manage the Linux kernel code.

I’ve watched it twice now and here are some thoughts, based on quotes and themes from the video.

“I Never Care About Just One File”

Linus stated that one of the reasons Git was wonderful for him is that, as a high level code maintainer, he needs to merge thousands of files at once. In fact, he stated that he never cares about just one file.

Not so for me. As an in-the-trenches developer, my whole life is caring about just one file, over and over again. When I merge, I care about each file because, since I work on small teams and with small codebases, there is a fairly high likelihood that my changes will collide with those from another developer.

“The Repository Must Be Decentralized…. You Must Have a Network of Trust”

Linus made the point that central repositories suck for large projects where the morons must not have commit access — only the super privileged are allowed to commit code back to the repo. He argues that Git is better because it is a decentralized network of repositories — there is no central master, only Some Dudes who have repositories. Usually there is Some Dude In Charge, like Linus, and everyone tends to pull code from them. To update the “master” code version, Some Dude In Charge pulls code from the repositories owned by Some Other Wicked Smart Dudes, who have most likely pulled code from Some Other Trusted Dudes (And One Gal), and so on. Thus, rather than limit access to just the hand-selected few, everyone has their own local copy of the repository, and the smart merge from the smart who merge from the smart, resulting in some kind of official or de facto version.

While I like the local copy of the repo idea, Pivotal does not work the way Linus describes… but Pivotal is weird, in a good way. We all have full commit rights. Our network of trust is everyone. The Dude In Charge is named Continuous Integration. CI makes the official versions. CI runs the tests. CI makes sure that the deploy process works. I’m sure that we could coerce Git into working in a centralized-like way, where it merges automatically from the individual developers and runs the builds, but I’m not sure if that would be forcing a square peg into a penguin-shaped hole.

“Some Companies Use Git And Don’t Even Know It”

Linus described how developers at some companies use Git on their development machines, committing their changes and merging fellow developers’ changes with Git, then pushing those changes to central SVN repos. He rather mocked this, but it actually sounds like a good solution: developers merge, so use the tool that’s good at that. CI machines and deploy machines love centralized master repositories, so use that for those jobs.

“It Does Not Matter How Easy It Is To Branch, Only How Easy It Is to Merge”

Well said. I never thought about that before but he is completely right. I could never put my finger on why I never branch in SVN, even though it’s practically ‘free’ and easy to do. Now it’s obvious: who cares how easy it is to branch when merging sucks? Git is supposed to make merging incredibly easy because Git is content-aware rather than just file-aware… or something like that. I’ll believe it when I see it, but if Git really does make merging highly divergent branches easy then I’ll give it a try.

Joe’s Take

I’d like to try Git, especially if it makes branching and merging those branches as easy as Linus suggests, but I don’t think that Pivotal would get as much benefit out of it as large, distributed open source projects. A ‘really big’ project might have 10 developers, not thousands, and all must have commit rights. Our network of trust goes like this: if you are here, we trust you; if we don’t trust you, you have to leave. And the idea of having to merge directly from my fellow developers sounds like a pain in the ass… why would I want to merge from 3 separate pairs when I can pull code from the central repo and be reasonably sure (thanks to CI) that it is clean and green? Hopefully I’ll be able to answer those questions soon by using Git on a project.

(Note: originally posted on my personal blog at http://40withegg.com/2007/12/11/thoughts-on-linus-torvalds-s-git-talk)


OpenSocial

Pivotal Labs
Tuesday, December 11, 2007

Okay, it’s not ready for prime time. But it was entertaining digging into OpenSocial and making some gadgets for the Orkut sandbox. The ability to write arbitrary data easily to a feed stream from the client side is going to be very handy.

Now I just want someone to explain to me why there are already several projects on Rubyforge that claim to be Rails OpenSocial projects. Someone should just port from Shindig after it’s ready.


Your Object Mama

Pivotal Labs
Tuesday, December 11, 2007

I’m more and more convinced that Object Mothers are the way to go to manage test data. I owe my introduction to this pattern to David Goudreau.

Keep a few users and items in fixture data so that you can start up your application and test without adding things from the UI… and then let your tests be self-sufficient: make an Object Mother that generates objects for you with sensible defaults. Instead of relying on fixtures, you know you have an object that nobody else will change. Excellent for reducing test fragility.
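
To make the idea concrete, here is a minimal sketch of what an Object Mother might look like. Everything in it is hypothetical: the User struct stands in for whatever model you have, and the ObjectMother class and its defaults are illustrative, not code from a real project.

```ruby
# Hypothetical domain object; in a Rails app this would be a model class.
User = Struct.new(:name, :email, :admin)

# Object Mother: one place that knows how to build valid test objects.
class ObjectMother
  @counter = 0

  class << self
    # Unique number so that every generated object is distinct
    def next_id
      @counter += 1
    end

    # Build a user with sensible defaults; a test overrides only what it cares about.
    def user(overrides = {})
      n = next_id
      attrs = {
        :name  => "User #{n}",
        :email => "user#{n}@example.com",
        :admin => false
      }.merge(overrides)
      User.new(attrs[:name], attrs[:email], attrs[:admin])
    end
  end
end

admin = ObjectMother.user(:admin => true)
other = ObjectMother.user
puts admin.email
puts other.email
```

Each test asks the mother for a fresh object and states only the attribute that matters to it, so nobody else's fixture edits can break it.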

It can make builds a bit slower, but the test robustness is worth the tradeoff to me.

I’m busy adding it to my Selenium Tests today and basking in test robustness.


Pivots at RubyConf

Pivotal Labs
Tuesday, December 11, 2007

The Confreaks site has released a slew of videos from the 2007 RubyConf. The videos are in the perfect format: side by side streams with the projection material on one side and the stream of the speaker on the other.

Many great sessions are covered, including a good one on the Treetop parser presented by Pivotal Labs’ own Nathan Sobo. As a bonus, in the audience you can spot other Pivots you may be familiar with from this weblog (Nick Kallen and Brian Takita).


Making Ruby Look Like <strike>Smalltalk</strike> <strike>Haskell</strike> <strike>Erlang</strike> Ruby

Pivotal Labs
Sunday, December 9, 2007

As Seen on TV

Inspired by Haskell

Did you ever want to write Ruby Code like:

x = 1
increment(x).by(6)

Now you can:

def increment(variable)
  chain do
    by do |delta|
      variable + delta
    end
  end
end

This is an OO version of a technique called Currying:

g = 'hello world'.index_of('o')
h = g.starting_at(6)
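
For comparison, here is plain functional currying, sketched with Proc#curry. This assumes Ruby 1.9 or later, and the lambda and variable names are made up for illustration; it is not part of the Chain code.

```ruby
# Proc#curry turns a two-argument lambda into a chain of one-argument calls.
increment = lambda { |variable, delta| variable + delta }
curried = increment.curry

by = curried[1]   # supplies the first argument, returns a lambda awaiting delta
result = by[6]    # supplies the second argument and computes 1 + 6
puts result
```

The chained version reads the same way, except that each intermediate step is a named method call on an object rather than an anonymous function call.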

Inspired by Smalltalk:

'hello world' indexOf: $o startingAt: 6

Let’s do this in Ruby:

class String
  def index_of(substring)
    chain do
      starting_at do |starting_at|
        ...
      end
    end
  end
end

Now, in Ruby:

'hello world'.index_of('o').starting_at(6)

Inspired by Erlang

Here is pseudo-code for an interesting iteration pattern. If the Actor receives ‘lock’ it will not respond to any messages until it receives ‘unlock’:

loop(X) ->
  receive
    'incr' -> loop(X+1);
    'lock' ->
      receive
        'unlock' ->
          loop(X)
      end
  end.

The Ruby equivalent:

def loop(x)
  puts x # added puts just to see what's going on
  chain do
    incr do
      loop(x+1)
    end
    lock do
      unlock do
        loop(x)
      end
    end
  end
end

Try this:

loop(1).incr.incr.incr => prints 1, 2, 3, then 4

Now, the finale: We can respond to incr any number of times till we’re locked; then, we respond to no messages other than unlock; once we’ve received unlock we proceed as before.

loop(1).incr.lock.incr => prints 1, 2, then raises an exception.
loop(1).incr.lock.unlock.incr => prints 1, 2, then 3

How does this work?

The call to chain do ... end creates a new Chain object with the block passed in to the constructor. Chain is a kind of “blank slate”: all methods inherited from Object are undefined so that any messages it receives go through method_missing. The block the Chain is instantiated with is instance-eval’d in the chain’s context, and all method invocations go through method_missing (because of the blank slate). method_missing has two cases. It either dynamically defines a method returning a new link in the Chain (in the case of nested chaining), or it delegates the method back to the object that constructed the chain in the first place. Let’s consider examples of these two cases.

Case 1, dynamically defining a new method:

def foo
  chain do
    a do # define a method named :a on the Chain.
      1
    end
  end
end

foo.a => 1

Case 2, delegating the method back to the creator of the Chain:

def bar
  1
end

def foo
  chain do
    a do
      bar # invokes the bar defined above
    end
  end
end

foo.a => 1

Nested chaining is just a variation on Case 1:

def foo
  chain do
    a do
      b do # create a nested Chain (i.e., a Link)
        1
      end
    end
  end
end

foo.a.b => 1

The only gotcha is knowing whether a method invoked with a block belongs to the object that created the chain or is a nested chain:

def b(&block)
end

def foo
  chain do
    a do
      b do # is this the above b, or a nested Chain?
        ...
      end
    end
  end
end

We prioritize the #b defined on the parent object, rather than creating a nested chain (I feel this is more intuitive).

Here is the source code:

require 'rubygems'
require 'active_support'

class Chain
  instance_methods.each { |m| undef_method m unless m =~ /(^__|^nil\?$|^send$|^instance_exec$)/ }
  delegate :define_method, :respond_to, :to => :__caller
  attr_accessor :__caller

  def __has_links?
    @__has_links
  end

  def initialize(*args, &block)
    if block_given?
      self.__caller = eval("self", block.binding)
      instance_exec *args, &block
    end
  end

  def method_missing(method, *args, &block)
    if block_given? && !__caller.respond_to?(method)
      @__has_links = true
      metaclass.module_eval do
        define_method method do |*args|
          __link(*args, &block)
        end
      end
    else
      __caller.send(method, *args, &block)
    end
  end

  private
  def __link(*args, &block)
    link = Chain.new
    link.__caller = __caller
    result = link.instance_exec(*args, &block)
    link.__has_links? ? link : result
  end

  def metaclass
    class << self
      self
    end
  end
end

def chain(&block)
  Chain.new &block
end

Ruby Quiz (A Trick Question)

Pivotal Labs
Sunday, December 9, 2007

Here is a little Ruby trivium for you.

Type this into IRB:

def foo
  def bar
    1
  end
end

foo.bar
=> 1

Is this some magical lightweight object creation syntax so you can do cool method chaining? Let’s try another example:

def foo
  def foo
    1
  end
end

foo
=> nil

foo.foo
=> 1

So far so good. But now, type:

foo
=> 1

WTF? Is this a defect in Ruby?? Post your responses in the comments.

(Warning: this is a trick question)

Get Rake to always show the error stack trace for your project

Pivotal Labs
Friday, December 7, 2007

Tired of rake hiding your error stack trace?


rake aborted!
Build failed

(See full trace by running task with --trace)

You can have rake always show your error stack trace by going into your project’s Rakefile and setting:


Rake.application.options.trace = true

Now you never need to worry about passing --trace again.
