Pivotal Labs

Parallelize Your RSpec Suite

Pivotal Labs
Friday, May 8, 2009

We all have multi-core machines these days, but most RSpec suites still run in one sequential stream. Let’s parallelize it!

The big hurdle here is managing multiple test databases. When multiple specs are running simultaneously, they each need to have exclusive access to the database, so that one spec’s setup doesn’t clobber the records of another spec’s setup. We could create and manage multiple test databases within our RDBMS. But I’d prefer something a little more … ephemeral, that won’t hang around after we’re done, or require any manual management.

Enter SQLite’s in-memory database, which is a full SQLite instance, created entirely within the invoking process’s own memory footprint.
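To make the "own memory footprint" point concrete, here is a minimal sketch of why process-private memory gives us isolation for free. It uses a plain Ruby Hash as a stand-in for the database (an assumption for illustration; `run_isolated_writers` is a hypothetical helper, not part of our project) and forks a few workers that each write to it.

```ruby
# Stand-in for an in-memory database: a plain Hash living in this process.
# (Hypothetical illustration only; the real setup uses SQLite's ':memory:'.)
def run_isolated_writers(worker_count)
  table = { "users" => [] }

  workers = (1..worker_count).map do |n|
    reader, writer = IO.pipe
    pid = Process.fork do
      reader.close
      # This write lands in the child's private copy of the Hash.
      table["users"] << "user-from-worker-#{n}"
      writer.puts table["users"].join(",")
      writer.close
      exit! 0
    end
    writer.close
    [pid, reader]
  end

  # Collect what each child saw in its own copy of the table.
  seen = workers.map do |pid, reader|
    line = reader.read.strip
    reader.close
    Process.wait(pid)
    line
  end

  { children: seen, parent: table["users"] }
end

result = run_isolated_writers(2)
puts result.inspect
```

Each child sees only its own row, and the parent's table stays empty. A database that lives entirely in process memory should behave the same way after a fork, which is the property the rest of this post leans on.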

(Note #1: the gist for this blog is at http://gist.github.com/108780)

(Note #2: The following strategy is relatively well-known, but I thought it might be useful for Pivots-and-friends to see exactly how one Pivotal project has used this tactic for a big speed win.)

Here’s the relevant section of our config/database.yml:

test-in-memory:
  adapter: sqlite3
  database: ':memory:'

Next, we need a way to indicate to the running Rails process that it should use the in-memory database. We created an initializer file, config/initializers/in-memory-test.rb:

def in_memory_database?
  ENV["RAILS_ENV"] == "test" and
    ENV["IN_MEMORY_DB"] and
    Rails::Configuration.new.database_configuration['test-in-memory']['database'] == ':memory:'
end

if in_memory_database?
  puts "connecting to in-memory database ..."
  ActiveRecord::Base.establish_connection(Rails::Configuration.new.database_configuration['test-in-memory'])
  puts "building in-memory database from db/schema.rb ..."
  load "#{Rails.root}/db/schema.rb" # use db agnostic schema by default
  #  ActiveRecord::Migrator.up('db/migrate') # use migrations
end

Note that in the above, we’re initializing the in-memory database with db/schema.rb, so make sure that file is up-to-date. (Or, you could uncomment the line that runs your migrations.)
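If you want to unit-test the gating logic without booting Rails, a framework-free version of the `in_memory_database?` check might look like the sketch below. The name `use_in_memory_db?` and the idea of passing the environment and parsed database.yml in as arguments are assumptions for illustration, not what our initializer does.

```ruby
# Hypothetical, Rails-free variant of the in_memory_database? check.
# env       - a Hash-like object such as ENV
# db_config - the parsed contents of config/database.yml
def use_in_memory_db?(env, db_config)
  env["RAILS_ENV"] == "test" &&
    !env["IN_MEMORY_DB"].nil? &&
    # fetch with a default avoids a NoMethodError when the
    # 'test-in-memory' entry is missing from database.yml
    db_config.fetch("test-in-memory", {})["database"] == ":memory:"
end

config = { "test-in-memory" => { "adapter" => "sqlite3", "database" => ":memory:" } }
puts use_in_memory_db?({ "RAILS_ENV" => "test", "IN_MEMORY_DB" => "2" }, config)
puts use_in_memory_db?({ "RAILS_ENV" => "test" }, config)
```

All three conditions have to hold: test environment, the IN_MEMORY_DB variable set to anything at all, and a 'test-in-memory' entry that really points at ':memory:'.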

Let’s give that a whirl:

$ IN_MEMORY_DB=1 RAILS_ENV=test ./script/console
Loading test environment (Rails 2.3.2)
connecting to in-memory database ...
building in-memory database from db/schema.rb ...
-- create_table("users", {:force=>true})
   -> 0.0065s
-- add_index("users", ["deleted_at"], {:name=>"index_users_on_deleted_at"})
   -> 0.0004s
-- add_index("users", ["id", "deleted_at"], {:name=>"index_users_on_id_and_deleted_at"})
   -> 0.0003s

...

>>

Super, we can see that the database is being initialized out of our schema.rb, and we get our console prompt. We’re ready to roll!

But, running this:

IN_MEMORY_DB=yes spec spec

will still only result in a single process, albeit one running off a database that’s entirely in-memory. We want parallelization!

The final step is a script that will run your spec suite for you. You may need to edit this for your particular situation, but then again, maybe not.

#  spec/suite.rb

require "spec/spec_helper"

if ENV['IN_MEMORY_DB']
  N_PROCESSES = [ENV['IN_MEMORY_DB'].to_i, 1].max
  specs = (Dir["spec/**/*_spec.rb"]).sort.in_groups_of(N_PROCESSES)
  processes = []

  interrupt_handler = lambda do
    STDERR.puts "caught keyboard interrupt, exiting gracefully ..."
    processes.each { |process| Process.kill "KILL", process }
    exit 1
  end

  Signal.trap 'SIGINT', interrupt_handler
  1.upto(N_PROCESSES) do |j|
    processes << Process.fork {
      specs.each do |array|
        if array[j-1]
          require array[j-1]
        end
      end
    }
  end
  1.upto(N_PROCESSES) { Process.wait }

else
  (Dir["spec/**/*_spec.rb"]).each do |file|
    require file
  end
end

Then, you simply run IN_MEMORY_DB=2 spec spec/suite.rb to run two parallel processes. Increase the number on larger machines for better results!
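To see which files each process ends up with, here is a plain-Ruby sketch of the allocation that spec/suite.rb performs with `in_groups_of`. The helper names `process_count` and `round_robin` and the sample file names are made up for illustration.

```ruby
# Mirrors [ENV['IN_MEMORY_DB'].to_i, 1].max from spec/suite.rb:
# non-numeric values such as "yes" become 0 and are bumped up to 1.
def process_count(value)
  [value.to_i, 1].max
end

# Taking element j-1 of each group of N sorted files is the same as
# dealing the sorted files out round-robin into N buckets.
def round_robin(files, n_processes)
  buckets = Array.new(n_processes) { [] }
  files.sort.each_with_index do |file, index|
    buckets[index % n_processes] << file
  end
  buckets
end

# Hypothetical spec files, just to show the split.
files = %w[
  spec/models/user_spec.rb
  spec/models/post_spec.rb
  spec/controllers/users_controller_spec.rb
  spec/helpers/application_helper_spec.rb
  spec/lib/mailer_spec.rb
]

round_robin(files, process_count("2")).each_with_index do |bucket, i|
  puts "process #{i + 1}: #{bucket.join(', ')}"
end
```

Note that IN_MEMORY_DB=yes still works with the suite script, but it gives you a single process.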

There’s room for improvement here, notably in the naive method used to allocate the spec files to processes, but even as simple as this method is, our spec suite runs in about half the time it used to, on a dual-core machine.
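One possible improvement, sketched under assumptions: if you have some weight for each spec file (say, its size on disk or a recorded runtime), you can hand out the heaviest files first, always to the process with the least work so far. This is a generic greedy heuristic, not something our suite does today; `balance` and the weights below are hypothetical.

```ruby
# Greedy "heaviest first" allocation.
# weights - Hash of file name => numeric weight (e.g. File.size or seconds)
def balance(weights, n_processes)
  buckets = Array.new(n_processes) { { files: [], total: 0 } }

  # Heaviest files first; ties broken by name so the result is stable.
  weights.sort_by { |file, weight| [-weight, file] }.each do |file, weight|
    lightest = buckets.min_by { |bucket| bucket[:total] }
    lightest[:files] << file
    lightest[:total] += weight
  end

  buckets
end

# Made-up weights for illustration.
weights = {
  "a_spec.rb" => 10,
  "b_spec.rb" => 7,
  "c_spec.rb" => 5,
  "d_spec.rb" => 4,
  "e_spec.rb" => 2
}

balance(weights, 2).each_with_index do |bucket, i|
  puts "process #{i + 1}: #{bucket[:files].join(', ')} (total #{bucket[:total]})"
end
```

With these weights both processes end up with a total of 14, whereas dealing the same files out alphabetically in round-robin order would give 17 and 11.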

8 Comments

  1. Brian Guthrie says:

    Not sure if I missed something, but would it be possible to use DeepTest (http://github.com/qxjit/deep-test/tree/master) to do the same thing? Just tell it the number of workers you’d like to use in your declared SpecTask and it’ll handle the parallelization for you.

    May 8, 2009 at 11:01 pm

  2. Mike Dalessio says:

    Ah, now I feel shame. DeepTest appears to do exactly this for RSpec.

    However, one thing I’d like to do is extend the above solution to support Cucumber. It’s not obvious from DeepTest’s docs whether it’s capable of doing this out of the box.

    What’s interesting is that DeepTest’s management of MySQL databases would support running Selenium tests, which an in-memory db cannot do (because the runner and the server are in separate processes).

    So, I may instead try to hack DeepTest to support Cucumber. Thanks for the pointer!

    May 9, 2009 at 5:52 pm

  3. Alex Chaffee says:

    I’d *love* it if you hack [Deep Test](http://deep-test.rubyforge.org/) to support Cucumber. I imagine that DeepTest + Cucumber + WebRat + Selenium => smoking CPU (which in this case is a good thing, since you’re not wasting cycles waiting for the browser’s I/O).

    It looks like someone forked it onto [GitHub](http://github.com/qxjit/deep-test) so if you do your work there, let me know and I’ll get in touch with Dan to synchronize the [Rubyforge version](http://deep-test.rubyforge.org/).

    May 10, 2009 at 5:29 pm

  4. Alex Chaffee says:

    I’d *love* it if you hack [Deep Test](http://deep-test.rubyforge.org/) to support Cucumber. I imagine that DeepTest + Cucumber + WebRat + Selenium => smoking CPU (which in this case is a good thing, since you’re not wasting cycles waiting for the browser’s I/O).

    It looks like someone forked it onto [GitHub](http://github.com/qxjit/deep-test) so if you do your work there, let me know and I’ll get in touch with Dan to synchronize the [Rubyforge version](http://deep-test.rubyforge.org/).

    May 10, 2009 at 5:31 pm

  5. Bryan Helmkamp says:

    Hey guys,

    You should check out my project Testjour (http://github.com/brynary/testjour/tree/master). It parallelizes Cucumber runs over SSH and handles MySQL database management. Check out Testjour’s own Cucumber features for example usage.

    We use it to run our giant (11k steps) Cucumber build across mac minis in the office. I’m planning on extending it with RSpec support soon, and I’ve got a long term goal of integrating it with EC2.

    Cheers,

    -Bryan

    May 10, 2009 at 8:44 pm

  6. Brian Guthrie says:

    Just a heads up, all – the repository on GitHub is the official one. It’s maintained by David Vollbracht, one of the main contributors to and maintainers of DeepTest, and (in cooperation with Dan Manges) it’s considered official now. The commits made there are propagated over to Rubyforge; use it, and contribute to it (yay!), in preference to that one. One of these days we’ll coordinate it with the ThoughtWorks GitHub account; apologies for the confusion, and stay tuned.

    May 14, 2009 at 4:25 am

  7. grosser says:

    I made a plugin of those scripts, and changed some things that did not work out for me (like loading spec/spec_helper first) http://github.com/grosser/parallel_specs hope you like it or can contribute :)

    May 17, 2009 at 11:00 am

  8. Dan says:

    We also do this across EC2, we deal with the DBs (MySQL, Postgres, SQLite) and currently support Test::Unit and RSpec. We are seriously considering Cucumber, but are still evaluating the demand.

    Anyways we would love to get feedback from additional users so if you have any projects you want to try out just let me know and I would be happy to hook you up with an account.

    July 8, 2009 at 9:50 pm
