Jeff Dean's blog



Equality and sameness in Ruby
Posted by Jeff Dean on Monday June 29, 2009 at 06:43AM

Let's say you are building a leetspeak app that deals with w00ts. You might write a class that looks like this:

class Woot
  def ==(other)
    true
  end
end

In theory, any Woot is equal to anything else:

puts Woot.new == Woot.new # true

You might think that with this setup, you could do something like this:

x = [ Woot.new ]
y = [ Woot.new ]
z = x - y

You might expect z to be an empty array in this case, but oh how wrong you would be. In the example above, == is never called at all.
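
Ruby's documentation says Array#- compares elements using their hash and eql? methods rather than ==. A minimal sketch of one way to get an empty z, assuming you really do want every Woot to count as the same element (the eql? and hash definitions are additions for illustration, not part of the class above):

```ruby
class Woot
  def ==(other)
    true
  end

  # Assumption: any two Woots should be treated as the same element.
  def eql?(other)
    other.is_a?(Woot)
  end

  # Objects that are eql? must also return the same hash value.
  def hash
    Woot.hash
  end
end

x = [Woot.new]
y = [Woot.new]
z = x - y
puts z.empty?
```

Defining eql? and hash together matters: methods such as Array#uniq and Hash key lookup rely on the same pair.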

Sanitizing POST params with custom Rack middleware
Posted by Jeff Dean on Thursday June 11, 2009 at 04:48AM

The problem: Improperly escaped post data

I recently worked on an app that processed XML files. Once a week, a legacy system posted a large XML document to the app. For almost a year the app worked perfectly, and then we updated to Rails 2.3.2 and the posts started failing spectacularly. Looking at the log files, I noticed that the params were incorrect:

{"message"=>"hello", "xml"=>"<xml>Foo &amp", "Bar</xml>"=>nil, "action"=>"not_scrubbed", "controller"=>"examples"}

After looking into it further, I realized that the data that was being posted contained semi-colons:

xml=<xml>Foo %26amp; Bar</xml>&message=hello

It turns out that Rails used to split params only on ampersands, but Rack splits on both ampersands and semi-colons. We couldn't change the legacy system, so we had to remove the semi-colons before the post params got to Rails.

The solution: Rack middleware

Using Rack middleware, it was easy to insert code before the Rails params-parsing code executed. To start, build a class that conforms to the signature of a Rack middleware layer, like so:
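
A minimal sketch of such a layer, under stated assumptions: the class name SemicolonScrubber is hypothetical, and instead of deleting semi-colons outright it percent-encodes them as %3B so the original value survives parsing. A Rack middleware only needs an initialize that accepts the next app and a call that accepts the env hash.

```ruby
require "stringio"

# Hypothetical middleware: rewrites the raw POST body before anything
# downstream (such as Rails) parses it into params.
class SemicolonScrubber
  def initialize(app)
    @app = app
  end

  def call(env)
    if env["REQUEST_METHOD"] == "POST" && env["rack.input"]
      body = env["rack.input"].read
      # Assumption: encoding ";" is acceptable for every POST body.
      # A real layer would likely check the path or content type first.
      scrubbed = body.gsub(";", "%3B")
      env["rack.input"] = StringIO.new(scrubbed)
      env["CONTENT_LENGTH"] = scrubbed.bytesize.to_s
    end
    @app.call(env)
  end
end

# Usage with a stand-in app that echoes the body it receives.
echo = lambda { |env| [200, { "Content-Type" => "text/plain" }, [env["rack.input"].read]] }
env = {
  "REQUEST_METHOD" => "POST",
  "rack.input" => StringIO.new("xml=<xml>Foo %26amp; Bar</xml>&message=hello")
}
_status, _headers, body = SemicolonScrubber.new(echo).call(env)
puts body.first
```

With the semi-colon encoded, a parser that splits on both & and ; no longer sees a separator in the middle of the xml value.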

AutoTagger 0.9 released
Posted by Jeff Dean on Tuesday May 05, 2009 at 05:50PM

I'm happy to announce that AutoTagger 0.9 has been released thanks to Brian Takita and Mike Grafton. This resolves a few major issues and brings AutoTagger a big step closer to being ready for prime-time.

You can read more at http://github.com/zilkey/auto_tagger.

You can install the gem like so:

gem sources -a http://gems.github.com
sudo gem install zilkey-auto_tagger

Thanks Brian and Mike!

Testing capistrano recipes with cucumber
Posted by Jeff Dean on Sunday April 05, 2009 at 10:29PM

In this post, I'll show you how to set up end-to-end Capistrano testing using Cucumber. I've extracted this from the Cucumber features I wrote for a gem I'm building named auto_tagger. To fully test Capistrano recipes, your tests will have to:

  • Create a local git repository
  • Create a local app with a config/deploy.rb file
  • Push the app to the local repository
  • Run cap deploy:setup from the app (which will set up a directory inside your local test directory)
  • Run a cap deploy from the app (which will deploy to your test directory)
  • Assert against the content of the deployed app in the test directory
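
The steps above can be sketched in plain Ruby. This is only an illustration under stated assumptions: build_sandbox and sandbox_commands are hypothetical helper names, the deploy.rb settings are a guess at a minimal Capistrano recipe, and the commands are printed rather than executed. A real step definition would run them, and the app would also need a Capfile that loads config/deploy.rb.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: lay out a throwaway sandbox for one test run.
def build_sandbox(root)
  paths = {
    repo: File.join(root, "repo.git"),
    app: File.join(root, "app"),
    deploy_to: File.join(root, "deployed")
  }
  FileUtils.mkdir_p(File.join(paths[:app], "config"))
  # Assumed minimal recipe that points at the local repo and test directory.
  File.write(File.join(paths[:app], "config", "deploy.rb"), <<~RECIPE)
    set :application, "sandbox"
    set :scm, :git
    set :repository, "file://#{paths[:repo]}"
    set :deploy_to, "#{paths[:deploy_to]}"
    role :app, "localhost"
  RECIPE
  paths
end

# Commands a step definition could run, in the order of the list above.
def sandbox_commands(paths)
  [
    "git init --bare #{paths[:repo]}",
    "cd #{paths[:app]} && git init && git add . && git commit -m 'initial commit'",
    "cd #{paths[:app]} && git push #{paths[:repo]} HEAD:refs/heads/master",
    "cd #{paths[:app]} && cap deploy:setup",
    "cd #{paths[:app]} && cap deploy"
  ]
end

Dir.mktmpdir do |root|
  paths = build_sandbox(root)
  puts sandbox_commands(paths)
end
```

After the last command, the assertions would read files under the deployed directory, for example checking that a current release exists.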

Background - Capistrano recipes are almost never tested

Looking around online, I couldn't find a single package of Capistrano recipes that has an automated test suite, even among those from some big hosts. It's no surprise that Capistrano tasks are seldom tested: testing Capistrano recipes is hard, and even when you do test them, there are still so many variables in real-life deploys that you can't account for everything.

It's like Rummy said:

There are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. There are things we do not know we don’t know.

from wikipedia

However, there are some things you can do to stave off the "known unknowns". For example, you know that someone might forget to set an important variable in their cap task, and you know they might be using cap-ext-multistage. For cases like these, Capistrano testing gives you much more assurance that a bug in your recipe won't rm -rf /* on your remote machine.

AutoTagger is a gem that helps you automatically create a date-stamped tag for each stage of your deployment, and deploy from the last tag from the previous environment.

Let's say you have the following workflow:

  • Run all tests on a Continuous Integration (CI) server
  • Deploy to a staging server
  • Deploy to a production server

You can use the autotag command to tag releases on your CI box, then use the capistrano tasks to auto-tag each release.

Speeding up slow Cruise Control response times
Posted by Jeff Dean on Tuesday March 10, 2009 at 03:52PM

We use Cruise Control on our Continuous Integration server and we have several ways of getting alerts about the status of the build, including email, RSS and the Cruise Control web interface.

Recently we noticed that the web interface and the RSS feeds were taking a very long time to respond, on the order of 1 minute or more. After poking around, we realized that we had hundreds of serialized builds still on disk, so we deleted them:

$ cd ~/OurCruiseDirectory/projects/OurProject 
$ rm -r build-*

Then, to make sure this doesn't happen again, we edited OurCruiseDirectory/site_config.rb to decrease the number of builds we keep:

BuildReaper.number_of_builds_to_keep = 20

So it appears that Cruise Control's response time is directly proportional to the number of builds saved on the server.
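
For a one-off cleanup that keeps some recent history instead of removing every build, a small script can do the pruning. This is a sketch, not part of Cruise Control: prune_builds is a hypothetical helper, and it assumes each build lives in its own build-* directory whose modification time reflects its age.

```ruby
require "fileutils"

# Hypothetical helper: delete all but the newest `keep` build directories
# under project_dir and return how many were removed.
def prune_builds(project_dir, keep)
  builds = Dir.glob(File.join(project_dir, "build-*"))
              .select { |path| File.directory?(path) }
              .sort_by { |path| File.mtime(path) } # oldest first
  stale = builds.first([builds.size - keep, 0].max)
  stale.each { |path| FileUtils.rm_r(path) }
  stale.size
end

# Example call (path taken from the commands above):
# prune_builds(File.expand_path("~/OurCruiseDirectory/projects/OurProject"), 20)
```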

New York Standup 3/4/2009
Posted by Jeff Dean on Thursday March 05, 2009 at 05:08PM

Interesting

While this has been mentioned before, naming an ActiveRecord association :target will cause infinite recursion. Especially lame if you are building an app for assassins or mobsters.

The tracker team upgraded to 2.2 and saw a big increase in the size of their mongrels, and much longer start-up times.

In an erector widget it appears that respond_to? checks arity. For example:

self.respond_to?("some_method") # => false
self.respond_to?("some_method", some_value) # => true
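
In plain Ruby, the second argument to respond_to? is a flag that includes private methods in the check, so any truthy value flips the result when the method is private. Whether that explains the Erector behavior is an assumption. The sketch below shows only the core Ruby behavior, with ExampleWidget as a hypothetical stand-in class:

```ruby
# Hypothetical stand-in; not an Erector class.
class ExampleWidget
  private

  def some_method
    "w00t"
  end
end

widget = ExampleWidget.new
without_flag = widget.respond_to?(:some_method)    # private methods are skipped
with_flag = widget.respond_to?(:some_method, true) # truthy flag includes them
puts without_flag
puts with_flag
```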

Moving from Subversion to Git
Posted by Jeff Dean on Thursday October 16, 2008 at 01:58AM

We recently moved our project from subversion to git, and so far the move has gone very smoothly. The following post will detail what we did to make the move.

Our setup

For this project there are 2 pairs working, and we have 6 machines and one hosted service:

  • Two Mac OS X workstations with IDEA
  • One continuous integration server running Cruise Control
  • A staging server running a 2-year-old version of Ubuntu
  • A Subversion server
  • A production server hosted on Engine Yard
  • A GitHub account with the ability to create private repositories

The goal was to have one pair continue to work while the migration from svn to git was happening.

New York Standup 10/9/2008
Posted by Jeff Dean on Thursday October 09, 2008 at 01:29PM

Helps

What is capistrano multistage?

  • A plugin that allows you to store environment-specific variables in different files, and specify a default environment
  • It's been on tracker for a while and seems to be stable
  • Another option is to specify your environment-specific variables within separate tasks, keeping everything in one file

New York Standup 10/7/2008
Posted by Jeff Dean on Wednesday October 08, 2008 at 01:10PM

Interesting Things

When you specify a gem from a custom source, and it depends on gems from a separate source, you need to list both sources in geminstaller.yml.

This comes up when you are installing a gem from GitHub and that gem depends on other gems from RubyForge. You can specify multiple sources by adding more --source options.
