Josh SusserJosh Susser
Announcing Refraction
edit Posted by Josh Susser on Saturday October 24, 2009 at 11:31AM

One of the things I've always liked least about building web applications is dealing with mod_rewrite. It's a very useful feature, but it's quirky and the config languages for webservers are difficult to use (at least from my experience with Apache and Nginx). But like it or not, mod_rewrite is often a necessary part of a web app. Until now...

Recently I had to redo the rewrite rules for pivotallabs.com when we switched from Apache to Nginx, which we did when moving to EngineYard's cloud hosting. Since then our Nginx config has grown to over 150 lines, mainly to deal with multiple virtual hosts.

Now, managing a custom Nginx config on the EY cloud system isn't as simple as I'd like, especially when the configs are different on production and demo environments. (Demo is what we call our usual environment for doing feature acceptance.) It's far easier to use the automatically generated config, but that doesn't work when you need to support multiple domain names.

The obvious thing to do was to move the rewrite/redirect logic out of the Nginx config. I found a couple Rack middleware components that did something sort of like what we needed, but none of them were sufficient for what we needed. So we created our own.

Refraction is a Rack middleware replacement for mod_rewrite. With Refraction we were able to replace our 150+ line Nginx config with a 50 line Ruby file, and go back to using the standard automatically generated config on EY cloud.

Here's an example Refraction config file:

Refraction.configure do |req|
  feedburner  = "http://feeds.pivotallabs.com/pivotallabs"

  if req.env['HTTP_USER_AGENT'] !~ /FeedBurner|FeedValidator/ && req.host =~ /pivotallabs.com/
    case req.path
    when %r{^/(talks|blabs|blog).(atom|rss)$}        ; req.found! "#{feedburner}/#{$1}.#{$2}"
    when %r{^/users/(chris|edward)/blog.(atom|rss)$} ; req.found! "#{feedburner}/#{$1}.#{$2}"
    end
  else
    case req.host
    when 'tweed.pivotallabs.com'
      req.rewrite! "http://pivotallabs.com/tweed#{req.path}"
    when /([-\w]+.)?pivotallabs.com/
      # passthrough with no change
    else  # wildcard domains (e.g. pivotalabs.com)
      req.permanent! :host => "pivotallabs.com"
    end
  end
end

These rules are extracted from the full config file for pivotallabs.com. They redirect high-traffic syndication feeds to feedburner, rewrite a subdomain (tweed.pivotallabs.com) to a path for that sub-site (pivotallabs.com/tweed), and redirect some aliases to our standard domain name (pivotalabs anyone?).

Refraction is thread-safe, which means you can put it outside the Rack::Lock, something we felt was important for performance. It will never have the performance of mod_rewrite, but it will certainly be better than handling redirections in Rails itself.

Full documentation is available in the README. Contributions welcome.

And of course big thanks to Sam Pierson and Wai Lun Mang who both paired with me on developing Refraction.

Jeff DeanJeff Dean
Sanitizing POST params with custom Rack middleware
edit Posted by Jeff Dean on Thursday June 11, 2009 at 04:48AM

The problem: Improperly escaped post data

I recently worked on an app that processed xml files. Once a week, a legacy system posted a large xml document to the app. For almost a year the app worked perfectly, and then we updated to rails 2.3.2 and the posts started failing spectacularly. Looking at the log files, I noticed that the params were incorrect:

<code>{"message"=>"hello", "xml"=>"<xml>Foo &amp", "Bar</xml>"=>nil, "action"=>"not_scrubbed", "controller"=>"examples"}</code>

After looking into it further, I realized that the data that was being posted contained semi-colons:

<code>xml=<xml>Foo %26amp; Bar</xml>&message=hello</code>

It turns out that rails used to only split params on ampersands, but that rack splits on both ampersands and semi-colons. We couldn't change the legacy system, so we had to remove the semi-colons before the post params got to rails.

The solution: Rack middleware

Using Rack middleware it's was easy to insert code before rails params parsing code executed. To start, build a class that conforms to the signature of a rack middleware layer, like so: