One of the things I’ve always liked least about building web applications is dealing with mod_rewrite. It’s a very useful feature, but it’s quirky and the config languages for webservers are difficult to use (at least from my experience with Apache and Nginx). But like it or not, mod_rewrite is often a necessary part of a web app. Until now…
Recently I had to redo the rewrite rules for pivotallabs.com when we switched from Apache to Nginx, which we did when moving to EngineYard’s cloud hosting. Since then our Nginx config has grown to over 150 lines, mainly to deal with multiple virtual hosts.
Now, managing a custom Nginx config on the EY cloud system isn’t as simple as I’d like, especially when the configs are different on production and demo environments. (Demo is what we call our usual environment for doing feature acceptance.) It’s far easier to use the automatically generated config, but that doesn’t work when you need to support multiple domain names.
The obvious thing to do was to move the rewrite/redirect logic out of the Nginx config. I found a couple of Rack middleware components that did something like what we needed, but none of them were sufficient. So we created our own.
Refraction is a Rack middleware replacement for mod_rewrite. With Refraction we were able to replace our 150+ line Nginx config with a 50 line Ruby file, and go back to using the standard automatically generated config on EY cloud.
Here’s an example Refraction config file:
Refraction.configure do |req|
  feedburner = "http://feeds.pivotallabs.com/pivotallabs"

  if req.env['HTTP_USER_AGENT'] !~ /FeedBurner|FeedValidator/ && req.host =~ /pivotallabs\.com/
    case req.path
    when %r{^/(talks|blabs|blog)\.(atom|rss)$}        ; req.found! "#{feedburner}/#{$1}.#{$2}"
    when %r{^/users/(chris|edward)/blog\.(atom|rss)$} ; req.found! "#{feedburner}/#{$1}.#{$2}"
    end
  else
    case req.host
    when 'tweed.pivotallabs.com'
      req.rewrite! "http://pivotallabs.com/tweed#{req.path}"
    when /([-\w]+\.)?pivotallabs\.com/
      # passthrough with no change
    else # wildcard domains (e.g. pivotalabs.com)
      req.permanent! :host => "pivotallabs.com"
    end
  end
end
These rules are extracted from the full config file for pivotallabs.com. They redirect high-traffic syndication feeds to feedburner, rewrite a subdomain (tweed.pivotallabs.com) to a path for that sub-site (pivotallabs.com/tweed), and redirect some aliases to our standard domain name (pivotalabs anyone?).
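For readers who haven't written Rack middleware before, here is a minimal sketch in plain Ruby of where such a rule lives in the stack. It is not Refraction's implementation: the class name `CanonicalHost` and its constructor arguments are hypothetical, made up for illustration. It covers only the third kind of rule described above, permanently redirecting alias domains to the standard one.

```ruby
# Hypothetical illustration, not part of Refraction: a tiny Rack middleware
# that 301-redirects requests for unrecognized hosts to a canonical host.
class CanonicalHost
  def initialize(app, canonical, allowed = /\A([-\w]+\.)?pivotallabs\.com\z/)
    @app = app
    @canonical = canonical
    @allowed = allowed
  end

  def call(env)
    # HTTP_HOST may carry a port; fall back to SERVER_NAME if it is absent.
    host = (env['HTTP_HOST'] || env['SERVER_NAME']).to_s.sub(/:\d+\z/, '')
    return @app.call(env) if host =~ @allowed

    query = env['QUERY_STRING'].to_s
    location = "http://#{@canonical}#{env['PATH_INFO']}"
    location += "?#{query}" unless query.empty?
    [301, { 'Location' => location, 'Content-Type' => 'text/plain' }, ['Moved Permanently']]
  end
end

# Stand-in for the real application.
app = lambda { |env| [200, { 'Content-Type' => 'text/plain' }, ['ok']] }
stack = CanonicalHost.new(app, 'pivotallabs.com')

status, headers, _body = stack.call(
  'HTTP_HOST' => 'pivotalabs.com', 'PATH_INFO' => '/blog', 'QUERY_STRING' => ''
)
puts status
puts headers['Location']
```

Because the middleware answers the redirect itself, the downstream app is never invoked for alias domains, which is the main reason to handle this before the request reaches Rails.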
Refraction is thread-safe, which means you can put it outside the Rack::Lock, something we felt was important for performance. It will never have the performance of mod_rewrite, but it will certainly be better than handling redirections in Rails itself.
Full documentation is available in the README. Contributions welcome.
And of course big thanks to Sam Pierson and Wai Lun Mang who both paired with me on developing Refraction.
Thanks for this, much slicker than learning the nginx rewrite syntax and portable if I want to run elsewhere.
Just to confirm, do rack middlewares run fine on heroku?
November 20, 2009 at 6:40 am
I believe there is already middleware to do this, and I think it was better (can't remember offhand).
October 24, 2009 at 2:38 pm
There is at least one other option for doing mod_rewrite like functions with Rack middleware, called [Rack::Rewrite](http://coderack.org/users/jtrupiano/entries/37-rackrewrite). But, it doesn’t appear to be quite as fully featured as what is shown in Refraction.
For one thing, Rack::Rewrite only really gives you access to the request URI, whereas Refraction appears to give the entire request object… that can be much more useful for determining where to send someone.
October 26, 2009 at 6:40 am
There is a 0.1.3 branch ( http://github.com/jtrupiano/rack-rewrite/tree/0.1.3 ) for Rack::Rewrite that will allow you to access the Rack env and thus write arbitrary rewrite rules as demonstrated in the example in this post.
Surely I’m biased, but I think the DSL for Rack::Rewrite is much cleaner.
In any case, good to see multiple options out there.
October 26, 2009 at 8:12 am
John: I of course looked at your Rewrite module. Didn’t see the branch or anything about the env being available, but that’s cool. Though while your DSL might be cleaner in some cases, it’s also more limited than just using plain old Ruby and doesn’t let us use logic we need for our rules (which are probably more complicated than your average ruleset). I think the way you resort to using procs is a good indication of why we didn’t want to go the DSL route. I’d be curious to see which approach gives better performance – I can see how the DSL approach would have opportunities for optimization our way wouldn’t, but I haven’t spent much time looking at that yet.
October 26, 2009 at 8:23 am
Hey Josh,
0.1.3 introduces two ways to highly customize your rewrite rules. Guards (via :if => Proc option) give you the ability to define arbitrary conditions for when a rule matches an incoming request. Additionally, the :to parameter to rules can now be a Proc allowing you to arbitrarily define how a rule is applied. (Note that the Rack env is not yielded yet for these two features in the HEAD of 0.1.3 — it’s my last task before releasing 0.1.3.)
Honestly, it wasn’t my intention to completely recreate mod_rewrite. My initial use case only required the basic rewrite rules. I spent time this weekend trying to devise replacement rewrite rules for the capistrano maintenance page feature. This proved very challenging because Apache’s globbing makes it much easier to do negative matching than Ruby’s regular expression engine. Consider this rewrite rule:
RewriteCond %{REQUEST_URI} !\.(css|jpg|png)$
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ /system/maintenance.html [L]
That first rewrite condition is terribly difficult to write without negative lookbehind support (only available in Oniguruma). Using Ruby 1.8 w/o Oniguruma, this rule looks ugly in Rack::Rewrite:
r301 /.*/, '/system/maintenance.html', :if => Proc.new { |from|
  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  # Skip static assets; assumes `from` is the request path as a String.
  File.exists?(maintenance_file) && !%w(css jpg png).any? { |ext| from =~ Regexp.new("\\.#{ext}$") }
}
With Oniguruma (default in 1.9), it’s much cleaner:
r301 /(.*)$(?<!css|png|jpg)/, '/system/maintenance.html', :if => Proc.new { |from|
  File.exists?(File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html'))
}
What’s most troubling about this though is that I have to issue a redirect because rails doesn’t find the route /system/maintenance.html and we get a 404 if I simply rewrite it.
By having the rule enforced by Apache, the rails app never gets invoked to render the maintenance page, so it’s a non-issue in the standard capistrano snippet.
Do you have any ideas on how this might be solved? What would the associated Refraction configuration look like?
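As an aside on the Ruby 1.8 and 1.9 rules above, here is a small sketch comparing the lookbehind pattern with an explicit extension check. The method names `asset_by_lookbehind?` and `asset_by_suffix?` are hypothetical helpers for illustration, not part of Rack::Rewrite, and the lookbehind version assumes Ruby 1.9 or later.

```ruby
# Hypothetical helpers for illustration; not part of Rack::Rewrite.
ASSET_EXTENSIONS = %w(css jpg png)

# Ruby 1.9+ only: matches any path that does NOT end in css, png, or jpg.
NOT_AN_ASSET = /(.*)$(?<!css|png|jpg)/

def asset_by_lookbehind?(path)
  (path =~ NOT_AN_ASSET).nil?
end

# Also works on Ruby 1.8: test each extension against the end of the path.
def asset_by_suffix?(path)
  ASSET_EXTENSIONS.any? { |ext| path =~ /\.#{ext}\z/ }
end

%w(/stylesheets/site.css /images/logo.png /blog /users/chris).each do |path|
  puts "#{path}: lookbehind=#{asset_by_lookbehind?(path)} suffix=#{asset_by_suffix?(path)}"
end
```

The two checks are not strictly equivalent. The lookbehind pattern never looks for the dot, so it also treats a path such as `/notcss` as an asset, while the suffix check does not.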
October 26, 2009 at 8:52 am
Josh, feel free to delete my last two comments — I think this one fixes it.
Code Snippet 1: Standard Capistrano Maintenance Rewrite Rules
RewriteCond %{REQUEST_URI} !\.(css|jpg|png)$
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ /system/maintenance.html [L]
Code Snippet 2: Ruby 1.8 Rack::Rewrite Replacement Rule
r301 /.*/, '/system/maintenance.html', :if => Proc.new { |from|
  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  # Skip static assets; assumes `from` is the request path as a String.
  File.exists?(maintenance_file) && !%w(css jpg png).any? { |ext| from =~ Regexp.new("\\.#{ext}$") }
}
Code Snippet 3: Ruby 1.9 Rack::Rewrite Replacement Rule
r301 /(.*)$(?<!css|png|jpg)/, '/system/maintenance.html', :if => Proc.new { |from|
  File.exists?(File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html'))
}
October 26, 2009 at 9:44 am
John: I cleaned up your malformatted comments. No worries.
Here’s how I’d rewrite that rule in Refraction:
if File.exists?(File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html'))
  # Anchor the match to the file extension so paths that merely contain "css" are not skipped.
  req.rewrite! '/system/maintenance.html' unless req.path =~ /\.(css|png|jpg)$/
end
But as you mentioned, doing this rewrite in Rack middleware is probably a bad idea, since it requires Rack to be running, which it won’t be when the app server is down. This is the case where I’d let the web server handle the rewrite. EngineYard Cloud takes care of this in the auto-generated Nginx config, so I don’t worry about it myself. YMMV. Or maybe the maintenance page needs its own middleware, if rewriting in the web server isn’t an option.
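If the maintenance page did get its own middleware, it might look something like the sketch below. This is only an illustration under stated assumptions: the class name `MaintenancePage`, its constructor argument, and the choice of a 503 status are all hypothetical, and it is not part of Refraction or Rack::Rewrite. It serves the file directly, so no redirect is needed and Rails never has to route `/system/maintenance.html`.

```ruby
require 'tmpdir'

# Hypothetical sketch of a dedicated maintenance-page middleware.
class MaintenancePage
  ASSET = /\.(css|jpg|png)\z/

  def initialize(app, file)
    @app = app
    @file = file # e.g. File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  end

  def call(env)
    # Pass through when there is no maintenance file, or for static assets.
    return @app.call(env) unless File.exist?(@file)
    return @app.call(env) if env['PATH_INFO'].to_s =~ ASSET

    body = File.read(@file)
    headers = { 'Content-Type' => 'text/html', 'Content-Length' => body.bytesize.to_s }
    [503, headers, [body]]
  end
end

# Stand-in for the real application.
app = lambda { |env| [200, { 'Content-Type' => 'text/plain' }, ['ok']] }

Dir.mktmpdir do |dir|
  file = File.join(dir, 'maintenance.html')
  stack = MaintenancePage.new(app, file)

  puts stack.call('PATH_INFO' => '/blog').first     # no maintenance file yet
  File.write(file, '<h1>Down for maintenance</h1>')
  puts stack.call('PATH_INFO' => '/blog').first     # maintenance page served
  puts stack.call('PATH_INFO' => '/site.css').first # assets still pass through
end
```

This still requires the Ruby process to be up, so it only helps when the app is running but should not serve normal traffic; it does nothing for the case where the app server itself is down.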
October 26, 2009 at 10:31 am