By Steve Conover and Brian Takita
Peer-to-Patent, one of Pivotal Labs’ clients, got Slashdotted last week, and we had no trouble handling the load. The site was just as responsive as it always is, and we didn’t come close to having a scale problem.
Moral of the story: the technology for serving static web pages is old, boring, and extremely scalable. If you have the type of site that can be page-cached, do so aggressively, starting with the front page and any pages likely to be linked to. We got a huge payoff for the engineering time that we invested in our page-caching strategy.
- We moved away from Rails page-caching and developed our own “holeless cache”, which uses a symlink trick (see below) to switch to a new version of a cached page instantly, with no cache “hole”. (The “hole” is the time between the expiration or purge of a cached page and the time when it’s regenerated. During that window every request for the page falls through to Rails, so your Mongrels can be saturated with requests, something we proved to ourselves could easily happen.)
Here’s our symlink trick, using the front page as an example:
1. Have index.html point to index.html.current
2. If (index.html.current is >= 20 minutes old)
   - Copy index.html.current to index.html.old
   - Point index.html to index.html.old
   - Rewrite index.html.current by asking Rails for the page (using the process method)
   - Repoint index.html back at index.html.current
3. Repeat step 2 every minute using a cron job.
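The steps above can be sketched in plain Ruby. This is a minimal illustration and not our production routine: the names `repoint` and `refresh_holeless` are hypothetical, the block stands in for asking Rails for the page, and swapping the symlink by renaming a temporary link over it is an assumption about how to make the switch atomic on a POSIX filesystem.

```ruby
require "fileutils"

MAX_AGE_SECONDS = 20 * 60 # regenerate pages that are at least 20 minutes old

# Point `link` at `target` (a name relative to the link's directory).
# A temporary symlink is created and renamed over the old one, so readers
# always see either the previous target or the new one.
def repoint(link, target)
  tmp = "#{link}.tmp.#{Process.pid}"
  File.unlink(tmp) if File.symlink?(tmp) || File.exist?(tmp)
  File.symlink(target, tmp)
  File.rename(tmp, link)
end

# Refresh one cached page without ever leaving the public path empty.
# The block must return the freshly rendered HTML (hypothetical stand-in
# for the Rails render). Returns true if the page was regenerated.
def refresh_holeless(dir, name = "index.html", max_age: MAX_AGE_SECONDS, now: Time.now)
  link    = File.join(dir, name)
  current = File.join(dir, "#{name}.current")
  old     = File.join(dir, "#{name}.old")

  if File.exist?(current)
    # Still fresh: leave everything as it is.
    return false if (now - File.mtime(current)) < max_age

    # Serve a copy of the stale page while the new one is being written.
    FileUtils.cp(current, old)
    repoint(link, "#{name}.old")
  end

  html = yield
  File.write(current, html)
  repoint(link, "#{name}.current")
  true
end

# Example call from a script run by cron every minute (paths are illustrative):
#   refresh_holeless("/var/www/app/public") { render_front_page }
```

One consequence of this ordering is that if rendering raises an error, index.html is still pointing at index.html.old, so visitors keep getting the last good copy.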
For model-based cache expiration, we call our holeless cache routine from the model observer class instead of using Rails cache sweepers. So, instead of just deleting the cached page, we regenerate it in place.
Performance/load testing: we tried several tools and approaches and found that a simple Ruby script that launches wget requests (that write to /dev/null) in many separate threads worked best for us.
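A script along these lines might look like the sketch below. It is not the script we used: the method name `hammer`, the thread and request counts, and the `command:` parameter are all hypothetical, and it assumes wget is installed and on the PATH.

```ruby
# Fire requests at `url` from several threads at once and count the outcomes.
# Each request shells out to wget, which discards the response body by
# writing it to /dev/null. `command:` can be swapped out for a dry run.
def hammer(url, threads: 20, requests_per_thread: 10,
           command: %w[wget -q -O /dev/null])
  workers = Array.new(threads) do
    Thread.new do
      Array.new(requests_per_thread) do
        # system returns true on exit status 0, false or nil otherwise.
        system(*command, url) ? :ok : :failed
      end
    end
  end

  # Thread#value waits for the thread and returns its block's result.
  counts = Hash.new(0)
  workers.flat_map(&:value).each { |outcome| counts[outcome] += 1 }
  counts
end

# Example (URL is illustrative):
#   p hammer("http://localhost:3000/", threads: 50, requests_per_thread: 20)
```

Because each wget runs as a separate process, the Ruby threads spend their time waiting on child processes, so the requests really do overlap.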
We send down exactly one .js and one .css file. If you are sending down more than one of each of these to the browser, you have a performance problem. Fix it with asset packager.
Update: one clarification about the cron job: we deploy it automatically using Capistrano.