Sam Pierson's blog



Sam Pierson
Standup 8/19/2010: Bundler and RVM Gemsets, Rails 3 and jQuery, Why Day
Posted by Sam Pierson on Thursday August 19, 2010 at 10:03AM

Interesting

  • Bundler and RVM Gemsets work! For a while now, Bundler has been putting gems in the system gems location (instead of a private folder as it did previously). This means that it now works well with RVM Gemsets. Use it.
  • Rails 3 & jQuery: Someone has written a nice generator that will unplug Prototype from Rails 3 and install jQuery.
  • Ruby 1.9.2 was released yesterday.
  • Today is international Why Day, commemorating the day that Why the Lucky Stiff disappeared from the online community. Interestingly, August 19th is also the day that 3 witches were put on trial in Samlesbury, England in 1612 and that 5 witches were executed in Salem, Massachusetts in 1692. Coincidence? I think not.

Standup 8/18/2010: Time#to_json in milliseconds, Encoded vs. encrypted Session Cookies
Posted by Sam Pierson on Wednesday August 18, 2010 at 09:11AM

Interesting

  • Make Ruby Time#to_json always return time in milliseconds: It turns out that while Firefox and Chrome can, Safari cannot parse the default time format that Time#to_json produces. The team decided to override Time#as_json to return an integer number of milliseconds, which to_json will then render into a string (JavaScript can easily work with number-of-milliseconds-since-the-start-of-the-epoch).
  • A pivot wanted to remind everyone that in Rails 2.x, session cookies are not encrypted. Reassuringly, all present were already aware of this. Session cookies are Base64 encoded, and if you ever need to take a look at their contents, here's how. If you want to encrypt your session cookies, there are Rails plugins available for that purpose.
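
The Time#as_json override mentioned above might look roughly like this. This is a minimal sketch in plain Ruby with only the standard json library, not the team's actual patch; the to_json definition is an assumption that stands in for what ActiveSupport does when it serializes the result of as_json.

```ruby
require 'json'

class Time
  # Integer number of milliseconds since the start of the Unix epoch.
  def as_json(*)
    (to_r * 1000).to_i
  end

  # Assumption: without ActiveSupport we wire to_json to as_json ourselves.
  def to_json(*args)
    as_json.to_json(*args)
  end
end

t = Time.at(1_282_208_580, 123_000)    # second argument is microseconds
puts t.to_json                         # a bare number, e.g. for new Date(ms) in JavaScript
puts({ 'created_at' => t }.to_json)
```

On the JavaScript side the value can be passed straight to the Date constructor, which every browser accepts.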

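To see why "encoded" is not "encrypted", here is a hedged sketch of reading a Rails 2.x cookie-store session. It assumes the cookie value has the form URL-escaped "Base64(Marshal data)--digest"; the helper name decode_rails2_session and the sample cookie are made up for illustration.

```ruby
require 'uri'

# Decode (not decrypt) a Rails 2.x cookie-store session value.
# Only use this on cookies you trust: Marshal.load can instantiate arbitrary objects.
def decode_rails2_session(cookie_value)
  unescaped = URI.decode_www_form_component(cookie_value)
  data, _digest = unescaped.split('--', 2)   # assumed layout: "<base64 data>--<digest>"
  Marshal.load(data.unpack1('m'))            # 'm' is Base64 in pack/unpack
end

# Build a stand-in cookie so the example runs on its own; the digest is fake.
session = { 'user_id' => 42, 'flash' => {} }
encoded = [Marshal.dump(session)].pack('m0') # Base64 without newlines
cookie  = URI.encode_www_form_component("#{encoded}--0123abcd")

p decode_rails2_session(cookie)
```

No secret is needed to read the contents; the digest only protects against tampering.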
Standup 8/17/2010: RubyMine shortcuts, client visible Cucumber results
Posted by Sam Pierson on Tuesday August 17, 2010 at 09:23AM

Help

  • RubyMine Keyboard Shortcuts:

    • How do I move from one side of a split editor window to the other side? Answer: Ctrl-Tab.
    • So how do I figure this out for myself? Helpful-Non-Answers: Cmd-Shift-A pulls up a search box in which you can type commands (e.g. "move") and get suggestions. Preferences -> Keymap has an even better search box, and also a reverse-search (click the funnel next to the search box) that lets you type a keystroke and see which command it maps to. Of course, unless you know that the thing you need is called the "Switcher", you're still not going to find it. Sometimes you just have to resort to interacting with another human (let's call it "using the meat-net"). Company-wide standups FTW.
  • How do I make the results of our integration tests (e.g. using Cucumber) easily visible to a non-technical product owner by putting them on a web page, à la Fit? One answer: use the --format html option, capture the output and copy it to a web server. One project also has a URL that a client can visit that will cause the Cucumber tests to be run.

Selenium in the cloud, saucelabs.com and the SaucelabsAdapter
Posted by Sam Pierson on Thursday February 18, 2010 at 03:45PM

Pivotal has recently started experimenting with a service that runs Selenium tests 'in the cloud' at saucelabs.com. Sauce Labs was founded by Jason Huggins, the creator of Selenium.

The first question that comes to my mind when such services get mentioned is "do I have to give them a copy of our source code and database so they can run this test for us?". In the case of Sauce Labs the answer is no. They use a novel architecture that involves a round trip from your test process (e.g. cucumber or something else using selenium-client) to saucelabs.com, where they run a SeleniumRC server and a browser, then back to your machine where your webserver and database are. Something like this:

[Figure: Sauce Labs Architecture, Simple]

Of course, in the real world we all live behind firewalls, so the architecture is actually slightly more complicated. You set up a reverse SSH tunnel to a machine at Sauce Labs, which the Sauce On Demand service then uses to tunnel to your webserver.

[Figure: Sauce Labs Architecture, Detail]

Of course inserting this roundtrip across the Internet is not without cost. The penalty of running Selenium tests this way is that they run quite slowly, typically about 3 times slower in my experience. This means that this is not something that individual developers are going to do from their workstations.

However, when you consider the issue of cross-browser testing, this starts to look a lot more attractive. During our continuous integration (CI) testing, we like to test against several different browsers. In the past this has meant complex setups involving virtual machines running Windows and other OSes. With Sauce Labs, switching OS and browser is simply a matter of tweaking a parameter. For CI builds, speed is not of the utmost importance, so we are willing to accept a speed penalty in exchange for no longer maintaining complex and fragile build setups.

To ease the integration of this service into Rails projects, we have written a gem, saucelabs-adapter, that does the necessary tunnel setup/teardown and Selenium configuration automatically as part of a Rails test. Currently it supports Test::Unit using Polonium or Webrat, and JsUnit tests. We haven't used it with an RSpec project yet, but I suspect that will be an easy integration.

Today we are open-sourcing the saucelabs-adapter gem.

Standup 9/24/2009: Hiring a Sysadmin, Rails Security Patches
Posted by Sam Pierson on Thursday September 24, 2009 at 01:40PM

Help

Interesting

  • Mouseophobics - Ctrl-Enter on the RubyMine Commit Changes dialog will commit without you having to grab your mouse.

  • Rails Security Patches - The Rails team recently came out with 2.3.4 featuring security patches to fix the recently discovered vulnerabilities. Apparently the plan was to also upgrade 2.2.2 to 2.2.3 with these patches, but they forgot to push the gems. They should be coming to a gem server near you soon.

Standup 9/23/2009: Multiple Rubygems Versions, Abstract AR Classes
Posted by Sam Pierson on Wednesday September 23, 2009 at 09:14AM

Help

How do you deal with an old, soon-to-be-retired codebase that requires old versions of rubygems and rake, and a new codebase that requires new versions of rubygems and rake, on the same machine?

Suggestions:

  1. Have two separate Ruby installations (each with its own gems).
  2. Don't. Use two machines. Optimize for developer resources rather than hardware, the former being much more expensive than the latter.

ActiveRecord Abstract Class doesn't work with validations?

AR allows you to set abstract_class = true. This makes the class uninstantiable, i.e. you can't create an instance of the class; you have to create a subclass of it and create an instance of that.

However, if you create an abstract class that contains validations, then subclass it, the subclass produces errors when attempting to validate. This feature does not seem well thought out.

Has anyone used Capistrano to deploy to a load balanced EC2 cluster?

Suggestions:

  • Follow the "deploy to localhost" path.
  • Use Cap for bootstrapping and Chef for configuration.

Standup 9/21/2009: Cabulous
Posted by Sam Pierson on Monday September 21, 2009 at 01:36PM

Interesting

  • Cabulous is going into beta: UpStart Mobile, makers of Find my Friend, are releasing the beta of Cabulous: "A mobile application that gives cab drivers and their passengers the peace of mind of seeing exactly where each other are from hail to pick-up." Now that's what I call handy. The Cabulous beta is being held in San Francisco, California in November/December 2009.

Slides

Cool Stuff

  • RubyMine (Fuzzy search added 4 days ago)
  • Rack
  • Metal
  • CacheMoney - write-thru caching - overcome replication lag.
  • Rails Templates - install plugins, do VCS stuff etc.
  • Metric Fu - Code analysis: Flay, Flog, Roodi, reek (code smell) & rcov
  • Rails.cache
  • Cucumber
  • FakeWeb - fake entire websites for testing
  • Spike - log analyser
  • Ultrasphinx - full text search
  • Sliding Stats - rack middleware
  • Clearance - authentication
  • Sprinkle - provisioning
  • Passenger Stack
  • Spree - shopping cart setup
  • Webrat - DSL for integration tests
  • Taps - migrate a database from one server to another

Railsconf: HTTP's Best-Kept Secret: Caching - Ryan Tomayko (Heroku)
Posted by Sam Pierson on Thursday May 07, 2009 at 05:18PM


About Ryan

  • http://tomayko.com
  • Sinatra maintainer.
  • Rack core team.
  • Creator and maintainer of Rack::Cache.

HTTP Caching?

  • NOT Rails Caching
  • HTTP caching headers in requests: Cache-Control, If-Modified-Since, If-None-Match
  • and in responses: Cache-Control, Last-Modified, ETag, Vary
  • This stuff is defined in RFC 2616; we won't be going into it that deeply.

Types of Cache

Client cache

  • Built into browsers and other types of client.
  • 1:1 relationship between cache and client. The cache only serves one client (private cache).
  • How much bandwidth does each cache save: can't beat it.

Shared Proxy Cache

  • Set up for an organization
  • 1:many relationship between cache and clients. Serves more than one client (shared cache).
  • Is closer to the client than the server, therefore saves a lot of bandwidth.

Gateway Cache

  • a.k.a. Reverse Proxy Cache
  • Situated inside of the origin site
  • 1:everyone relationship between cache and clients.
  • Reduces bandwidth the least.

Why cache?

  • The answer to this has changed over time.
  • In Nov 1990 there was 1 guy on the web - Tim Berners-Lee.
  • In Feb 1996 the web population was 20M. State of the art connectivity was a 28.8kbps modem. At that speed, loading the current http://yahoo.com (~350k) would take 2:48s. Bandwidth was the largest issue. RFC1945 HTTP 1.0 included the Expires: and Last-Modified: headers.
  • In March 1999 RFC2616 HTTP 1.1 was released. Addressed 1996 caching problems.
  • Today: we cache so we can scale. Keep your back-ends free from as much work as possible. Push as much work up the stack as possible.

HTTP 1.1 defines 2 caching models

Expiration

  • Back-end sets Cache-Control: public, max-age=60
  • Gets cached in the gateway cache and the browser cache.
  • Public says it is good for many clients.
  • Cached for 60s.

Rails example

def show
  expires_in 60.seconds, :public => true
  # stuff
  render ...
end

Sinatra example

headers['Cache-Control'] = 'public, max-age=60'

Validation (Conditional GET)

  • Back-end adds ETag or Last-Modified, e.g. ETag: abcdef012345
  • Last-Modified is redundant, basically there for HTTP 1.0 clients.
  • On 2nd request, gateway cache realizes it has this page in cache, then sends a GET /foo, Host: foo.com, If-None-Match: abcdef012345 to the back-end.
  • If back-end returns a 304 Not Modified, gateway cache returns cached version.

Rails example:

def show
  @foo = Foo.find(params[:id])
  fresh_when :etag => @foo,
             :last_modified => @foo.updated_at.utc
end

Alternative idiom:

def show
  @foo = Foo.find(params[:id])
  modified = @foo.updated_at.utc
  if stale?(:etag => @foo, :last_modified => modified)
    respond_to ...
  end
end

Sinatra example:

get '/foo' do
  @foo = Foo.find(params[:id])
  etag @foo.etag
  erb :foo
end

Combine Expiration & Validation

  • Back-end sets Cache-Control: public, max-age=60 and ETag: abcdef012345
  • For the first 60 seconds, Cache-Control takes precedence and the cached copy is served without contacting the back-end.
  • After 60 seconds, the cache queries the back-end using the ETag.
  • The back-end can then send back a 304 Not Modified with a new Cache-Control: public, max-age=60
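
The combined flow above can be simulated in plain Ruby. This is a toy sketch, not Rack::Cache or any real gateway cache: the Origin and TinyGatewayCache classes are hypothetical, and the clock is passed in as a number of seconds so the example needs no sleeping.

```ruby
require 'digest/md5'

# Stand-in for the back-end. Returns [status, headers, body].
class Origin
  attr_reader :hits, :renders

  def initialize(body)
    @body = body
    @hits = 0      # every request that reaches the back-end
    @renders = 0   # requests that needed a full response body
  end

  def call(request_headers)
    @hits += 1
    etag = Digest::MD5.hexdigest(@body)
    headers = { 'Cache-Control' => 'public, max-age=60', 'ETag' => etag }
    if request_headers['If-None-Match'] == etag
      [304, headers, nil]          # validator matches: no body needed
    else
      @renders += 1
      [200, headers, @body]
    end
  end
end

class TinyGatewayCache
  Entry = Struct.new(:body, :etag, :expires_at)

  def initialize(origin)
    @origin = origin
    @entry = nil
  end

  def get(now)
    entry = @entry
    return entry.body if entry && now < entry.expires_at   # expiration: still fresh

    request_headers = entry ? { 'If-None-Match' => entry.etag } : {}
    status, headers, body = @origin.call(request_headers)
    max_age = headers['Cache-Control'][/max-age=(\d+)/, 1].to_i
    body = entry.body if status == 304                      # validation: reuse stored body
    @entry = Entry.new(body, headers['ETag'], now + max_age)
    body
  end
end

origin = Origin.new('<h1>foo</h1>')
cache = TinyGatewayCache.new(origin)
cache.get(0)    # miss: full 200 from the back-end
cache.get(30)   # fresh: served from cache, back-end not contacted
cache.get(90)   # stale: conditional GET, back-end answers 304
puts "back-end hits: #{origin.hits}, full renders: #{origin.renders}"
```

Three client requests cost the back-end two hits and only one full render, which is the point of combining the two models.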

Misc

  • Never Generate the Same Response Twice

Recommend using Rack::Cache

gem install rack-cache

config.middleware.use Rack::Cache,
  :verbose          => true,
  :metastore        => "file:/var/cache/rack/meta",
  :entitystore      => "file:/var/cache/rack/body",
  :allow_reload     => false,
  :allow_revalidate => false

The client, as well as the server, can control what happens at the cache by sending Cache-Control. A browser refresh sends Cache-Control: no-cache, which means the gateway cache MUST revalidate the ETag with the back-end before sending a response. This is bad, because anyone can pound your back-end just by hitting refresh. :allow_reload => false disables this.

  • High-Performance Caches: Squid, Varnish (Heroku uses this)
  • Interesting discussion about ESI at the end.
  • Rails by default uses the model's id, class name and last-updated timestamp to create an MD5 hash for the ETag.
  • Need to start with a seed that covers your release version, otherwise the ETag will not change when a new release changes the rendered output. Rails now has a mechanism to handle this.
  • 2.3 branch has a new "touch" mechanism too.
  • Browser behavior differs and varies quite significantly when using SSL.
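
To illustrate the ETag notes above, here is a rough sketch of hashing a record's identity together with a release seed. It is not Rails' own implementation; the etag_for helper, the Foo struct and the release strings are all made up for the example.

```ruby
require 'digest/md5'

# Stand-in for an ActiveRecord model.
Foo = Struct.new(:id, :updated_at)

# Hypothetical helper: MD5 over release seed, class name, id and last-updated time.
def etag_for(record, release)
  key = [release, record.class.name, record.id, record.updated_at.to_i].join('/')
  %("#{Digest::MD5.hexdigest(key)}")
end

foo = Foo.new(1, Time.at(1_241_716_680))

puts etag_for(foo, 'release-41')
puts etag_for(foo, 'release-42')   # a new release seed gives a new ETag for the same record
```

Without the seed, a deploy that changes a template would keep serving 304s for unchanged records.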

Slides are online already

Random nuggets from the talk:

The overhead of most requests is calls out of a framework to a DB, FS etc, but because it is called from the framework, that is what gets the blame. This sustains the myth that "<insert your framework of choice> doesn't scale". Solution: put a proxy in front of the server and duplicate the server behind it.

Types of proxy:

  • Transparent
  • Intercepting
  • Caching
  • ...

Transparent Cut-Through Proxy = 90% use case

  • Transparent Proxy - users cannot detect that they are behind a proxy
  • Cut-Through - forwards on the fly (not store and forward)

The Problem

Flaws of Staging environments:

  • Any change in profile of queries invalidates your testing
  • Cost

The Solution

  • What if you could take your production traffic and fork it to two environments?

EventMachine

  • EventMachine implements a design pattern known as the reactor pattern
  • Will connect to any file descriptor (e.g. a socket)
  • Written in C++ for high performance and concurrency without threads
  • EM does have a native thread pool used for EM.defer
  • http://bit.ly/aiderss-eventmachine is an excellent PDF documenting EM
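
For readers new to the reactor pattern, a toy version fits in a few lines of plain Ruby: one loop, IO.select, and a callback per file descriptor. This sketch is not EventMachine's API; TinyReactor and its methods are invented for illustration.

```ruby
# A toy reactor: a single-threaded loop that waits on IO objects and
# dispatches to whichever callback is registered for the readable one.
class TinyReactor
  def initialize
    @handlers = {}   # IO => callback
  end

  def watch(io, &on_readable)
    @handlers[io] = on_readable
  end

  def unwatch(io)
    @handlers.delete(io)
  end

  def run
    until @handlers.empty?
      readable, = IO.select(@handlers.keys)
      readable.each { |io| @handlers[io]&.call(io) }
    end
  end
end

reader, writer = IO.pipe
received = []
reactor = TinyReactor.new

reactor.watch(reader) do |io|
  begin
    received << io.read_nonblock(1024)
  rescue EOFError
    reactor.unwatch(io)   # writer closed: stop watching so the loop can end
    io.close
  end
end

writer.write('hello')
writer.close
reactor.run
puts received.join
```

The same loop could watch sockets instead of a pipe, since IO.select accepts any IO object.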

EM-Proxy

  • http://github.com/igrigorik/em-proxy
  • A simple DSL for writing proxy servers.
  • The return from on_data and on_response blocks is just passed on/back.
  • If you return nil from a block, no data gets forwarded.
  • 5% performance hit for large messages
  • 20% performance hit if messages are very small; mitigate by putting it behind HAProxy and adding another server.
  • No way to send to only 1 back-end server yet (can't implement a load-balancing proxy).
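
The idea of forking traffic to two environments, combined with the "return nil and nothing gets forwarded" rule, can be modelled without any networking. This is only a toy model in plain Ruby: TinyDuplexProxy and its method signatures are hypothetical and are not em-proxy's actual DSL.

```ruby
# Toy model: every request goes to all back-ends, and hook blocks decide
# what is forwarded to them and what is passed back to the client.
class TinyDuplexProxy
  def initialize(backends)
    @backends = backends                              # { name => callable }
    @on_data = ->(data) { data }                      # default: forward unchanged
    @on_response = ->(_name, response) { response }   # default: pass everything back
  end

  def on_data(&block)
    @on_data = block
  end

  def on_response(&block)
    @on_response = block
  end

  # Returns the responses that would be sent back to the client.
  def handle(data)
    forwarded = @on_data.call(data)
    return [] if forwarded.nil?                       # nil: nothing is forwarded
    @backends.map { |name, backend| @on_response.call(name, backend.call(forwarded)) }.compact
  end
end

staging_seen = []
backends = {
  production: ->(req) { "prod: #{req}" },
  staging:    ->(req) { staging_seen << req; "staging: #{req}" }
}

proxy = TinyDuplexProxy.new(backends)
# Only production's answer reaches the client; staging still sees the traffic.
proxy.on_response { |name, response| response if name == :production }

replies = proxy.handle('GET /')
p replies
p staging_seen
```

Staging receives real production requests, while clients only ever see production's responses.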

Misc name-dropping

  • httperf is really good for replaying traffic against a site
  • igrigorik/autoperf - replay nginx logs against your site
  • Recommended we look at MySQL proxy - awesome dashboard.
  • Nginx does really good things with compression (gzip, ETags etc).
  • Mailtrap is a fake SMTP server gem for testing sending email from your Rails app.
  • Defensio is a spam filter for blogs. It has an API you can send comments to, and it will tell you whether each one is spam or not. Returns a 'spam index'.
  • Beanstalk is an in-memory distributed message queue. Despite frequent requests, they have not implemented persistence, which is what motivated Ilya to work around them with this proxy server.
