Pivotal Labs

Main menu

Skip to primary content
Skip to secondary content
  • About
  • Case Studies
  • Team
    • Executives
    • Locations
      • San Francisco (HQ)
      • Boston
      • Boulder
      • Denver
      • London
      • Los Angeles
      • New York
  • Community
    • Blogs
    • Tech Talks
    • Events
  • Careers
    • Lifestyle
    • Principles & Practices
    • Benefits
    • FAQ
    • Apply
  • Contact
    • Press Room
    • Press Releases
    • In The News
    • Press Kit
  • All
  • Labs
  • Standup
  • Tracker

RubyGems Warningitis Outbreak

Alex Chaffee
Thursday, May 12, 2011

Have you upgraded RubyGems lately? Is your console suddenly filled with warnings like this?

NOTE: Gem::Specification#default_executable= is deprecated with no replacement. It will be removed on or after 2011-10-01.
Gem::Specification#default_executable= called from /Users/chaffee/.rvm/gems/ruby-1.9.2-p0/specifications/thin-1.2.7.gemspec:10.

You may be showing signs of a new malady known as Warningitis! So far there is no cure, but doing the following will temporarily cure your symptoms:

gem update --system 1.7.2

Several experimental treatments are being hastily developed as well, but these have not yet been approved by the FDA. Check the “scary warnings are scary” bug thread for more details.

This has been a public health alert. Please do not panic. SARS masks and iodine pills are not recommended at this time.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

monkey patch of the day – activesupport vs. json_pure vs. Ruby 1.8

Alex Chaffee
Sunday, July 25, 2010

The error:

/Library/Ruby/Gems/1.8/gems/json_pure-1.4.3/lib/json/pure/generator.rb:232:in `__send__': undefined method `except' for #<JSON::Pure::Generator::State:0x102f245b0> (NoMethodError)

The environment: Ruby 1.8.7, DataMapper, dm-types, ActiveSupport, or just

require 'json/pure'
require 'active_support'

(as seen in http://gist.github.com/339528)

My solution:

# workaround for activesupport vs. json_pure vs. Ruby 1.8 glitch
if JSON.const_defined?(:Pure)
  class JSON::Pure::Generator::State
    include ActiveSupport::CoreExtensions::Hash::Except
  end
end
  • 0 Shares
  • Share on Facebook
  • Share on Twitter
Pivotal Labs

"Pivotal News Network" Highlights from May

Pivotal Labs
Wednesday, June 2, 2010

The Pivotal News Network has been going strong for six months (Pivots: talk to me if you’d like to share into the feed). Here are some highlights from May:

  • from Startup Product Development: Simple Now vs. Complex Later

When starting any software project, there’s an age old argument: should we build something simple that solves our current problem or should we use an existing product that’s more complex, but more feature rich, since we know that’s where we’re going to end up in the future?

…

an oft neglected repercussion of building too much too quickly is that the extra functionality can calcify your product and make it very rigid. Releases become more complex, new features take longer to implement and bugs take longer to fix. You can find yourself a prisoner of your product, maintaining functionality and features that no one ( or very few ) people use. It can demoralize a engineering team, making them more and more susceptible to the nuclear option: the big rewrite.

I think the tendency to lean towards a more exhaustive solution upfront comes from a time when the effort require to change software was much higher than it is today. When systems were written in C, C++, Perl or even Java, making changes was a large undertaking. The thought of possibly throwing away chunks of code was nerve racking. It represented a huge investment in time and money. However, with todays rapid development languages and frameworks like Ruby/Rails & Python/Django, the investment required to create something, both in time and money, is rapidly shrinking.

  • from Are you punching your users in the face?

Jeff [Patton]’s reply shocked me:

“The Ruby community cares about building high-quality apps, but doesn’t necessarily care about shipping high-value apps.”

Jeff went on to say that the Ruby community is obsessive about craftsmanship. This is a good thing, of course. We test. We write clean code. We take the time and care to build applications that are beautiful and do what our customers ask for.

Therein lies the rub: what customers ask for is rarely what they want, and almost never what they need. As Henry Ford put it, “If I had asked what people wanted, they would have said faster horses.” Or as I put it, your customer may pay you $1000 to deliver him a knuckle sandwich, but no amount of precision or strength training is going to leave you with a happy customer.

It turns out that constructing a high-quality application is not enough – you have to conceptualize and design an application that users will actually find useful. Doing this is every bit as difficult as constructing the software, if not harder. It requires a combination of research – generating new ideas from asking questions & identifying problems – and feedback – testing out ideas you’ve created. The Ruby & Agile worlds have been primarily focused on getting user feedback, without doing the all-important research.

  • from Agile Anti-Patterns: Democratic Design

Weeks ago, some people in the Ubuntu community got a bit disappointed with the distribution’s core team:

We are supposed to be a community, we all use Ubuntu and contribute
to it, and we deserve some respect regarding these kind of decisions.
We all make Ubuntu together, or is it a big lie?

We all make Ubuntu, but we do not all make all of it. In other words, we delegate well. We have a kernel team, and they make kernel decisions. You don’t get to make kernel decisions unless you’re in that kernel team. You can file bugs and comment, and engage, but you don’t get to second-guess their decisions. We have a security team. They get to make decisions about security. You don’t get to see a lot of what they see unless you’re on that team. We have processes to help make sure we’re doing a good job of delegation, but being an open community is not the same as saying everybody has a say in everything.

  • from Velocity as a Goal

    From my experience having velocity as a goal doesn’t make any difference to the motivation of the team which is often cited as the reason for referring to it as a target.
    In all the teams I’ve worked on people are giving their best effort anyway so they can only really have an impact on the velocity by doing one of the following:

    • Working longer hours
    • Cutting corners on quality (by less testing perhaps)
    • Finding a smarter way of working

    …
    In reality I haven’t noticed that people on the teams I’ve worked on pay that much attention to whether velocity is considered a target or not. People just do their job and we pretty much always have the same velocity each week regardless.

  • “If you split test every idea, you’ll end up with either porn or gambling.”

  • “Every time you convince a client to leave their Excel spreadsheets behind and use @PivotalTracker an angel gets his wings.”

More popular shared items:

  • 7 LESSONS LEARNED WHILE BUILDING REDDIT TO 270 MILLION PAGE VIEWS A MONTH

  • The Case for Git Rebase

  • “The best way to handover work is to leave a broken test for your colleague to fix.”

  • Nonblocking ActiveRecord & Rails

  • HTML5 vs Flash performance

  • How many lines of code does it take to create the Android OS?

  • The unimportance of product names (which got smacked down in the Google Reader comments)

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Basic Ruby Webapp Performance Tuning (Rails or Sinatra)

Alex Chaffee
Wednesday, April 28, 2010

My company launched our app, Cohuman, a few weeks ago. The rush of finishing features, fixing bugs, and responding to user feedback has subsided a bit, and it’s time to go back and give the little baby a tune-up. I find that a good development process will ebb and flow, and as long as you don’t let something slide for too long, it’s perfectly acceptable to let bugs, or performance issues, or development chores pile up for a bit and then attack them concertedly for an entire day or two. A bug-fest or chore-fest or tuning-fest can actually increase efficiency as you get in a rhythm… and it feels really good at the end of the day when you see all the bugs you slayed or all the milliseconds you shaved.

In this article I’d like to describe some of my techniques. I make no claim of originality or great expertise; I just want to share what I know, and hear (in comments) what other people have learned. I’m using Sinatra and ActiveRecord, but not Rails; hopefully this discussion will help people no matter what framework they’re using.

Metrics and Logs

The first step, and often the most overlooked, is to gather metrics. Without knowing how it’s working now, how are you going to know what to improve? And how are you going to know whether you made things better or worse? Frequently I’ll make a change that I’m sure will improve performance, only to discover that it’s made no change, or helped in one place but hurt in another.

Where to begin? We’re using New Relic for live performance monitoring, so my decision of what to optimize was easy: I went to their Web Transactions panel and looked at the Most Time Consuming and Slowest Average Response Time reports. If you don’t have a flashing signpost like that, it’s easy enough to decide on a path to work on: either go with user reports, or click around your app and see what feels slow, or choose the most popular request (which is usually the home page).

I always pick a single path to work on, from request to controller to database to view, and work on the slowest parts. This demands more metrics! It’s a common mistake to jump in and start tuning the database when the view is actually taking twice as long. What’s the use of cutting the database access from 400 to 200 msec when the view is taking 1200 msec to render?

I also like to grab a copy of the production DB and bring it to my development machine so I can be sure I’m profiling real cases, and not being fooled by artifacts of generated data. We’re lucky that our app is currently small enough to do this; when the app gets bigger we’ll have to write a script that grabs only selected users’ data as a slice of the whole enchilada. (Note that there are some privacy concerns here: we are careful to only log in locally using our own accounts, and only to gather statistics in aggregate, not to look at details of user-entered data unless it’s to diagnose a specific user-reported issue or bug.)

Lots of in-app metrics tools exist (e.g. ruby-prof, benchmark), but I prefer the simple approach: I rolled my own Marker class that spits out basic msec timing information to the logs. In single-request performance tuning, what matters is relative timing between sections of code, so any objections to this technique on grounds of accuracy or detail are outweighed by its advantages: it’s simple, it shows where your bottlenecks are, and it divides the logs into sections so you can get a sense of who’s making what calls.

class Marker
  def self.mark(msg, logger = ActiveRecord::Base.logger)
    start = Time.now
    logger.info("#{start} --> starting #{msg} from #{caller[2]}:#{caller[1]}")
    result = yield
    finish = Time.now
    logger.info("#{finish} --< finished #{msg} --- #{"%2.3f sec" % (finish - start)}")
    result
  end
end

Usage is simple: pick a block you’re interested and wrap it in Marker.mark("foo") do...end. You can then scan the logs using “less” (or a text editor) and search for the name you gave the block. Marking your controller and your view is a natural place to start; later you can insert marks inside interesting blocks of domain code. In Sinatra, you can do something like this:

get '/foo/:id' do
  foo = Marker.mark("loading foo") do
    Foo.find(params[:id])
  end
  Marker.mark("rendering foo") do
    FooWidget.new(:foo => foo).to_s # Erector
  end
end

I’ve also got a nice little Rack middleware component that marks the time spent inside each request. Note here that you can put lots of fun information in the name that can be helpful for debugging.

class Marking
  def initialize(app)
    @app = app
  end

  def call(env)
    response = nil
    Marker.mark("#{env['REQUEST_METHOD']} #{env['SCRIPT_NAME']}#{env['PATH_INFO']}") do
      response = @app.call(env)
    end
    response
  end
end

Figuring out where a particular log message (especially a DB query) is coming from is essential. It’s important not to make assumptions. If you think you know where the call is coming from, put in a stack trace to make sure, and rerun the request to confirm. That’s why Marker is outputting caller — caller[0] is the code that names the mark, so you already know where that is; caller[1] is the line that called it, and caller[2] is the line that called caller[1]. If that’s not enough context, drop in a logger.info(caller.join(”nt”)) so you can scan the entire stack trace back up to the application code that you understand.

I’ve found that while ActiveRecord (2.3.5) tries to show where a result is coming from, it doesn’t always get it right, especially if you’re using plugins or gems that insert themselves into the call chain. So I monkey-patched AR to be a little smarter about its tracing:

module ActiveRecord
  module ConnectionAdapters
    class AbstractAdapter
      # strip library file pathnames from logged stack traces
      def log_info(sql, name, ms)
        if @logger && @logger.debug?
          c = caller.detect{|line| line !~ /(activerecord|active_support|__DELEGATION__|vendor|new_?relic)/i}
          c.gsub!("#{File.expand_path(File.dirname(RAILS_ROOT))}/", '') if defined?(RAILS_ROOT)
          name = '%s (%.1fms) %s' % [name || 'SQL', ms, c]
          @logger.debug(format_log_entry(name, sql.squeeze(' ')))
        end
      end
    end
  end
end

All this leads to log entries that look like this:

Thu Apr 15 11:09:17 -0700 2010 --> starting GET /app from /Users/cohumancomputer27inmac/dev/cohuman/lib/query_caching.rb:15:in `call':/Library/Ruby/Gems/1.8/gems/activerecord-2.3.5/lib/active_record/connection_adapters/abstract/query_cache.rb:34:in `cache'
  User Load (0.8ms) domain/user.rb:362:in `authenticate_from_login_token'   SELECT * FROM "users" WHERE ("users"."login_token" = E'abc123xyz') LIMIT 1
Thu Apr 15 11:09:17 -0700 2010 --> starting rendering ApplicationPage from /Users/cohumancomputer27inmac/dev/cohuman/controllers/app_controller.rb:4:in `GET /app':/Library/Ruby/Gems/1.8/gems/sinatra-0.9.4/lib/sinatra/base.rb:779:in `call'  Project Load (13.4ms) domain/user.rb:146:in `projects'   SELECT "projects".* FROM "projects" INNER JOIN "memberships" ON "projects".id = "memberships".project_id WHERE (("memberships".user_id = 2))
  User Load (3.2ms) domain/user.rb:135:in `coprojectmates'   SELECT "users".* FROM "users" INNER JOIN "memberships" ON memberships.user_id = users.id WHERE (memberships.project_id in (4,129,122,1,66,82,102,684,533,139,3,155,624,106,90,394,399,153) AND memberships.user_id != 2)   Email Load (2.1ms) domain/user.rb:135:in `coprojectmates'   SELECT "emails".* FROM "emails" WHERE ("emails".user_id IN (1,3,5,7,8,11,12,9,6,14,15,16,17,18,22,27,26,35,45,32,30,79,37,109,80,504,507,508,39,165,521,725,727,729,730,731,734,735,736,105,28,58,240,381,51,40,36,785,834,839,844,847,850,842,840,841,843,889))
  User Load (11.7ms) domain/user.rb:128:in `cohumans'   SELECT "users".* FROM "users" INNER JOIN "cohumanities" ON "users".id = "cohumanities".cohuman_id WHERE (("cohumanities".actor_id = 2))
  Email Load (19.4ms) domain/user.rb:128:in `cohumans'   SELECT "emails".* FROM "emails" WHERE ("emails".user_id IN (1,6,108,8,22,35,509,852,853,854,862,864,866,3,895,896,897,929,930,931,30,944,165,827,976,977,978,735,1024,2003,2004,59))
  SQL (0.5ms) domain/user.rb:177:in `temporary?'   SELECT count(*) AS count_all FROM "emails" WHERE ("emails".user_id = 2)
  Email Load (0.3ms) domain/user.rb:173:in `verified?'   SELECT * FROM "emails" WHERE ("emails".user_id = 2)
Thu Apr 15 11:09:17 -0700 2010 --< finished rendering ApplicationPage --- 0.307 sec
Thu Apr 15 11:09:17 -0700 2010 --< finished GET /app --- 0.311 sec

I know it can look daunting, but when scanning logs, it’s important to keep a clear head. Let’s examine this little burst of gibberish and try to make sense of it.

Line 1 says “–> starting GET /app” which means that the user has made a GET request for our main URL. We can skip ahead (search for “–< GET /app”) and see that the entire request took 0.311 seconds. This isn’t bad, but it could be better.

Line 3 says “–> starting rendering ApplicationPage” which means that all the other queries are happening from inside the rendering view code.

Note that the database queries are only taking 49.3 msec out of 311 msec, which means 84% of the time is spent either processing DB results or rendering them. This request is probably not a good candidate for DB-level tuning.

(How’d I add up all those scary milliseconds without an abacus? Piped my log text into this bad boy:

  ruby -e 'x = 0; STDIN.each do |line| if line =~ /(([0-9.]*)ms)/; then x += $1.to_f; end; end; puts x'

)

Indexes

Most (if not all) databases add an index for the primary key of a table. But a quick scan of the database logs will show many fields that are used in queries, and chances are you haven’t added indexes for them. (In fact, you probably shouldn’t add an index for a field until it shows up in the logs, since indexing slows down writes and takes up extra disk space. Not a lot, but it might add up.) In the above example, look at the User Load — every time a user hits the site we check to see if he’s logged in by querying the database for his login cookie. Adding an index for the “login_token” field in the users table sped up this query by a factor of 10. (Yes, that violates my “don’t fix what ain’t slow” dictum, since going from 10 ms to 1 ms isn’t really fixing much, but I figure it adds up over time since it happens on every single app request.)

Avoidance

The only perfect program is the one with zero lines of code. And the fastest code is that which is not run.

Sometimes you can optimize a section of code by removing unnecessary calls from your app layer. One nice trick these days is to move stuff behind an Ajax call. In Cohuman, we do this with some of our tabs: if you switch to a tab, and it hasn’t been loaded yet, it shows a spinny and starts an Ajax call to load it in. As long as we can keep each Ajax call under a second in length, the user-perceived delay is negligible.

Query Caching

ActiveRecord maintains a query cache, so if you run the same query (and I mean the same SQL), it won’t hit the database again. But if you’re not using Rails, query caching is disabled by default. So I wrote yet another Rack middleware so I don’t have to remember to wrap all my controllers in a ActiveRecord::Base.cache do block:

# a Rack middleware component that enables ActiveRecord query caching
# To use, put "use QueryCaching" in your Sinatra app.

class QueryCaching
  def initialize(app)
    @app = app
  end

  def call(env)
    if is_static_file?(env)
      @app.call(env)
    else
      response = nil
      ActiveRecord::Base.cache do
        response = @app.call(env)
      end
      response
    end
  end

  def is_static_file?(env)
     # if the path end with a dot-extension (e.g. 'foo.jpg') then we assume
     # it's a static file and don't enable the query cache. (This will only
     # work for some application URL schemes, naturally.)
    env['PATH_INFO'] =~ //[^/]*.[^/.]+$/
  end

end

Note that this is a query cache, not an object cache (see below).

Query Tuning

Once you’ve identified some troublesome queries, you need to decide how to optimize them. You’ve basically got two choices here; which to choose should be obvious from the logs. Are there many low-latency queries, or a few high-latency queries? High-latency queries are an obvious target, and you should do your best (with indexes and SQL) to cut them down to size, but don’t let them distract you. There are two hidden costs to low-latency queries:

They actually take longer than they say they do – the AR log line only displays the time for the database connector to return the raw data. It doesn’t show the time to create AR instances, build association “classes” (which takes an annoyingly long time, since all their methods are built on the fly for each instance), and run post-load initialization code. (I just did a little experiment loading ~3000 of our User objects, which have a fair number of associations; SELECT * FROM users took 21 msec but User.all took 547 msec. That’s about 25x as long!)

They stack up, and I’m not talking pancakes – chances are you’ve got a lot of webapp processes hitting a single database (or a small number of slaves). As traffic increases, the queries will stack up like airplanes requesting permission to land. At a certain point you’ll hit a cliff (sorry for the mixed metaphor — it’s not fun to imagine a plane hitting a cliff) and per-request latency will rise dramatically. Lowering the number of queries per web request will, um, raise the ceiling? Lengthen the runway? Lower the cliff? Anyway, it’ll make this problem, uh, less worse. It’s kind of counterintuitive, but the limiting factor for modern webapps is really the number of queries, not the amount of data returned by each query.

ActiveRecord associations (like has_many and belongs_to) are great for getting an app up and running, but as you peruse your logs you’ll notice some things they’re doing that aren’t very efficient. Our app loads a lot of objects, each of which has lots of associated objects, some of which associate to other objects. If we’re displaying a list of Users, and each user has associated Emails (via has_many :emails), and we want to render a list of users and their email addresses, we’ll probably see one query that loads all users, and then one query for each user loading his or her emails.

Adding an :include to the declaration is a good way to reduce these from N+1 to 2, but it doesn’t always work. I have never been able to comprehend AR’s alien logic, so my logs are often littered with queries despite my best efforts fiddling with the association declaration. Furthermore, AR is quite naive about object graphs: for example, user.emails.first.user will make an extra query and return a different user than the one you started with, even though they have the same id and you loaded the emails via :include.

So I’ve gotten good performance boosts by moving away from ActiveRecord and doing some nested queries by hand. Not by writing literal SQL, but by doing one query, extracting the necessary ids, and then doing the next query, and saving or plugging in values directly. This led naturally to Treasury (see below).

Sometimes, of course, writing SQL is unavoidable; fortunately, AR allows it, and there are many people who are much better at that than I, so I won’t embarrass myself by discussing it further here.

Object Caching and the Repository Pattern

During my first pass at tuning Cohuman several months ago, I took a little time to write a library that implements a Repository Pattern. The Treasury is a work in progress that sits in front of ActiveRecord (and eventually, other ORMs) and caches object instances as they pass through. If you then request an object via the Treasury, it will check in its cache and return a pointer to the existing object instead of making a query; if you specify a list of ids, then it will only query for the ones it doesn’t yet have. (There are other features I won’t go into here, including a DSL for building queries… expect an upcoming article to officially introduce Treasury to the world.)

I’ve heard that DataMapper has an object cache, but I haven’t yet dug into the details of how it works, so I don’t know if Treasury is redundant with it, or if it would make sense to plug in DM behind it. (I’ve also heard it solves the N+1 query problem gracefully. Anyone want to proselytize DM in the comments?)

All the caches I’ve mentioned only persist within a single request. This is probably a good thing, since allowing instances to persist between requests would open a can of data integrity, thread safety, and multi-host worms. But I can’t shake this vision I have of a sort of in-process memcache for Ruby objects, where multiple processes communicate changes to each other via TCP wormholes… Does anyone else share this vision, or am I doomed to wander the Ruby blog desert, mumbling incoherently at strangers?

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

UTC vs Ruby, ActiveRecord, Sinatra, Heroku and Postgres

Alex Chaffee
Friday, January 22, 2010

Now that I’m starting to use DelayedJob to perform jobs in the future in my Heroku Sinatra app, its important that they happen at the scheduled time. But unless you pay attention, you’ll find that times get mysteriously changed — in my case, since I’m in San Francisco in the wintertime, by +/-8 hours — which means that some conversion to or from UTC is being attempted, but it’s only working halfway.

Trying to keep a handle on which libraries are attempting, and which are failing, to convert times is a losing battle, so I’m trying to do the right thing and save all my times in the database in UTC, and convert them to and from the user’s local time as close to the UI as possible. Unfortunately, a variety of gotchas in Ruby and ActiveRecord and PostgreSQL makes this trickier than it should be. Here’s a little catalog of my workarounds.


You must set both Time.zone = "UTC" and ActiveRecord::Base.default_timezone = :utc. Since I’m using Sinatra, not Rails, this stuff goes either in main (i.e. not inside any class) right after require 'active_record', or in a configure block in your app, depending on your preference.


When ActiveRecord creates queries — which are used for both reading and writing, mind you — it will only convert to UTC times that are instances of ActiveSupport’s proprietary TimeWithZone class. It will not convert regular Ruby Time objects, even though Time objects are perfectly aware of their time zones, and AR is perfectly aware that you’d prefer they be written as UTC (due to the default_timezone setting). This is clearly a bug IMHO, but the Rails core marked the bug as “will not fix”, so w/e. Here’s a monkey patch, courtesy of Peter Marklund:

  module ActiveRecord
    module ConnectionAdapters # :nodoc:
      module Quoting
        # Convert dates and times to UTC so that the following two will be equivalent:
        # Event.all(:conditions => ["start_time > ?", Time.zone.now])
        # Event.all(:conditions => ["start_time > ?", Time.now])
        def quoted_date(value)
          value.respond_to?(:utc) ? value.utc.to_s(:db) : value.to_s(:db)
        end
      end
    end
  end

When outputting timestamps to a UI — either inside HTML or in a JSON API — you’ll probably want to use Time#strftime. Beware: on Mac OS X under Ruby 1.8, the %z (lowercase Z) selector will emit the local time zone, not the zone of the Time object you’ve called strftime on. The solution is to either use %Z (capital Z) or just a plain Z which stands for Zulu Time. The latter is OK if you know you’re using UTC, which, if you’ve followed my advice, you probably do. This is a pretty annoying issue, since it’s much safer to use %z‘s hour offsets than %Z‘s three-letter codes, since the three-letter codes can be ambiguous, and in any case require an extra conversion to time offset, so you may as well just emit the offset.

Here are some methods on Time you may want to use that work around this %z issue:

  # Note: do NOT call this file 'time.rb' :-D

  require 'time'

  class Time
    def full_date_and_time
      strftime('%Y-%m-%d %H:%M:%S %Z')
    end

    def iso8601
      strftime('%Y-%m-%dT%H:%M:%SZ') # the final "Z" means "Zulu time" which is ok since we're now doing all times in UTC
    end
  end

That iso8601 method comes in really handy when you’re using the excellent timeago jQuery plugin by Ryan McGeary (@rmm5t).


By default PostgreSQL saves timestamps sans time zone, which means that ActiveRecord interprets them as being in the default_timezone. If you want to be extra clear and save them with time zone, you’ll have to change the Postgres adapter’s type mapping. ActiveRecord doesn’t let you configure this but here’s a monkey patch, courtesy of
Chirag Patel (with a couple of mods):

    require 'active_record/connection_adapters/postgresql_adapter'
    class ActiveRecord::ConnectionAdapters::PostgreSQLAdapter < ActiveRecord::ConnectionAdapters::AbstractAdapter
      def native_database_types
        {
          :primary_key => "serial primary key".freeze,
          :string      => { :name => "character varying", :limit => 255 },
          :text        => { :name => "text" },
          :integer     => { :name => "integer" },
          :float       => { :name => "float" },
          :decimal     => { :name => "decimal" },
          :datetime    => { :name => "timestamp with time zone" },
          :timestamp   => { :name => "timestamp with time zone" },
          :time        => { :name => "time" },
          :date        => { :name => "date" },
          :binary      => { :name => "bytea" },
          :boolean     => { :name => "boolean" }
        }
      end
    end

It turned out that I didn’t need this, so I ended up commenting it out. It may be that storing timestamps with time zones will cause a hiccup with some other random DB code, so watch out. If you do use it, and you’ve already got some data, make sure to write a migration that changes the types of all extant datetime and timestamp fields, and maybe a migration that shifts the times too.


That’s all I’ve got for right now. I’m sure some more problems will come up on March 14, 2010…

  • 0 Shares
  • Share on Facebook
  • Share on Twitter
Pivotal Labs

Standup 1/4: XSS Galore

Pivotal Labs
Tuesday, January 5, 2010
  • XSS #1: There’s a huge cross-site scripting hole if you use the meta refresh tag…it has a “data” attribute into which you can insert arbitrary javascript.

  • XSS #2: Cross-site scripting resources, from an internal mailing list:

    • “I’ve gained a new appreciation for the importance of carefully thinking through security and escaping in RoR there’s more than just h()’ing all your user entered data.”

    • XSS vulnerabilities – http://ha.ckers.org/xss.html.
      Very useful catalog of different XSS vectors. Includes some utilities to base64-, URL- and hex- encode attacks so you can test out your apps.

    • General OWASP wiki – http://www.owasp.org/index.php/Main_Page. Lots of useful data information here. OWASP is a nonprofit group charted to improve the security of webapps in general.

    • Security Guide for RoR -
      http://www.lulu.com/product/download/owasp-ruby-on-rails-security-guide/4489819
      general guidelines/things to think about for securing RoR apps.

    • Loofah – http://github.com/flavorjones/loofah is supported by a fellow Pivot and provides fast and good sanitization built on Nokogiri, albeit slightly slower on short strings than brittle regular expressions. It’s in production at several companies.

      “Loofah excels at HTML sanitization (XSS prevention). It includes some nice HTML sanitizers, which are based on HTML5lib’s whitelist, so it most likely won’t make your codes less secure.”

  • Happy New Year

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

annotate 2.4.0 released

Alex Chaffee
Sunday, December 13, 2009

Remember the annotate_models rake task? Dave Thomas wrote it many years ago and it corrects one of the flaws in ActiveRecord: it describes the schema for a table as a comment inside the Ruby model file that it maps to. Unfortunately Dave hasn’t had time to maintain it, so a couple of years ago I cleaned up some bugs and re-published it as a pastie. Then Cuong Tran made it a gem and put it on Github, and since then, there’s been a whole lotta forkin’ goin’ on!

I recently pulled in a bunch of the forks into ctran’s master branch, and just pushed it to Gemcutter as version 2.4.0. Just run gem sources and make sure http://gemcutter.org is in your list — otherwise do gem source -a http://gemcutter.org — and sudo gem install annotate and it’ll install a binary called annotate in /usr/bin. See the README on github for more info and have fun!

One caveat: ImageMagick installs a tool called annotate too (if you’re using MacPorts it’s in /opt/local/bin/annotate). So if you see

Usage: annotate imagein.jpg imageout.jpg

then put /usr/bin ahead on the path and you’ll get ours instead.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter
Pivotal Labs

Announcing the "Pivotal News Network" RSS Feed

Pivotal Labs
Saturday, November 14, 2009

We’ve pooled some Pivot shared tech news feeds and made this feedburner feed:

http://feeds.feedburner.com/pivotal-news-network

The content is in the spirit of Blabs, so we hope readers here might find it to be useful. See what you think.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Why Wouldn't You Use Erector?

Alex Chaffee
Thursday, October 8, 2009

No, seriously. Why wouldn’t you use Erector? Cause I think it’s a pretty awesome view framework, but for some reason it hasn’t caught fire yet. So if you think writing actual Ruby to emit HTML, with a clean, nestable syntax with full support for Ruby features like inheritance, delegation, and yield is neat, but there’s something holding you back, then please let us know what it is. At best we can fix it, and at worst, at least we’ll know why.

Here are some reasons I think you might not use Erector:

You love angle brackets. If this is the case then I can’t help you. I don’t think anybody can.

You like typing every tag name twice. Since Erector elements are Ruby statements, every open tag gets automatically closed.

You like invalid HTML. Since Erector elements are Ruby statements, every open tag gets automatically closed. (See how that works?)

You always remember to call ‘h’. Rails 3.0 is going to HTML-escape all output by default. Erector’s been doing this the whole time. Cause, you know, why wouldn’t you?

You like having to rewrite your code when you extract a partial, and then again when you extract a helper method. In ERB, templates, partials, and helpers all have slightly (and annoyingly) different syntax for things like referring to variables and calling other code. Erector is all Ruby, so you can use your favorite refactoring browser, or just cut and paste, to move your code around. Check out this excerpt from Jeff Dean’s RailsConf talk to see this in action, or read the slides from the whole talk on SlideShare.

You hate encapsulation. You think that your views should have direct access to all the instance variables of your controller. Unless they’re partials, in which case you shouldn’t, even though you can, although the names might be different. Confused yet? So am I.

You like putting code for one component in three separate files. Erector’s new “externals” feature allows you to put all the code — HTML, CSS, and JavaScript — inside a single Ruby class. The CSS and JavaScript then get output inside the HEAD, once per HTML page, while the HTML gets rendered in the BODY as usual, as many times as necessary. This follows the OO paradigm of cohesion, otherwise known as “put similar stuff together,” which is the complement of loose coupling, which means, “keep different stuff apart.”


Okay, so those were sarcastic reasons. Here are some more possible reasons why you wouldn’t use Erector. I suspect that these next ones hit closer to the mark. But I believe that they’re all specious, if not downright false.

Your site contains a whole lot of complex HTML and a few inserted Ruby variables. OK, this makes sense. Erector’s not great for static sites. But I’ve never personally worked on a web application where the code inside the views didn’t quickly get complex enough to require codey things like loops and functions. And if you’re writing code, then why not do it in a programming language?

Your designers don’t know Ruby. I’ve heard this complaint a lot, but I have yet to meet this mythical designer who’s smart enough to understand modern HTML, CSS, JavaScript, ERB, and partials, but is not smart enough to learn that “div ‘foo’, :class=>’bar’” outputs “<div class=’bar’>foo</div>”. On the contrary, I’ve worked with several designers who, after a few tutorial pairing sessions, were comfortable checking code in and out and editing Erector view code at will. Like any junior coder, they need to stay away from the tough stuff, but they’re pretty good at knowing what they don’t know and asking for help when they need it. (Which they would also do if working inside ERB.)

View code needs to look as similar to HTML as possible. Well, I hear this, but have you looked at HAML? That language is hella popular, and it doesn’t look anything like HTML. Its structure is similar, in the abstract, but so’s Erector’s, and at least in Erector the method for emitting a div is called, you know, “div”. And it’s a method. And I don’t want to turn this into a war between HAML and Erector — I think HAML is gorgeous — but HAML suffers from the same design flaw as every templating technology: views are not objects, and markup isn’t code. After a certain point of complexity, HAML’s elegance breaks down and you’d be better off doing loops and functions in code.

You’ve already got a bunch of stuff in ERB and it’d take too long to convert it. Yes, legacy code is a pain, but we have a command-line tool that converts ERB (or HTML) to Erector to make it a bit smoother. And you don’t have to convert your whole app to Erector at once. Erector views can interoperate with ERB or HAML in Rails and Sinatra.

You’re stuck on an old version of Erector. Yes, legacy code is a pain, but we have an upgrade guide for getting to 0.6.0, and people on the mailing list ready to help.

Erector’s too slow. Lies! Erector is faster than a greased rattlesnake going downhill. Check out these benchmarking results. Erector is about 2x as fast as ERB and 4x as fast as HAML about the same speed as ERB and HAML(*) under typical conditions. We make sure to use the same output stream to minimize string copy or realloc, and using Ruby objects means much lower parsing overhead.

(*) Update: the “2x/4x” figure was based on a benchmark program that didn’t use template caching, which speeds things up for both ERB and Haml. With template caching, Erector and Haml are about the same speed; Haml is about 20% faster when rendering a page with no partials. See this ongoing thread on the Erector list.

There’s no documentation. More lies! We have a whole bunch of documentation at http://erector.rubyforge.org, including a FAQ and a user guide.

You got burned by Markaby. Underneath the elegant facade of Markaby lay a confusing and often counter-intuitive engine. Its use of instance_eval and other tricks made simple things break in weird ways and made debugging a real chore. Erector was born out of those frustrations, and one of its main design goals is “no magic.” Also, there was a long time where Markaby wasn’t being maintained (although that’s changed recently); we have a core group of developers committed to responding on the mailing list and github, and we run integration tests against the latest stable Rails release (and soon, against Edge) to catch incompatibilities early on.

Rails has all these great helpers and I want to keep using them. Okay, go right ahead! Erector’s Rails integration allows you to call any helper, either directly through a proxy method, or indirectly through the helpers object. If you find a helper that doesn’t work, let us know and we’ll add it to the list of supported helpers. (We haven’t done all of them yet because it’s a pain in the neck to look at each function and figure out what its input and output semantics are. Does it return a string or emit directly onto the output stream? Does it take a block? An options hash? An html_options hash? Etc.) We’re also slowly putting some Rails functionality into Erector, either in the base class or in custom widgets. If there’s something you need, ask on the mailing list, or better yet, send us a patch.

Its name is a dirty word. I’ve heard this more from people who didn’t grow up in the United States, where the Erector Set was a popular toy among the 6-to-12-year-old DIY set in the 70s and 80s. (Apparently it was called Meccano in the UK.)
Erector is a normal word, used all the time in the news and in business names. And as the name of a view library it’s evocative in a way that’s relevant and interesting, in that it’s a builder, and you build a view up out of parts.

But we have heard this complaint, and in sympathy, changed the name of the command-line tool (oh, sorry, guess I can’t say “tool” either)– uh, executable– from “erect” to “erector” even though the former is a venerable English verb that’s grammatically appropriate (”I asked him to erect the scaffolding.”). If you introduce the library and your coworkers get all giggly then I think if you just say the name with a straight face and then roll your eyes and mock your bawdy buddies when they snicker then all will be well. After a few repetitions it won’t sound odd at all.

You’ve never heard of it. Help spread the word! Post a review on your blog! Ask your favorite app framework whether they support it! Post code samples in Erector and when people say “What’s that?” then point them at http://erector.rubyforge.org! Give a talk at a meetup! Write your congressman and ask if she supports the Erector Mandate Bill of 2009! Buy ad space on the moon!


So, in conclusion, and despite my somewhat snarky tone throughout, I am honestly and desperately curious to know why the world has not yet beat a path to Erector’s door. Anybody got any more ideas?

  • 0 Shares
  • Share on Facebook
  • Share on Twitter
Joe Moore

Testing Desert Plugins in Isolation

Joe Moore
Saturday, August 22, 2009

At Pivotal, some of our client projects use plugins from our home-grown social networking platform and rely on Desert to tie them all together. To test this package of plugins we created a project that contains all of our Desert plugins and wrote some rake tasks that run all of their tests. Great, right?

Mostly. We want to ensure that our plugins have the absolute minimum dependencies to function. Let’s pretend we have an UserAuth plugin and a SocialPivots plugin, where UserAuth has no dependencies, but SocialPivots depends on UserAuth. We would like to test these the to plugins in isolation. But, with Desert doing it’s job so well, our UserAuth plugin could have a dependency on the SocialPivots plugins’ models or tables and we would never know it. Everything from SocialPivots is mixed-in and loaded into memory, and all of its migrations have executed, at the time we are running UserAuth’s tests.

What we need is a way to tell Desert to load only the plugin under test, plus its dependencies listed in init.rb. Hacking Desert and Rails to allow us to specify which plugins to load turned out to be pretty easy. Check it out (full gist here):

Here, we override plugin loading:

# lib/plugin_dependency_limiter.rb
class Rails::Initializer
  def load_plugins
    # Only load the plugin under test
    Rails::Plugin.new("vendor/plugins/#{ENV['PLUGIN']}").load(self)
  end

  def add_plugin_load_paths
    # Do nothing.  We'll handle plugin loading ourselves.
  end
end

We might have many plugins in vendor/plugins but we only want to test our Desert plugins. We list them in config/plugins/plugins_to_test.yml and keep track of them here:

# also in lib/plugin_dependency_limiter.rb
class Rails::Plugin
  def self.plugins_of_interest
    # Keep track of the Desert plugins we care about
    @of_interest ||= YAML.load_file(RAILS_ROOT + "/config/plugins/plugins_to_test.yml").collect
  end

  def self.tracked_plugins
    @tracked_plugins ||= []
  end

  def require_plugin_with_plugin_tracking(plugin_name)
    self.class.tracked_plugins << plugin_name if self.class.plugins_of_interest.include?(plugin_name)
    require_plugin_without_plugin_tracking(plugin_name)
  end
  alias_method_chain :require_plugin, :plugin_tracking
end

Once you can control which plugins are loaded you can expand this to dictate which routes and migrations should be run:

Routes:

# config/routes.rb
ActionController::Routing::Routes.draw do |map|
  if ENV['PLUGIN']
    Rails::Plugin.tracked_plugins.each do |plugin|
      map.routes_from_plugin(plugin)
    end
    map.routes_from_plugin(ENV['PLUGIN'])
  end
end

Migrations:

# db/migration/001_migrate_desert_plugins.rb
class MigrateDesertPlugins < ActiveRecord::Migration
  def self.up
    Rails::Plugin.tracked_plugins.each do |plugin|
      migrate_plugin(plugin, nil)
    end
    migrate_plugin(ENV['PLUGIN'], nil)
  end

  def self.down
    raise ActiveRecord::IrreversibleMigration
  end
end

Let’s run some tests! Though we don’t provide the rake task here, it runs db:drop db:create db:migrate db:test:prepare before running the plugin tests.

First, the UserAuth plugin that has no dependencies:

$ rake testspec:plugins PLUGIN=user_auth
(in /Users/pivotal/workspace/desert_plugins)
==  MigrateDesertPlugins: migrating ===========================================
==  CreateUserAuthStuff: migrating ====================================================
-- create_table(:user_auth_stuff)
   -> 0.0023s
==  CreateUserAuthStuff: migrated (0.0026s) ===========================================

==  MigrateDesertPlugins: migrated (0.0076s) ==================================

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby -I"lib:test" "/Library/Ruby/Gems/1.8/gems/rake-0.8.7/lib/rake/rake_test_loader.rb"
....................................

Finished in 0.494227 seconds

36 examples, 0 failures

Now the SocialPivots Plugin, which relies on UserAuth:

$ rake testspec:plugins PLUGIN=social_pivots
(in /Users/pivotal/workspace/desert_plugins)
==  MigrateDesertPlugins: migrating ===========================================
# Look, the migrations from UserAuth!
# Look, the migrations from UserAuth!
# Look, the migrations from UserAuth!
==  CreateUserAuthStuff: migrating ====================================================
-- create_table(:user_auth_stuff)
   -> 0.0201s
==  CreateUserAuthStuff: migrated (0.0204s) ===========================================

==  CreateSocialPivotStuff: migrating ==============================================
-- create_table(:social_pivot_stuff)
   -> 0.0034s
==  CreateSocialPivotStuff: migrated (0.0036s) =====================================

==  MigrateDesertPlugins: migrated (0.0332s) ==================================

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby -I"lib:test" "/Library/Ruby/Gems/1.8/gems/rake-0.8.7/lib/rake/rake_test_loader.rb"
........................................................................

Finished in 1.381141 seconds

72 examples, 0 failures

I hope this code makes testing Desert plugins easier. Thanks to Pivots Adam and Edward for pairing with me through these solutions.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Topics

  • agile (780)
  • rails (113)
  • testing (88)
  • ruby (83)
  • ruby on rails (70)
  • jobs (62)
  • javascript (55)
  • techtalk (44)
  • rspec (38)
  • ironblogger (32)
  • productivity (30)
  • activerecord (29)
  • gogaruco (29)
  • git (28)
  • nyc (27)
  • rubymine (26)
  • bloggerdome (23)
  • mobile (22)
  • process (21)
  • pivotal tracker (20)
  • cucumber (20)
  • jasmine (19)
  • design (18)
  • ios (18)
  • webos (17)
  • objective-c (17)
  • android (16)
  • palm (16)
  • "soft" ware (16)
  • fun (15)
  • tracker ecosystem (15)
  • ci (15)
  • cedar (15)
  • rails3 (14)
  • performance (14)
  • bdd (14)
  • gem (13)
  • css (13)
  • tdd (13)
  • selenium (12)
  • goruco (12)
  • bundler (12)
  • meetup (11)
  • railsconf (11)
  • nyc-standup (11)
  • capybara (10)
  • mac (10)
  • mojo (10)
  • chef (10)
  • api (10)
Subscribe to ruby on rails Feed
  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. →
  • About
  • Case Studies
  • Team
  • Community
  • Careers
  • Contact
  • Labs
  • Events

Contact Us

contact@pivotallabs.com
+1 415-77-PIVOT
TwitterLinkedInFacebook

Pivotal Tracker

Tracker is the award-winning agile project management tool that enables real-time collaboration around a shared, prioritized backlog.
Visit pivotaltracker.com >