Pivotal Labs

Main menu

Skip to primary content
Skip to secondary content
  • About
  • Case Studies
  • Team
    • Executives
    • Locations
      • San Francisco (HQ)
      • Boston
      • Boulder
      • Denver
      • London
      • Los Angeles
      • New York
  • Community
    • Blogs
    • Tech Talks
    • Events
  • Careers
    • Lifestyle
    • Principles & Practices
    • Benefits
    • FAQ
    • Apply
  • Contact
    • Press Room
    • Press Releases
    • In The News
    • Press Kit
  • All
  • Labs
  • Standup
  • Tracker

Stop leaky APIs

Robbie Clutton
Wednesday, May 15, 2013

There are many blogs about how to expose an API for a Rails application and many times I look at this and am concerned about how these examples often leak the application design and the schema out through the API. When this leak occurs a change to the application internals can ripple out and break clients of an API, or force applications to namespace URI paths which I feel is unnecessary and ugly.

When the only consumer of application data models are the views within the same application then the object design can be fluid and malleable. Once an application exposes an API to more than one client, and especially if that client is on a different release cycle to the server, such as iPhone application, data models become rigid. Rails tends discouraged N-tier architecture to the benefit of development speed but APIs are contracts between a server and it’s client and can be difficult to change once they start being used.

Passing an object into the Rails JSON serialisation methods will work for a time, but relying on this will only get you so far. At some point a refactor will take place that will cause a breaking change. It could be something simple such as renaming a column, moving responsibilities from one class to another or adding extra meta-data to a response. Either way, adding this information into your model class starts to place more responsibilities into one place.

There are a few ways out of this potential issue. Let’s take a look at the classic blog application and its Post object. The Rails rendering engine will call as_json on an object if the request has sent the content-type of application\json to the server. Here we override the implementation from ActiveRecord to provide a stable, known version:

def as_json(options={})
    {
        author_id: author.id
        title: title
    }
end

A second option is to model the object explicitly and serialise the internal model into a public representation. We can duck-type the object to respond how ActiveRecord objects behave during a serialisation call. Although this can be seen as a step towards a N-tier architecture, it’s also a step towards service dependent abstraction:

class Api::Post
  attr_reader :post

  def initialize(post)
    @post = post
  end

  def as_json(options={})
    {
      author_id: post.author.id
      title: post.title
    }
  end
end

The benefit of doing this is a separation of concerns between your data model and the data presentation. An application model doesn’t need to know how it’ll be represented by an API, command line interface or any other outside communication mechanism. If an application were tending more towards HATEOAS for instance this separation could help resolve hyperlinks relevant to the interface. You may lose some of the Rails respond_with goodness with this:

respond_to :html, :json

def show
  post = Post.find(params[:id])
  respond_to |format| do
    format.html { @post = post }
    format.json { render json: Api::Post.new(post) }
  end
end

That can be regained with the help of a presenter:

respond_to :html, :json

def show
  post = Post.find(params[:id])
  @presenter = PostPresenter.new(post)
  respond_with @presenter
end

Where PostPresenter may look something like:

class PostPresenter < SimpleDelegator
  def as_json(options={})
    Api::Post.new(self).as_json(options)
  end
end

What’s the difference between this and putting the as_json method into Post directly? More control, separation of concerns with application modeling vs presentation and the big win is when breaking changes occur within the API. Now we can put version relevant information into new objects, or into the serialised class itself.

class Api::Post
  attr_reader :post, :version

  def initialize(post, version)
    @post = post
    @version = version
  end

  def as_json(options={})
    send("v#{:version}")
  end

  private
  def v20130505
    # version specific JSON
  end

  def v20121206
    # version specific JSON
  end
end

Through this we have versioning information in one place and through a request parameter of something like v=20130506 the application can handle multiple versions in one object. For me, this ultimately removes URIs like /v1/posts, but why is that important? The URI is an identifier which points to a resource and having v1 or v2 in the URI muddies the fact that the two identifiers are pointing to the same resource. Using a request parameter, much like pagination is handled, means we can ask for a representation of that resource rather than having to specify different resources. Then we can do away with needing controllers such as Api::V1::PostsController and just deal with Api::PostsController or even just PostsController and deal with the versioning within the object instead of the URI path.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

ElementalJS and SimpleBDD open source updates

Robbie Clutton
Tuesday, May 14, 2013

Thanks to the benefits of Open Source software and working with great people, I’m pleased to announce some updates to both ElementalJS and to SimpleBDD.

ElementalJS

Thanks to Ian Zabel who made a performance improvement to ElementalJS after fighting a large DOM in Internet Explorer. Elemental will now load the behaviours much quicker if the document is passed as no filtering will take place. If another node is passed, filtering will be applied but the thought is that DOM will be much smaller so hopefully won’t hit this issue.

SimpleBDD

Thanks to Adam Berlin who made two improvements to SimpleBDD. First was the addition to also to the syntax and second was NoMethodError is replaced by pending if using RSpec.

More improvements were made by Daniel Finnie which allow use of some non alphanumeric characters that get turned into method names.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Stop leaking ActiveRecord throughout your application

Robbie Clutton
Monday, May 6, 2013

Extending ActiveRecord::Base leaks a powerful API throughout an application which can lead to tempting code which breaks good design. Take the classic blog example where you may want to retrieve the latest posts by a given author. You may have seen, or even written code that gets the dataset you need straight into the controller or view:

Post.where(author_id: author_id).limit(20).order("created_at DESC").each { ... }

For me this is a design violation as well as breaking the “Law of Demeter”[Edit: Current Pivot Adam Berlin and former Pivot John Barker pointed out that chaining with the same object was not a Demeter violation]. The example above tells me structure of the schema that the calling class has no business knowing. It also makes testing using stubs ugly and encourages testing against the database directly. A test would have to chain three methods to stub a return value. It’s brittle, as in it’s susceptible to breaking due to changes outside of the class. For me it also fails from a narrative perspective in that it doesn’t succinctly reveal the intent of this part of the application.

If we were testing this and attempting to use stubs, we’d have to write something like the below. You can see how this is at best cumbersome, but also fragile.

where = stub(:where)
limit = stub(:limit)
order = stub(:order)

Post.stub(:where).with(author_id: author_id) { where }
where.stub(:limit).with(20) { limit }
limit.stub(:order).with("created_at DESC").and_yield(post1, post2, post3)

You may be forgiven for thinking you could chain the stubs like below, but the arguments are ignored and this just serves to highlight the breaking of the ‘Law of Demeter’.

Post.stub_chain(:where, :limit, :order).and_yield(post1, post2, post3)

I’d much rather see that as a message to the Post class.

def self.latest_for_author id
  where(author_id: id).limit(20).order("created_at DESC")
end

Post.latest_for_author(1)

If there were variations of the limit and perhaps offset, they can be passed as option parameters of as an options hash:

def self.latest_for_author id, limit = 20, offset = 0
  where(author: id).limit(limit).offset(offset).order("created_at DESC")
end

Post.latest_for_author(1)
Post.latest_for_author(1, 20, 0)

or

def self.latest_for_author id, options
  limit = options[:limit] || 20
  offset = options[:offset] || 0
  where(author: id).limit(limit).offset(offset).order("created_at DESC")
end

Post.latest_for_author(1, offset: 20)

In order to get the dataset the call looks like the following, and I think is more informative than using the ActiveRecord DSL directly.

Post.latest_for_author(author_id).each { ... }

Testing is also easier, as it puts more emphasis on the messages being sent to objects rather than a chain of calls having to be correct.

Post.should_receive(:latest_for_author).with(1).and_yield(post1, post2, post3)

There are a few advantages to this refactor:

  • Only the Post class knows about the schema
  • Any changes to the implementation of what latest_for_author are encapsulated in one place
  • The method describes the intent more than the implementation
  • Stubbing in the tests are easier as there is one clear dependency
  • Testing the database is encouraged only in the class hitting the database

One further refactor could be done here, and that is to move the query logic out of the Post class once more, but this time into a purpose built query Object:

class LatestPosts
  attr_reader :author_id

  def initialize author_id
    @author_id = author_id
  end

  def find_each(&block)
    Post.where(author_id: author_id).limit(20).order("created_at DESC").find_each(&block)
  end

end

Where using the class looks like:

LatestPosts.new(author_id).find_each { ... }

Here’s what Bryan Helmkamp has to say on query objects in his excellent write up on fat ActiveRecord models. Bryan here rightfully points out that once in a single purpose object, they warrant little attention to unit testing. Now is the right time to use the database to ensure the right data set is being returned and that N+1 queries are not being performed. This means that database testing would only occur within the class actually hitting the database and not the rest of application which has a dependency on the database.

All of these techniques discussed serve to improve the design of an application by preventing leaking responsibilities from one class throughout the rest of the application. I’m also not saying that developers shouldn’t be using ActiveRecord or even Rails, but to use the tools responsibly.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Single resource REST Rails routes

Robbie Clutton
Tuesday, April 16, 2013

REST principles by default is a fantastic convention within Rails applications. The documentation for how to route HTTP requests are comprehensive and give examples about photo resources within an application. If you’ve got photo and tag as first class resources of your application, Rails has you covered. But what if you are building an application with a focus on one type of resource, do you really want /resource_type as a prefix to all of your application paths? I certainly don’t and I’ll show you how to remove that without diverging from Rails core strenghts.

For better or worse, I’m always conscience of making sure applications I’m involved in have Cool URIs and sometimes that does mean fighting the Rails conventions. However Rails routing is very flexible and can provide me with the application paths that make me happy.

Take Twitter as an example. Every user has their username as a top level path, so instead of having /users/robb1e, they simply have /robb1e. When dealing with an application where there is one core resource it can make a lot of sense to strip the resource prefix. This can be achieved through scopes in the routing configuration.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
  end
end

Gives you routes which look like

           GET  /:username(.:format)                users#show

If you wanted to see the followers and followees of that user, you have two options. Return to the default resource or use HTTP verb contraints. I’ll show you both.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
    resource :following, only: [:show]
    resource :followers, only: [:show]
  end
end

This adds the routes

following GET  /:username/following(.:format)      followings#show
followers GET  /:username/followers(.:format)      followers#show

Alternatively HTTP verb constrains can be used to achieve a similar result.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
    get '/following', to: 'user#following'
    get '/followers', to: 'user#followers'
  end
end

This gives the paths

          GET  /:username/following(.:format)      user#following
          GET  /:username/followers(.:format)      user#followers

If you are trending into paths unknown, you always have the safety of tests to help you out. Both Rails and RSpec have ways to test your application routes.

One gotcha which using the default resource routing removes is clashing paths. If you decide to build an admin page and want to put that at /admin, that needs to be in the routes config before the scoped block and if a user has given themselves the name of admin then you may be in for some fun.

So the next time a need arises for an unconventional route, check the documentation, it’s probably possible although almost always warants thinking about.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Failed attempt at trying to use refinements

Robbie Clutton
Monday, April 8, 2013

I was pretty interested in refinements in Ruby 2.0, and after listening to the latest Ruby Rouges podcast where some serious doubts were raised about the viability of refinements I thought I’d build a little example of how I was thinking I could use it.

I failed first time out and I tried copying and pasting examples without success. After some time poking around I found a blog post about why this wasn’t working. Ultimately, refinements are sort of left in the language, but not fully supported and are marked as experimental.

Here’s what I wanted to achieve. Coming from Scala in my previous job, I thought I could use refinements as a proxy for implicit conversions. Here I refine the Fixnum class to allow it to respond to a to_currency message. When called it converts the Fixnum instance to a Currency instance.

class Currency
  attr_reader :units

  def initialize units
    @units = units
  end
end

module CurrencyExtensions
  refine Fixnum do
    def to_currency
      Currency.new(self)
    end
  end
end

class App
  using CurrencyExtensions

  def initialize
    puts 3.to_currency
  end
end

App.new

Why is this interesting to me? Well, I think being able to write 3.to_currency can result in nicer to read code than the alternative Currency.new(3). Small difference perhaps.

There is a way to get this to work by having the using keyword in the global context, but it doesn’t deliver the full impact I was hoping for.

class Currency
  attr_reader :units

  def initialize units
    @units = units
  end
end

module CurrencyExtensions
  refine Fixnum do
    def to_currency
      Currency.new(self)
    end
  end
end

using CurrencyExtensions
puts 3.to_currency

I’ve got other ideas to build upon this if it ever makes it into the full specification. I’ll keep watching for now.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

NY Standup: Documentation interpolation and RVM, Brew and autolibs FTW

Robbie Clutton
Tuesday, April 2, 2013

Interestings

Grant Hutchins / David Lee

If you wrap the opening name of a Ruby heredoc in single quotes, its insides will act like a single-quoted string.

If you wrap the opening name in backticks, the heredoc will be immediately executed through system()

<<-HEREDOC
1 + 2 = #{1 + 2}
HEREDOC
=> "1 + 2 = 3\n"

<<-'HEREDOC'
1 + 2 = #{1 + 2}
HEREDOC

=> "1 + 2 = #{1 + 2}\n"

<<-HEREDOC
uname
date
HEREDOC

=> "Darwin\nMon Apr 1 17:50:43 EDT 2013\n"

Found at http://jeff.dallien.net/posts/optional-behavior-for-ruby-heredocs

rvm autolibs

https://rvm.io/rvm/autolibs/

In the latest versions of RVM, you can have RVM build dependencies automatically for you. For instance, if you use Homebrew, you can run

$ rvm autolibs brew

and from then on, RVM will automatically build any packages it needs via Homebrew.

As always, read "brew doctor" to get more info about how to set up your particular system.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Thoughts on Simple BDD

Robbie Clutton
Saturday, March 30, 2013

A small number of projects here in New York have adopted my extremely simple behaviour driven development library, SimpleBDD, and I thought I’d share some of the emerging patterns those teams have developed while using it.

SimpleBDD, is a way of using a Gherkin like specification language and turning into a method call with Ruby. It takes the approach of tools like Cucumber but reduces it down the smallest set of features. Essentially taking a method call:

Given "a logged in user"

and calling an underlaying function:

a_logged_in_user

If that method is in the scope of the executing test, that method is executed, if the method isn’t in scope or doesn’t exist the standard method not found exception is raised. It enables a developer to produce Gherkin like specifications while staying in Ruby, using the test framework of choice.

I generally try and follow the advise of my colleague, Matt Parker, with his excellent post on steps as teleportation devises. We try and create a reusable and stateless domain specific language (DSL) for our tests and our steps call into the DSL and hold state pertinent to that particular test run.

At first on my current project we have had three separate areas for our request specs. We have the request spec itself which used SimpleBDD to describe the behavior of the application. We then had a ‘steps’ file which had the methods calls from the SimpleBDD and translated those into the reusable DSL for the application. The steps file was reused across all request specs and was becoming big pretty quickly.

Dirk, on another project which is also using SimpleBDD, skipped the ‘steps’ file and placed those methods straight into the rspec feature files underneath the scenario blocks. Then after some discussions with JT on where to keep the state our tests depended on, Brent and our team started using rspecs ‘let’ methods and the ‘steps’ within the scope of the feature block to keep the intention of the test in one place.

By also putting more responsibility onto the DSL, these methods are pretty dumb, leaving the test describing what the test is attempting to achieve through SimpleBDD method calls and the how through calling the DSL via the ‘step’ methods within the feature block in the same file.


require 'spec_helper'

feature 'homepage' do

  scenario 'happy path' do
    Given "an existing user with one widget"
    When "the user visits the homepage"
    Then "the user can see their widget"
  end

  let(:user) { FactoryGirl.create(:user) }
  let(:widget) { FactoryGirl.create(:widget) }

  def an_existing_user_with_one_widget
    user.widgets << widget # Using the application code to create initial state
    login user # Application DSL
  end

  def the_user_visits_the_homepage
    visit root_path # Capybara DSL
  end

  def the_user_can_see_their_widget
    can_see_widget widget # Application DSL
  end

end

This has been working out fairly well for us and if you're interested in a simple version for BDD for your project I hope you check this out.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Surprisingly Simple Epic Wins

Robbie Clutton
Saturday, March 16, 2013

A surprising amount of simple can get an application over a number of speed bumps. We’re going to look up and down the whole application stack and use stories to show what simple things people have done to build a sustainable system without re-architecting.

You’re gonna need a bigger boat

One of my favourite stories from a colleague was when they consulted for a previous company. The application was struggling to scale and the answer from the development team was vertical scaling. Buying bigger servers, or putting in more RAM would buy more time for the application to keep ticking over until next time. My diligent colleague when joining this team spent some time digging around trying to find bottlenecks in the application. They discovered that there was not a single index on the database and that pretty much every transaction was doing a full table scan to get the result. The previous answer to get more RAM was so that the entire database could fit within it, in an effort to increase the performance.

OK, so no indexes at all seems like an extreme case, but it can be easy to skip an index or over index. Small companies don’t tend to have database administrators and if a test isn’t failing and no-one is complaining in production it’s easy to see that this area can be skipped without realising. There are some tools out there like ‘Rails Best Practices’ which can help identify where indexes are missing. Some simple changes and checks and drastically improve performance and delay that re-architecting we’re all afraid (or excited depending who you are) to do.

Instrument, Instrument, Instrument

A different colleague started their previous job just as a major rewrite was nearing its completion. There was some stress as a move from ColdFusion to Ruby was not paying off the dividends the team had sold to the product owners; the application performance wasn’t good enough to go live with. Tests were green and no bugs were reported so no light was shed on the troubled areas of the application. By adding instrumentation using a tool such as NewRelic, slow processes and queries were found and refactored. Over the course of a week, working on and off the problems the performance was brought up to an acceptable level where the application could go live.

In a way, this was not a terrible position to be in. Performance is one of those things that it isn’t a problem until it is, and just before pushing a release out of the door seems like an ideal time to do some performance testing. The tool NewRelic itself can be used locally for this and can run as a hosted service against real production requests. On teams I’ve been involved in, I like to go through the slowest requests on and identify and fix any problem areas as a Friday afternoon chore. Instrumentation doesn’t have to come through a tool like NewRelic, it can be looking at logs of web request times and slow database queries, but taking some time to fix these can make some significant improvements.

Caching in

There are a number of caching techniques I’ve heard about and seen. Some have been effective, others have created more problems they had set out to resolve. First a cautionary tale.

Like scaling a database by putting the entire database into memory, caching can obscure underlying issues. On a recent project caching mechanisms were scattered through the code, where to cache, where to invalidate. This meant that when caching something, either through the application or even changing code, we couldn’t be sure if the caches would be affected.  In one instance, our production environment was showing some strange behaviour that we could not replicate. After digging around, we found that the caches were being invalidated by the last update of the object being cached, but we had changed the template within the cache, leaving some objects being presented with the old template, and others with the newer one.

Phil Karlton’s quote “there are only two hard things in Computer Science: cache invalidation and naming things” springs to mind.  The lesson here is caching can increase performance significantly but can hide issues. By caching the result of slow running code, are you hiding code that could be improved?

Rails 4 has tried to solve some of these issues by suggesting that applications generally only cache the results of rendered views. It also takes away the cache invalidation part by using the objects name, last updated time and the MD5 of the template being rendered as part of the keys. Using a caching system which automatically drops the least used cached entries should be sufficient to deal with this.

Caching out

Sometimes the responsibility of caching can be handed to another part of an applications’ infrastructure. This is exactly what we had done on some projects when I was working on the Guardian. The applications we were writing depended heavily on external services for data and these services, being good citizens of the web, had returned appropriate cache headers in the HTTP responses. For this given application, we didn’t want to model the data coming back, we merely wanted to transform the response and place in a HTML template. Using a HTTP caching proxy like Squid installed on the same server as the application making outbound calls meant we could rely on Squid to do the caching. There was the HTTP request out, but as this never left the server, it was a small hit.

201 Created

Donald Knuth said that “premature optimization is the root of all evil” but there are a series of small optimisations that can be worthwhile. When making a request to a web server or external service, an application is either changing or reading state, sometimes both at the same time. When reading back state at the same time in the same request as changing it, if an application performs only the work necessary for the response and puts the rest of the work in a background job of some implementation, the application can respond more quickly. RFC 2616 HTTP response codes 201 and 202 were made for this sort of operation.

The response code 201 is useful for letting the client know a resource has been created. In one of my first projects in telecommunications, we would send a 201 to indicate a phone call had been started. The client would request a call be made between two phone numbers, but we didn’t want the request to be tied up during the actual phone call and maybe returning only when the call was finished.  A 201 with a location header for the client to get the status of a call was an idea choice. A resource (the call) had been created and had an address which the client could use.

A more web based example could be signing users up to a new website. If there are welcome emails to be sent and mailing lists to be joined it’s not in the applications interest to make the user wait while SMTP gateways respond and third party services give their OK to a request. If this is spun out into a thread or background job the application can return to the user and allow them to carry on. The less essential processes will happen, but the delay that occurs is acceptable and the user gets a more performant experience.

Perceived performance

Steve Saunders and the Performance Group at Yahoo have done great work with tools like YSlow and highlighting the issues with perceived performance. When discussing this issues, Saunders said “Optimize front-end performance first, that’s where 80% or more of the end-user response time is spent”. So much of the advice YSlow suggests is so simple to implement that I recommend just setting it up at the beginning of the project and checking the advice every now and then.  Some of the suggestions are one off and can be done at the start of a project, others require some ongoing checking, but these techniques will give a faster experience for the end users.

An interesting example is where this has been taken to an interesting level has been the recent rewrite of the Guardian mobile website.  The application focuses on the most important aspect: the content. Javascript has been stripped back to only what is required to load the content, there’s no jQuery in sight.

The main Guardian website with an empty cache downloaded 1.06mb with 212 requests. It took 2.5 seconds for the DOM to download it’s content and 4.3 seconds before ‘onload’ was fired.  For the mobile version of the website, it downloaded 25k of data with 77 requests with the DOM content load event happening after 260ms and ‘onload’ being fired after 950ms.  That’s a pretty big difference and sometimes that effort is warranted.

Another technique used by the new Guardian mobile website is conditional loading of content. Say on a sports team page, after the main content has been downloaded and the reader is celebrating or sobbing depending on their team news, asynchronous calls are made for extra content. In this case, fixture/schedule information, results and related content. This information isn’t required for the reader to achieve the main purpose of their visit to the page, but it might help keep them there. Using conditional loading, the page loads quickly, the reader can start reading the article without having to wait longer for a bigger DOM or synchronous loading of the extra content.

Real-time vs Near-time

One important I’ve personally learnt is when asked to build something in ‘real-time’, it to ask the product owner to define real-time. Real-time can mean something very different to a developer than to a product owner or user. For the longest time I equated real-time with ‘immediately’. When I discovered the actual requirement was ‘within a reasonable time but not necessarily immediately’, this changed many assumptions I’d made about the application.

A company that perform complex calculations to give a user a view to potential savings when switching to a different service provider, the product team had requested the price update in real-time. The developers knew that crunching the numbers can actually take a non-trivial amount of time, so they put the processing in a background job and used the HTTP meta-refresh element to refresh the page and ultimately hold the user in place, followed by a redirect to a page with the crunched numbers.

This may sound crude, but when considered against the time to build a better, more ‘real-time’ experience, it was an easy choice. Consider too that this pattern is often employed during a checkout experience, especially when booking something like a flight. After your credit card information is taken, you’re taking to a holding page until the payment is confirmed and a seat reserved.

Meanwhile, in another part of town, a team was content with itself after building a ‘real-time’ application that displayed government bond finances and gave 10 second updates. They relied on a message queue publish-subscribe architecture to get the very latest information to the users. When they went out to watch the application in the wild, they found that those people watching these bonds would take a look every few minutes in-between doing other tasks. It turns out that working on bonds is not the same as working on the stock exchange. The application had been over-engineered for a problem that didn’t exist. After realising this, they could correct their course and remove complexities from the application while remaining ‘near-time’ for their users.

Take a breath

We’ve looked at some simple changes that can bring big benefits and how user perception of performance is probably more important that server side performance. We’ve also seen how asking the right questions can lead to simplicity in itself.

Step two to a successful business: know the product, embrace the tools that show weaknesses and learn where best to invest your time.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Always be validating

Robbie Clutton
Monday, February 18, 2013

Big companies with well established products and business practices often differ from smaller, younger, start-up companies with the cash flow available to prove business ideas. The startup companies will often have a “runway” or “burn-rate”, meaning they have a limited amount of cash to keep the lights on. Big companies have budgets, committees and risk departments. Those companies may take a risk on a new product, but if it fails the lights will stay on and people will more than likely still have a job. Established companies will also invest in existing products to strengthen their market position which may result in growth targets rather than a focusing on validating those targets.

Too big to fail

New products should always look for market validation regardless of the size of the company. Just look at how long the HP Touchpad (48 days) and the Microsoft Kin phone (49 days) lasted in the marketplace before being withdrawn. It’s reported that HP would go onto lose up to $300m after cutting the Touchpad and associated webOS product line along with job cuts. So much for market validation and their risk department. Another huge loss was RIMs Playbook, reported to have lost the company $1.5bn in a period when the company really needed a boost in the marketplace amid falling Blackberry sales.

Not since the dot com boom have startup companies had anywhere near that sort of money to take a product to market and while these examples are hardware based incurring the associated research and development costs there is a lesson about validation in there regardless of the size of the company. Software companies seem to fail in three categories: massive dot com blowouts (e.g. WebVan and pets.com); being squashed by a competitor (Friendster to MySpace to Facebook); and the ones you’ve never heard of as they failed before they made it.

That’s not to say that the companies that are using metrics and data to help make their decisions don’t push the process too far. The example of testing 40 shades of blue at Google is probably the best known. However, a little information can go a long way.

Agile and counting

How should a company with a limited cash flow proceed? That company needs to know that every feature built is going towards a larger goal, is required and ultimated validated by the users. How does a company know when a feature has been validated?

Each new feature should have some validation criteria, e.g. “Removing password confirmation should result in decreased number of users getting stuck and leaving at the registration stage”. This then needs to be measured, which means we need to have that measurement before the feature is written ideally.

There are a number of tools available to start gather metrics, Google Analytics is good for general visitor information but tools like KissMetrics have gathered support for more of data driven demands. There are also open source alternatives such as statsd and Cube if the project has the time to invest in their own tools. If the hypotheses are more visually driven, tools such as CrazyEgg have proved useful for validating calls to action and discoverability within an application.

As an example, something as simple as CrazyEgg can show where users of an application get confused. On a recent project we deployed CrazyEgg on the homepage and almost immediately we could see that users were clicking on headings like they were links. You could almost sense their frustration through the heatmap when a click didn’t change the page.

On another project we could see we were dropping a high number of users through a certain part of the application funnel. At this point we did some user testing and tried to focus on this area where users were dropping off. We found that the application, which had a “real-time” price for the service the user was getting a quote built in some assumptions for earlier selected defaults but without telling the user that had happened the users were getting put off by the high quote. We could hypothesise on what might increase the throughput of the funnel, make those changes and see which one worked best.

Tools and talent

There are some tools to help follow a feature through to validation. Tools such as Lean Launch Lab or more general tools such as Trello can help monitor the success of features and their validation.

So you have the tools, you have the talent. Now what? You need to know where the application currently stands with the metrics which drive the business. When a new feature is written it should include a hypothesis about what metrics will change when the metrics goes live. The feature gets built and goes live. How do you follow what happens next? Most tools and processes stop here, features have been accepted and is live, what more do we need? Those features should be monitored while the product team measures the effectiveness of the change. Did that feature make the hypothesised impact? If not, should the feature be re-evaluated or thrown out?

Step one to a successful business: know the product, understand the users and always be validating.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

Testing strategies with RSpec, NullDB and Nosql

Robbie Clutton
Sunday, February 3, 2013

Recently I had posted about a few testing strategies that can be applied with RSpec. One of the patterns I mentioned was using something like NullDB to ensure your unit tests were not hitting the database. I had a few conversations about what I’d written, notably from my colleague Ian Lesperance. We discussed, and I conceded, that it’s preferable to have tests related to one class in one spec file. In particular I had split out the tests for the unit level and the integration with the database tests. So, here are my experiments on how I brought those tests back while keeping the same integrity of using a database for some test and forcing the null object pattern on other tests.

I had some issues having those tests in the same file, but with a little help from another colleague, JT Archie, we managed to figure it out.

Consider this rspec test:

  describe Widget do
    describe "#higest_selling", :db do
      it "uses the 'highest_selling' scope" do
        ...
      end
    end

    describe "#display_name" do
      it 'concats the widget name and manufacturer' do
        ...
      end
    end
  end

The ‘higest_selling’ method is a scope and has the ‘:db’ tag associated to the block, while the ‘display_name’ test has no tags applied. I wanted this to be the case, no tags means no database but if you want to hit the database, you need to explicitly call it out.

One trick you might have missed above was no longer needing to do ‘db: true’ in the RSpec tag. With the following setting in the spec helper, you can apply a symbol directly like ‘:db’.

config.treat_symbols_as_metadata_keys_with_true_values = true

Testing with NullDB

To get this working, I had to use the HEAD revision of NullDB:

gem 'activerecord-nulldb-adapter', git: 'git://github.com/nulldb/nulldb.git'

Using NullDB within the same file, we can use the ‘nullify’ and ‘restore’ helpers, but I found it worked best using the ‘around’ configuration. Using ‘before’ and ‘after’ I was having issues with changing the connection adapter during a transaction. This way, it appears to get around that issue.

We run the configuration block around each test that has the ‘type: :model’ tag. RSpec-Rails applies these automatically to any tests in the ‘spec/models’ directory. We look to see if the example has the ‘:db’ tag and if it does, we restore the default connection adapter, and run the example. If the example does not have the ‘:db’ tag applied, we apply the NullDB adapter, run the example and then restore the default adapter.

Within the ‘spec_helper.rb’ file:

  config.around(:each, type: :model) do |example|
    if example.metadata[:db]
      NullDB.restore
      example.run
    else
      NullDB.nullify
      example.run
      NullDB.restore
    end
  end

Testing using stubs

There are other options and with a sizable amount of help from JT, we created a simple way to achieve a similar outcome. Under ActiveRecord there are two methods which actually hit the database, ‘exec’ and ‘exec_query’. These methods can be stubbed out much like any method on any object in an application codebase.

In the ‘spec_helper’ file, we replace the NullDB configuration with the following. We again check for the ‘db’ tag and if it’s not there we stub ‘exec’ and ‘exec_query’.

  config.around(:each, type: :model) do |example|
    unless example.metadata[:db]
      ActiveRecord::Base.connection.stub(:exec).
        and_raise("You're not allowed to do that")
      ActiveRecord::Base.connection.stub(:exec_query).
        and_raise("You're not allowed to do that")
    end
  end

Testing using Nosql

We took this concept one step further and created a Gem that wasn’t RSpec specific. We couldn’t believe our luck when RubyGems showed there was no Gem called ‘nosql’, so with that problem solved we created the Nosql gem. When included in a test suite, any call to the database will raise an exception.

With the around configuration block Nosql is disabled and enabled accordingly.

  config.around(:each, type: :model) do |example|
    if example.metadata[:db]
      Nosql::Connection.disable!
      example.run
    else
      Nosql::Connection.enable!
      example.run
      Nosql::Connection.disable!
    end
  end

All three of these options force unit tests to not hit the database. Database calls will either be ignored (NullDB), or will raise an error (Nosql). This should result in decreased execution time for tests as it will encourage the developer to stub out those interactions with the database.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter
Robbie Clutton

Robbie Clutton
New York

Subscribe to Robbie's Feed

Author Topics

api (1)
architecture (1)
ironblogger (10)
object-design (2)
rails (4)
rest (2)
elementaljs (2)
opensource (1)
simplebdd (2)
activerecord (2)
routing (1)
refinements (1)
ruby (5)
scala (2)
autolibs (1)
brew (1)
documentation (1)
rvm (1)
bdd (3)
rspec (4)
testing (8)
lean (2)
startup architecture (2)
sustainable archtiecture (1)
yagni (1)
metrics (1)
validation (1)
nosql (1)
nulldb (1)
javascript (3)
ci (2)
jasmine (1)
build (1)
agile (1)
review (1)
web (1)
  • About
  • Case Studies
  • Team
  • Community
  • Careers
  • Contact
  • Labs
  • Events

Contact Us

contact@pivotallabs.com
+1 415-77-PIVOT
TwitterLinkedInFacebook

Pivotal Tracker

Tracker is the award-winning agile project management tool that enables real-time collaboration around a shared, prioritized backlog.
Visit pivotaltracker.com >