Pivotal Labs

To build a bookmarklet

Robbie Clutton
Monday, June 17, 2013

Building a bookmarklet provides an interesting challenge. It involves interacting with a website your application does not control, and that site could be anything, with any number of dependencies on CSS or Javascript libraries. The first choice to make is between working within that website, which probably means setting !important on every CSS rule used and hoping there are no namespace or versioning clashes with any Javascript included, and using an iframe.

Iframes seem to have fallen out of favour in recent times, but the sandboxed nature of the content inside an iframe means the worries of CSS and Javascript clashes are gone. However, they are replaced by the overhead of communicating between the host document and the iframe. I wanted to touch on some of the things our team did on a recent project to try to make this fairly seamless.

Getting into the DOM

A bookmarklet is a small piece of Javascript that a user can drag onto the bookmark bar; when the user clicks the link, the Javascript runs. Because there is no guarantee which site will be loaded, or which Javascript libraries it will include, it’s best to use plain old Javascript to create an iframe element and append it to the body of the document. Our bookmarklet also needed some Javascript in the host, if for nothing else than to be able to dismiss and remove the newly created elements. This can be done by creating a script element and appending it to the body, just like the iframe itself.

Once the iframe and script tags are appended they are treated the same as any other element. The content is loaded and the script is executed. The next step is getting the window to talk to the iframe. As a convenience the domain with protocol and port of the iframe is stored in a variable for later use.

var element = document.createElement('iframe');
element.id = 'example_iframe';
// an absolute URL is needed, otherwise the path is resolved against the host page
element.src = 'https://example.com/?referrer=' + encodeURIComponent(window.location.href);
document.body.appendChild(element);

var script = document.createElement('script');
script.src = 'https://example.com/bookmark.js';
document.body.appendChild(script);

The location is also sent to the remote server to load the iframe so that it can also store that location for passing messages to the host.

At this point this Javascript could also append a script tag pointing to a version of any library that its own application requires. It could also test for the existence of that library beforehand so it doesn’t bring in an incompatible version. For example, bringing in jQuery:

if (typeof window.jQuery === 'undefined') {
    var jq = document.createElement('script');
    jq.src = "//ajax.googleapis.com/ajax/libs/jquery/1.10.1/jquery.min.js";
    document.body.appendChild(jq);
}

Sending a message

The postMessage method is available to communicate between the window and the iframe. The host window, with a reference to the iframe, can call postMessage on the iframe's contentWindow with a string as the message and, as a security measure, the target origin. We had stored this during the loading of the elements as described above.

iframe.contentWindow.postMessage('hello', 'https://www.example.com');

That message won’t get anywhere unless the iframe is listening for the message event on the other end.

// native Javascript
window.addEventListener('message', function(event){ … });

// jQuery
$(window).on('message', function(event){ … });

I’ve shown the native Javascript first, but really, once in the iframe itself the application has full control and could use jQuery or any other library at this point. Our application needs to listen to messages on both sides though, so we needed the native version to run in the host anyway.

RPC

Sending a message is all well and good, but any non-trivial application is going to have more than one function to run. We took influence from Remote Procedure Calls (RPC) to call functions within the host and remote sites. The message sent was stringified JSON with a very light ‘schema’ naming the function to run and the parameters to send to that function.

{   
    f: 'theFunction',
    params: {
        ...
    }
}

The recipient could then parse the string it knew to be JSON, extract the function to call and call it with any optional parameters also sent. This does create a binding between the host and the iframe, but as the application controlled both sides we deemed it an acceptable risk. The function can be run from the window like so:

window['theFunction']();

// or from the event listener
var fn = JSON.parse(event.data)['f'];
window[fn]();

Dealing with namespaces

It was mentioned earlier that one of the goals of using an iframe was not to clobber any Javascript namespaces. We did end up including Javascript in the host, though, and to avoid clobbering anything there we used an application namespace. However, calling a namespaced function as a single property on the window no longer works.

// doesn't work
window['my.app.function'](); 

// works
window['my']['app']['function']();

We looked to ElementalJS as an example of dealing with namespaced functions to parse the function.

window.addEventListener('message', function(event){
    var fn = window;
    var data = JSON.parse(event.data);
    var namespaced = data['f'].split('.');
    // walk down the namespace one segment at a time
    for (var i = 0; i < namespaced.length; i++) {
        fn = fn[namespaced[i]];
    }
    fn(data.params);
});

This is naturally fairly crude. Some defensive code could be added, such as making sure that ‘f’ exists in the data and that the function exists, because the window could receive a message from another iframe. It does, however, demonstrate the intent of the processing.

Putting it all together

The bookmarklet inserts two elements, an iframe and a script. The iframe has the source example.com?referrer=bar.com, and bar.com is inserted as a variable in the iframe's Javascript code. The iframe inserted has an id of example_iframe.

The host Javascript listens to the message event

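window.addEventListener('message', function(event){
    var fn = window;
    var data;
    try {
        data = JSON.parse(event.data);
    } catch (e) {
        return; // not one of our messages
    }
    var namespaced = ((data && data['f']) || "").split('.');
    // stop walking as soon as a segment of the namespace is missing
    for (var i = 0; i < namespaced.length && fn; i++) {
        fn = fn[namespaced[i]];
    }
    if (typeof fn === 'function') {
        fn(data.params);
    }
});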

The host also sets up a namespace and a function to be called.

window.example = window.example || {};

example.hello = function(){ … }   

When the iframe has loaded, it sends a ready message to the host

$(document).ready(function(){
    var data = JSON.stringify({f: 'example.hello'});
    parent.postMessage(data, referer); // referer set by server from the request param for the iframe
});

The host, from the script loaded by the bookmarklet, has the example.hello function and it’s run. This in turn replies to the iframe.

example.hello = function(){
    var iframe = document.getElementById('example_iframe');
    var data = JSON.stringify({f: 'example.world'});
    // the target origin must include the protocol
    iframe.contentWindow.postMessage(data, 'https://example.com');
};

The iframe has an event listener which is the same code as the host, and runs the function example.world

example.world = function(){
    // hello, world
};

Wrapping up

This has shown some of the techniques for a ‘hello, world’ bookmarklet with two-way communication between host and iframe that uses Javascript namespaces. This was enough to get our application off the ground, as the two-way communication acted as a solid base to build upon.


API Versioning

Robbie Clutton
Friday, May 31, 2013

How to version an API has been a thoroughly discussed topic in the last several years regardless of protocol or approach, be that SOAP, REST or Hypermedia. Why contribute another post to the topic? My last post on avoiding breaking changes through better design led to conversations in the office and online about versioning.

@moonmaster9000 @robb1e @jaderubick you should version using a vendor string in the Accept header.

— David Celis (@davidcelis) May 16, 2013

I wanted to dig a little deeper into versioning and look at some of the different ways people are using versioning for their resources on the web.

Compatibility as version control

Mark Nottingham has written a lot on the subject of API versioning and it’s difficult to disagree with the high level arguments outlined in his API evolution post.

  • keep compatible changes out of names
  • avoid new major versions
  • make changes backwards-compatible
  • think about forwards-compatibility

In another post Nottingham adds that we shouldn’t use a custom header for versioning and touches on versioning through the Accept header but there is a fundamental thread here: try as hard as possible to not introduce breaking changes so that versioning isn’t a big issue.

One of the issues to contend with when considering versioning is that building APIs often requires more thought up front than an agile developer might be used to. URIs, data structure, meta-data and extensibility are important and are best considered up front. Once those decisions have been made, changes to the structure often result in breaking older versions of the API. Anyone at an early stage of building an API would do well to put some thought into the API design and to set some rules for future consistency.

Accept header

The Twitter comment above pointed me towards using the Accept header as part of content negotiation and I found a number of blog posts covering the subject as well as some companies using this approach.

RFC4288 section 3.2 outlines how a vendor, i.e. an application, can make use of customisable MIME types in the Accept header. Steve Klabnik shows how an application can make use of this to include a version number as part of the Accept header used for content negotiation.

Looking at some concrete examples through the well documented Github API shows how the Accept header can be used with their services:

Accept: application/vnd.github[.version].param[+json]

curl -v -H 'Accept:application/vnd.example.v1+json' localhost:3000

The vnd part is the vendor definition as outlined in RFC4288. Let’s take a look at a number of options to make use of this header.

Parsing the Accept header

Within an application, headers can be inspected from a HTTP request. Extracting the version number would require some regular expression matching, something along the lines of the following in an application_controller perhaps:

def api_version
  # to_s guards against a missing Accept header; \d+ allows multi-digit versions
  request.headers["Accept"].to_s[/^application\/vnd\.github\.v(\d+)/, 1].to_i
end
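
Outside of Rails, the same extraction can be sketched in plain Ruby. The vendor name example, the constant VENDOR_VERSION and the helper version_from_accept are assumptions made for illustration. Real Accept headers can carry several media types with quality values, which this sketch only splits naively on commas.

```ruby
# Hypothetical helper: pull an integer version out of a vendor media type
# such as "application/vnd.example.v2+json". Returns nil when no version
# is present so the caller can pick a default.
VENDOR_VERSION = %r{\Aapplication/vnd\.example\.v(\d+)(?:\+\w+)?\z}

def version_from_accept(accept_header)
  accept_header.to_s.split(",").each do |media_range|
    # drop parameters such as ";q=0.9" before matching
    media_type = media_range.split(";").first.to_s.strip
    match = VENDOR_VERSION.match(media_type)
    return match[1].to_i if match
  end
  nil
end

version_from_accept("application/vnd.example.v1+json")                   # => 1
version_from_accept("text/html, application/vnd.example.v12+json;q=0.9") # => 12
version_from_accept("application/json")                                  # => nil
```

Returning nil rather than 0 keeps "no version requested" distinct from a real version number.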

Once the application has this version number it can decide how to behave for the response.

Registering the MIME type

An application could alternatively register a MIME type for each version and use the respond_to blocks to decide how to render a response.

To register a MIME type in config/initializers/mime_types.rb

Mime::Type.register "application/vnd.github.com.v1+json", :json_v1
Mime::Type.register "application/vnd.github.com.v2+json", :json_v2

MIME types also allow parameters, but each combination would also require its own registration:

Mime::Type.register "application/vnd.github.com+json; version=1", :json_v1
Mime::Type.register "application/vnd.github.com+json; version=2", :json_v2

So a controller could look something like the following and deal with the request appropriately.

posts = Api::Posts.all

respond_to do |format|
    format.json_v1 { render json: posts.v1.as_json }
    format.json_v2 { render json: posts.v2.as_json }
end

Version via a request parameter

The thing with the version in an Accept header is that the URI is difficult to share. If I wanted to share the URI with version information with a colleague I would have to send instructions on what curl arguments to send to the server to get the right response. It can be as frustrating as trying to share a holiday on a website that renders pages based on what’s in the session for the individual. I would want to share a URI that someone can paste into a web browser address bar and see an appropriate response. Let’s compare the two approaches:

curl -v -H 'Accept:application/vnd.example.v1+json' localhost:3000

vs

curl -v localhost:3000/?version=v1

or

curl -v localhost:3000/?version=20130603

Version number as a date

The Foursquare API allows clients to send a version as a date in the format yyyymmdd, which conveniently is an always increasing number. When a client starts to use the API they can use that day’s date and the response will always be in that format. I like this concept as it removes the burden of having to know what endpoint to use or what version to send in a header. The client uses a known point in time and the response will always match if that is sent. If nothing is sent then the latest version of the resource is returned.

With this in mind, I wrote a very small gem to put this into practice to demonstrate how I thought this could be achieved which boils down to the following. Given a number (i.e. the version sent in the request) and a list of numbers (i.e. known versions with some change in an application), find the previous closest number in the list from the version number given. For example, if the date 20130101 is passed and the list contains two dates of 20130101 and 20130601, then 20130101 is returned.

def find_version_for version, list
    return list.last if version.nil?
    list.select { |i| i <= version }.last || list.first
end
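
To make the edge cases concrete, here is the same lookup restated so that it runs on its own, followed by a few calls. The KNOWN_VERSIONS list is a made-up example, and the method assumes the list is sorted in ascending order.

```ruby
# Same lookup as above, restated so the example is self-contained.
# Assumes `list` is sorted in ascending order.
def find_version_for(version, list)
  return list.last if version.nil?
  list.select { |i| i <= version }.last || list.first
end

KNOWN_VERSIONS = [20130101, 20130601] # hypothetical dates of API changes

find_version_for(20130101, KNOWN_VERSIONS) # => 20130101 (exact match)
find_version_for(20130315, KNOWN_VERSIONS) # => 20130101 (closest earlier version)
find_version_for(20131225, KNOWN_VERSIONS) # => 20130601 (newer than every version)
find_version_for(20121231, KNOWN_VERSIONS) # => 20130101 (older than every version)
find_version_for(nil, KNOWN_VERSIONS)      # => 20130601 (nothing sent, latest wins)
```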

Using the returned number, the server can decide how to render the response to the client as defined in my previous post.

Robustness principle

Jon Postel says "Be conservative in what you do, be liberal in what you accept from others", so perhaps we could allow our clients to do both?

Perhaps using the Accept header makes you a better denizen of the Internet, adhering to HATEOAS principles, but I think using a version request parameter makes for a better Web experience. As a developer, I want to be able to put a URI into a browser and see a response rendered, and for that I'd lean more towards the version parameter.

This approach can be generalised across other information sent from a client to a server. Clients like web browsers send information in every request, and these values should be honored as the defaults. The information sent includes what language, data and encoding formats the client would like. Using request parameters can offer overrides and support compatibility between different types of clients who want to use an API and cool bookmarkable URIs.
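
One way to accept both, sketched in plain Ruby: treat the Accept header as the default and let a request parameter override it. The method name requested_version, the vendor name example and the plain hashes standing in for the request are all assumptions for illustration.

```ruby
# Hypothetical resolver: an explicit ?version= parameter overrides the
# version carried in the Accept header; nil means "serve the latest".
def requested_version(params, headers)
  # accept both "v1" and "20130603" style values
  from_param = params["version"].to_s[/\Av?(\d+)\z/, 1]
  return from_param.to_i if from_param

  from_header = headers["Accept"].to_s[/\Aapplication\/vnd\.example\.v(\d+)/, 1]
  from_header && from_header.to_i
end

accept = { "Accept" => "application/vnd.example.v1+json" }

requested_version({ "version" => "20130603" }, accept) # => 20130603 (parameter wins)
requested_version({}, accept)                          # => 1 (header is the default)
requested_version({}, {})                              # => nil (latest version)
```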


Stop leaky APIs

Robbie Clutton
Wednesday, May 15, 2013

There are many blogs about how to expose an API for a Rails application and many times I look at this and am concerned about how these examples often leak the application design and the schema out through the API. When this leak occurs a change to the application internals can ripple out and break clients of an API, or force applications to namespace URI paths which I feel is unnecessary and ugly.

When the only consumers of application data models are the views within the same application, the object design can be fluid and malleable. Once an application exposes an API to more than one client, and especially if that client is on a different release cycle to the server, such as an iPhone application, data models become rigid. Rails tends to discourage N-tier architecture to the benefit of development speed, but APIs are contracts between a server and its clients and can be difficult to change once they start being used.

Passing an object into the Rails JSON serialisation methods will work for a time, but relying on this will only get you so far. At some point a refactor will take place that will cause a breaking change. It could be something simple such as renaming a column, moving responsibilities from one class to another or adding extra meta-data to a response. Either way, adding this information into your model class starts to place more responsibilities into one place.

There are a few ways out of this potential issue. Let’s take a look at the classic blog application and its Post object. The Rails rendering engine will call as_json on an object if the request has asked the server for the content type application/json. Here we override the implementation from ActiveRecord to provide a stable, known version:

def as_json(options={})
    {
        author_id: author.id,
        title: title
    }
end

A second option is to model the object explicitly and serialise the internal model into a public representation. We can duck-type the object to respond how ActiveRecord objects behave during a serialisation call. Although this can be seen as a step towards a N-tier architecture, it’s also a step towards service dependent abstraction:

class Api::Post
  attr_reader :post

  def initialize(post)
    @post = post
  end

  def as_json(options={})
    {
      author_id: post.author.id,
      title: post.title
    }
  end
end

The benefit of doing this is a separation of concerns between your data model and the data presentation. An application model doesn’t need to know how it’ll be represented by an API, command line interface or any other outside communication mechanism. If an application were tending more towards HATEOAS for instance this separation could help resolve hyperlinks relevant to the interface. You may lose some of the Rails respond_with goodness with this:

respond_to :html, :json

def show
  post = Post.find(params[:id])
  respond_to do |format|
    format.html { @post = post }
    format.json { render json: Api::Post.new(post) }
  end
end

That can be regained with the help of a presenter:

respond_to :html, :json

def show
  post = Post.find(params[:id])
  @presenter = PostPresenter.new(post)
  respond_with @presenter
end

Where PostPresenter may look something like:

class PostPresenter < SimpleDelegator
  def as_json(options={})
    Api::Post.new(self).as_json(options)
  end
end
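
The delegation can be tried outside of Rails. In this sketch the Struct classes AuthorRecord and PostRecord are stand-ins I have assumed in place of ActiveRecord models, and internal_notes is a made-up column that should never reach a client.

```ruby
require "delegate"
require "json"

# Stand-ins for the ActiveRecord models; internal names are free to change.
AuthorRecord = Struct.new(:id, :name)
PostRecord   = Struct.new(:author, :title, :internal_notes)

module Api; end

# Public representation: only the fields promised to API clients.
class Api::Post
  def initialize(post)
    @post = post
  end

  def as_json(options = {})
    { author_id: @post.author.id, title: @post.title }
  end
end

# Delegates everything to the wrapped post except serialisation.
class PostPresenter < SimpleDelegator
  def as_json(options = {})
    Api::Post.new(self).as_json(options)
  end
end

record    = PostRecord.new(AuthorRecord.new(7, "Robbie"), "Stop leaky APIs", "draft")
presenter = PostPresenter.new(record)

presenter.title                  # => "Stop leaky APIs" (delegated to the record)
JSON.generate(presenter.as_json) # => {"author_id":7,"title":"Stop leaky APIs"}
```

The views keep talking to the presenter as if it were the post, while the JSON shape is decided in one place.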

What’s the difference between this and putting the as_json method into Post directly? More control, separation of concerns with application modeling vs presentation and the big win is when breaking changes occur within the API. Now we can put version relevant information into new objects, or into the serialised class itself.

class Api::Post
  attr_reader :post, :version

  def initialize(post, version)
    @post = post
    @version = version
  end

  def as_json(options={})
    send("v#{version}")
  end

  private
  def v20130505
    # version specific JSON
  end

  def v20121206
    # version specific JSON
  end
end
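
A runnable sketch of the same idea, combined with a closest-earlier-date lookup so that a client sending any date gets the representation that was current on that day. The class name VersionedPost, the StoredPost struct and the two field layouts are assumptions for illustration only.

```ruby
# Stand-in for the model; the real application would use ActiveRecord.
StoredPost = Struct.new(:author_id, :author_name, :title)

class VersionedPost
  # Hypothetical dates on which the public representation changed, ascending.
  VERSIONS = [20121206, 20130505].freeze

  def initialize(post, requested = nil)
    @post = post
    @version = resolve(requested)
  end

  def as_json(options = {})
    send("v#{@version}")
  end

  private

  # Latest version when none is requested, otherwise the closest version at
  # or before the requested date (or the oldest if the date predates them all).
  def resolve(requested)
    return VERSIONS.last if requested.nil?
    VERSIONS.select { |v| v <= requested }.last || VERSIONS.first
  end

  def v20121206
    { author: @post.author_name, title: @post.title }
  end

  def v20130505
    { author_id: @post.author_id, title: @post.title }
  end
end

stored = StoredPost.new(7, "Robbie", "Stop leaky APIs")

VersionedPost.new(stored, 20130506).as_json # rendered by v20130505
VersionedPost.new(stored, 20130101).as_json # rendered by v20121206
VersionedPost.new(stored).as_json           # no version sent, rendered by v20130505
```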

Through this we have versioning information in one place and through a request parameter of something like v=20130506 the application can handle multiple versions in one object. For me, this ultimately removes URIs like /v1/posts, but why is that important? The URI is an identifier which points to a resource and having v1 or v2 in the URI muddies the fact that the two identifiers are pointing to the same resource. Using a request parameter, much like pagination is handled, means we can ask for a representation of that resource rather than having to specify different resources. Then we can do away with needing controllers such as Api::V1::PostsController and just deal with Api::PostsController or even just PostsController and deal with the versioning within the object instead of the URI path.


ElementalJS and SimpleBDD open source updates

Robbie Clutton
Tuesday, May 14, 2013

Thanks to the benefits of Open Source software and working with great people, I’m pleased to announce some updates to both ElementalJS and to SimpleBDD.

ElementalJS

Thanks to Ian Zabel who made a performance improvement to ElementalJS after fighting a large DOM in Internet Explorer. Elemental will now load the behaviours much quicker if the document is passed as no filtering will take place. If another node is passed, filtering will be applied but the thought is that DOM will be much smaller so hopefully won’t hit this issue.

SimpleBDD

Thanks to Adam Berlin who made two improvements to SimpleBDD. The first was the addition of Also to the syntax, and the second was that NoMethodError is replaced by pending when using RSpec.

More improvements were made by Daniel Finnie which allow use of some non alphanumeric characters that get turned into method names.


Stop leaking ActiveRecord throughout your application

Robbie Clutton
Monday, May 6, 2013

Extending ActiveRecord::Base leaks a powerful API throughout an application which can lead to tempting code which breaks good design. Take the classic blog example where you may want to retrieve the latest posts by a given author. You may have seen, or even written code that gets the dataset you need straight into the controller or view:

Post.where(author_id: author_id).limit(20).order("created_at DESC").each { ... }

For me this is a design violation as well as breaking the “Law of Demeter” [Edit: Current Pivot Adam Berlin and former Pivot John Barker pointed out that chaining with the same object was not a Demeter violation]. The example above exposes the structure of the schema, which the calling class has no business knowing. It also makes testing using stubs ugly and encourages testing against the database directly. A test would have to chain three methods to stub a return value. It’s brittle, as in it’s susceptible to breaking due to changes outside of the class. For me it also fails from a narrative perspective in that it doesn’t succinctly reveal the intent of this part of the application.

If we were testing this and attempting to use stubs, we’d have to write something like the below. You can see how this is at best cumbersome, but also fragile.

where = stub(:where)
limit = stub(:limit)
order = stub(:order)

Post.stub(:where).with(author_id: author_id) { where }
where.stub(:limit).with(20) { limit }
limit.stub(:order).with("created_at DESC").and_yield(post1, post2, post3)

You may be forgiven for thinking you could chain the stubs like below, but the arguments are ignored and this just serves to highlight the breaking of the ‘Law of Demeter’.

Post.stub_chain(:where, :limit, :order).and_yield(post1, post2, post3)

I’d much rather see that as a message to the Post class.

def self.latest_for_author id
  where(author_id: id).limit(20).order("created_at DESC")
end

Post.latest_for_author(1)

If there were variations of the limit and perhaps offset, they can be passed as optional parameters or as an options hash:

def self.latest_for_author id, limit = 20, offset = 0
  where(author_id: id).limit(limit).offset(offset).order("created_at DESC")
end

Post.latest_for_author(1)
Post.latest_for_author(1, 20, 0)

or

def self.latest_for_author id, options = {}
  limit = options[:limit] || 20
  offset = options[:offset] || 0
  where(author_id: id).limit(limit).offset(offset).order("created_at DESC")
end

Post.latest_for_author(1, offset: 20)

In order to get the dataset the call looks like the following, and I think is more informative than using the ActiveRecord DSL directly.

Post.latest_for_author(author_id).each { ... }

Testing is also easier, as it puts more emphasis on the messages being sent to objects rather than a chain of calls having to be correct.

Post.should_receive(:latest_for_author).with(1).and_yield(post1, post2, post3)
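
The same point can be made without a mocking library. In this sketch AuthorFeed is a hypothetical caller that depends on nothing but the latest_for_author message, so a hand-rolled fake (FakePostSource, also an assumption) can stand in for the Post class.

```ruby
# Hypothetical collaborator: depends only on the latest_for_author message.
class AuthorFeed
  def initialize(post_source)
    @post_source = post_source
  end

  def titles_for(author_id)
    @post_source.latest_for_author(author_id).map(&:title)
  end
end

FakePost = Struct.new(:title)

# Hand-rolled fake standing in for the Post class in a test.
class FakePostSource
  attr_reader :requested_author_id

  def initialize(posts)
    @posts = posts
  end

  def latest_for_author(author_id)
    @requested_author_id = author_id
    @posts
  end
end

source = FakePostSource.new([FakePost.new("First"), FakePost.new("Second")])
feed   = AuthorFeed.new(source)

feed.titles_for(1)         # => ["First", "Second"]
source.requested_author_id # => 1
```

There is one message to fake, not a chain of three, and no knowledge of columns or ordering leaks into the caller's test.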

There are a few advantages to this refactor:

  • Only the Post class knows about the schema
  • Any changes to the implementation of latest_for_author are encapsulated in one place
  • The method describes the intent more than the implementation
  • Stubbing in the tests is easier as there is one clear dependency
  • Testing the database is encouraged only in the class hitting the database

One further refactor could be done here, and that is to move the query logic out of the Post class once more, but this time into a purpose built query Object:

class LatestPosts
  attr_reader :author_id

  def initialize author_id
    @author_id = author_id
  end

  def find_each(&block)
    # each rather than the batch finder, which ignores the scoped order
    Post.where(author_id: author_id).order("created_at DESC").limit(20).each(&block)
  end

end

Where using the class looks like:

LatestPosts.new(author_id).find_each { ... }

Here’s what Bryan Helmkamp has to say on query objects in his excellent write up on fat ActiveRecord models. Bryan rightfully points out that once queries live in a single purpose object, they warrant little attention in unit testing. Now is the right time to use the database to ensure the right data set is being returned and that N+1 queries are not being performed. This means that database testing would only occur within the class actually hitting the database and not in the rest of the application which has a dependency on the database.

All of these techniques discussed serve to improve the design of an application by preventing leaking responsibilities from one class throughout the rest of the application. I’m also not saying that developers shouldn’t be using ActiveRecord or even Rails, but to use the tools responsibly.


Single resource REST Rails routes

Robbie Clutton
Tuesday, April 16, 2013

REST principles by default are a fantastic convention within Rails applications. The documentation for how to route HTTP requests is comprehensive and gives examples about photo resources within an application. If you’ve got photo and tag as first class resources of your application, Rails has you covered. But what if you are building an application with a focus on one type of resource, do you really want /resource_type as a prefix to all of your application paths? I certainly don’t and I’ll show you how to remove that without diverging from Rails’ core strengths.

For better or worse, I’m always conscious of making sure applications I’m involved in have Cool URIs and sometimes that does mean fighting the Rails conventions. However Rails routing is very flexible and can provide me with the application paths that make me happy.

Take Twitter as an example. Every user has their username as a top level path, so instead of having /users/robb1e, they simply have /robb1e. When dealing with an application where there is one core resource it can make a lot of sense to strip the resource prefix. This can be achieved through scopes in the routing configuration.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
  end
end

Gives you routes which look like

           GET  /:username(.:format)                users#show

If you wanted to see the followers and followees of that user, you have two options. Return to the default resource or use HTTP verb constraints. I’ll show you both.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
    resource :following, only: [:show]
    resource :followers, only: [:show]
  end
end

This adds the routes

following GET  /:username/following(.:format)      followings#show
followers GET  /:username/followers(.:format)      followers#show

Alternatively, HTTP verb constraints can be used to achieve a similar result.

Your::Application.routes.draw do
  scope ":username" do
    get '', to: 'users#show'
    get '/following', to: 'users#following'
    get '/followers', to: 'users#followers'
  end
end

This gives the paths

          GET  /:username/following(.:format)      users#following
          GET  /:username/followers(.:format)      users#followers

If you are heading into paths unknown, you always have the safety of tests to help you out. Both Rails and RSpec have ways to test your application routes.

One gotcha which using the default resource routing removes is clashing paths. If you decide to build an admin page and want to put that at /admin, that needs to be in the routes config before the scoped block and if a user has given themselves the name of admin then you may be in for some fun.
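
One way to take some of the fun out of that clash is to refuse reserved names at sign-up. This is only a sketch: the RESERVED_USERNAMES list and the username_available? helper are assumptions, and a real application would more likely express this as a model validation.

```ruby
# Hypothetical list of top-level paths the application wants to keep.
RESERVED_USERNAMES = %w[admin login logout settings].freeze

# True when the name is neither reserved nor already taken.
# Comparison is case-insensitive because /Admin and /admin should not differ.
def username_available?(username, taken = [])
  candidate = username.to_s.strip.downcase
  return false if candidate.empty?
  return false if RESERVED_USERNAMES.include?(candidate)
  !taken.map(&:downcase).include?(candidate)
end

username_available?("robb1e")             # => true
username_available?("Admin")              # => false (reserved)
username_available?("robb1e", ["Robb1e"]) # => false (already taken)
```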

So the next time a need arises for an unconventional route, check the documentation. It’s probably possible, although it almost always warrants thinking about.


Failed attempt at trying to use refinements

Robbie Clutton
Monday, April 8, 2013

I was pretty interested in refinements in Ruby 2.0, and after listening to the latest Ruby Rogues podcast, where some serious doubts were raised about the viability of refinements, I thought I’d build a little example of how I was thinking I could use it.

I failed first time out and I tried copying and pasting examples without success. After some time poking around I found a blog post about why this wasn’t working. Ultimately, refinements are sort of left in the language, but not fully supported and are marked as experimental.

Here’s what I wanted to achieve. Coming from Scala in my previous job, I thought I could use refinements as a proxy for implicit conversions. Here I refine the Fixnum class to allow it to respond to a to_currency message. When called it converts the Fixnum instance to a Currency instance.

class Currency
  attr_reader :units

  def initialize units
    @units = units
  end
end

module CurrencyExtensions
  refine Fixnum do
    def to_currency
      Currency.new(self)
    end
  end
end

class App
  using CurrencyExtensions

  def initialize
    puts 3.to_currency
  end
end

App.new

Why is this interesting to me? Well, I think being able to write 3.to_currency can result in nicer to read code than the alternative Currency.new(3). Small difference perhaps.

There is a way to get this to work by having the using keyword in the global context, but it doesn’t deliver the full impact I was hoping for.

class Currency
  attr_reader :units

  def initialize units
    @units = units
  end
end

module CurrencyExtensions
  refine Fixnum do
    def to_currency
      Currency.new(self)
    end
  end
end

using CurrencyExtensions
puts 3.to_currency

I’ve got other ideas to build upon this if it ever makes it into the full specification. I’ll keep watching for now.
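
For readers on later Ruby versions, here is a sketch of the first form as it behaves there: from Ruby 2.1 onwards using is allowed inside a class or module body and refinements are no longer marked experimental. Fixnum was later folded into Integer, so the sketch refines Integer. The price method is an assumption added to make the result observable.

```ruby
class Currency
  attr_reader :units

  def initialize(units)
    @units = units
  end
end

module CurrencyExtensions
  refine Integer do
    def to_currency
      Currency.new(self)
    end
  end
end

class App
  # Active only until the end of this class definition.
  using CurrencyExtensions

  def price
    3.to_currency
  end
end

App.new.price.units # => 3
```

Outside of App the refinement is not active, so calling to_currency on an integer elsewhere in the file still raises NoMethodError.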


NY Standup: Documentation interpolation and RVM, Brew and autolibs FTW

Robbie Clutton
Tuesday, April 2, 2013

Interestings

Grant Hutchins / David Lee

If you wrap the opening name of a Ruby heredoc in single quotes, its insides will act like a single-quoted string.

If you wrap the opening name in backticks, the heredoc will be immediately executed through system()

<<-HEREDOC
1 + 2 = #{1 + 2}
HEREDOC
=> "1 + 2 = 3\n"

<<-'HEREDOC'
1 + 2 = #{1 + 2}
HEREDOC

=> "1 + 2 = #{1 + 2}\n"

<<-`HEREDOC`
uname
date
HEREDOC

=> "Darwin\nMon Apr 1 17:50:43 EDT 2013\n"

Found at http://jeff.dallien.net/posts/optional-behavior-for-ruby-heredocs

rvm autolibs

https://rvm.io/rvm/autolibs/

In the latest versions of RVM, you can have RVM build dependencies automatically for you. For instance, if you use Homebrew, you can run

$ rvm autolibs brew

and from then on, RVM will automatically build any packages it needs via Homebrew.

As always, read "brew doctor" to get more info about how to set up your particular system.


Thoughts on Simple BDD

Robbie Clutton
Saturday, March 30, 2013

A small number of projects here in New York have adopted my extremely simple behaviour driven development library, SimpleBDD, and I thought I’d share some of the emerging patterns those teams have developed while using it.

SimpleBDD is a way of using a Gherkin-like specification language and turning it into a method call in Ruby. It takes the approach of tools like Cucumber but reduces it down to the smallest set of features. Essentially, it takes a method call:

Given "a logged in user"

and calling an underlying function:

a_logged_in_user

If that method is in the scope of the executing test, it is executed. If the method isn’t in scope or doesn’t exist, the standard method-not-found exception is raised. This enables a developer to produce Gherkin-like specifications while staying in Ruby, using the test framework of choice.
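As a rough illustration of the idea, and not SimpleBDD's actual source, the translation from a step string to a method call could be sketched like this. The names `TinySteps`, `step_method_name` and `ExampleScenario` are hypothetical.

```ruby
# A minimal sketch of the idea, not SimpleBDD's real implementation.
# TinySteps, step_method_name and ExampleScenario are hypothetical names.
module TinySteps
  %w[Given When Then].each do |keyword|
    define_method(keyword) do |description|
      # Calls the matching method in the current scope; Ruby raises
      # NoMethodError if no such method exists.
      send(step_method_name(description))
    end
  end

  private

  # "a logged in user" => :a_logged_in_user
  def step_method_name(description)
    description.downcase.strip.gsub(/[^a-z0-9]+/, "_").to_sym
  end
end

class ExampleScenario
  include TinySteps

  def a_logged_in_user
    :logged_in
  end
end
```

Calling `ExampleScenario.new.Given("a logged in user")` dispatches to `a_logged_in_user`, while a step with no matching method fails loudly instead of being silently skipped.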

I generally try to follow the advice of my colleague, Matt Parker, in his excellent post on steps as teleportation devices. We try to create a reusable and stateless domain specific language (DSL) for our tests, and our steps call into the DSL and hold state pertinent to that particular test run.

At first, my current project had three separate areas for our request specs. We had the request spec itself, which used SimpleBDD to describe the behavior of the application. We then had a ‘steps’ file, which held the methods called by SimpleBDD and translated them into the third area, the reusable DSL for the application. The steps file was reused across all request specs and was becoming big pretty quickly.

Dirk, on another project which is also using SimpleBDD, skipped the ‘steps’ file and placed those methods straight into the RSpec feature files underneath the scenario blocks. Then, after some discussions with JT on where to keep the state our tests depended on, Brent and our team started using RSpec’s ‘let’ methods and the ‘steps’ within the scope of the feature block to keep the intention of the test in one place.

By also putting more responsibility onto the DSL, these step methods stay pretty dumb. The test describes what it is attempting to achieve through SimpleBDD method calls, and how it achieves it through the ‘step’ methods, which call the DSL from within the feature block in the same file.


require 'spec_helper'

feature 'homepage' do

  scenario 'happy path' do
    Given "an existing user with one widget"
    When "the user visits the homepage"
    Then "the user can see their widget"
  end

  let(:user) { FactoryGirl.create(:user) }
  let(:widget) { FactoryGirl.create(:widget) }

  def an_existing_user_with_one_widget
    user.widgets << widget # Using the application code to create initial state
    login user # Application DSL
  end

  def the_user_visits_the_homepage
    visit root_path # Capybara DSL
  end

  def the_user_can_see_their_widget
    can_see_widget widget # Application DSL
  end

end

This has been working out fairly well for us, and if you're interested in a simple version of BDD for your project I hope you check it out.


Surprisingly Simple Epic Wins

Robbie Clutton
Saturday, March 16, 2013

A surprising amount of simple can get an application over a number of speed bumps. We’re going to look up and down the whole application stack and use stories to show what simple things people have done to build a sustainable system without re-architecting.

You’re gonna need a bigger boat

One of my favourite stories from a colleague was when they consulted for a previous company. The application was struggling to scale and the answer from the development team was vertical scaling. Buying bigger servers, or putting in more RAM would buy more time for the application to keep ticking over until next time. My diligent colleague when joining this team spent some time digging around trying to find bottlenecks in the application. They discovered that there was not a single index on the database and that pretty much every transaction was doing a full table scan to get the result. The previous answer to get more RAM was so that the entire database could fit within it, in an effort to increase the performance.

OK, so no indexes at all seems like an extreme case, but it can be easy to skip an index or to over-index. Small companies don’t tend to have database administrators, and if a test isn’t failing and no-one is complaining in production, it’s easy for this area to be skipped without anyone realising. There are some tools out there like ‘Rails Best Practices’ which can help identify where indexes are missing. Some simple changes and checks can drastically improve performance and delay that re-architecting we’re all afraid (or excited, depending who you are) to do.
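To make the difference concrete, here is a toy illustration in plain Ruby rather than SQL. The data and the names (`users`, `email_index`, `find_by_scan`, `find_by_index`) are made up, and a real database index is far more sophisticated, but the trade-off is similar: a little extra work up front in exchange for not visiting every row on each read.

```ruby
# Toy illustration only: an in-memory "table" and a hand-built index.
users = (1..50_000).map { |id| { id: id, email: "user#{id}@example.com" } }

# Without an index: every lookup walks the rows until it finds a match.
def find_by_scan(rows, email)
  rows.find { |row| row[:email] == email }
end

# With an index: build the lookup structure once, then jump straight to the row.
email_index = users.each_with_object({}) { |row, index| index[row[:email]] = row }

def find_by_index(index, email)
  index[email]
end

target = "user50000@example.com"
scanned = find_by_scan(users, target)        # visits all 50,000 rows
indexed = find_by_index(email_index, target) # a single hash lookup
```

Both calls return the same row; only the amount of work needed to find it differs.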

Instrument, Instrument, Instrument

A different colleague started their previous job just as a major rewrite was nearing its completion. There was some stress, as a move from ColdFusion to Ruby was not paying the dividends the team had sold to the product owners; the application’s performance wasn’t good enough to go live with. Tests were green and no bugs were reported, so no light was shed on the troubled areas of the application. By adding instrumentation using a tool such as New Relic, slow processes and queries were found and refactored. Over the course of a week, working on and off on the problems, the performance was brought up to an acceptable level where the application could go live.

In a way, this was not a terrible position to be in. Performance is one of those things that isn’t a problem until it is, and just before pushing a release out of the door seems like an ideal time to do some performance testing. New Relic itself can be used locally for this and can run as a hosted service against real production requests. On teams I’ve been involved in, I like to go through the slowest requests and identify and fix any problem areas as a Friday afternoon chore. Instrumentation doesn’t have to come through a tool like New Relic; it can be as simple as looking at logs of web request times and slow database queries, but taking some time to fix these can make some significant improvements.
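For the log-based end of that spectrum, a minimal sketch might look like the following. The `instrument` helper, its parameters and the log format are hypothetical, not part of any particular tool.

```ruby
require "stringio"

# Hypothetical helper: times a block and reports it when it runs longer than
# slow_after seconds. Returns the block's own result.
def instrument(name, out: $stderr, slow_after: 0.5)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  out.puts("SLOW #{name}: #{(elapsed * 1000).round}ms") if elapsed > slow_after
  result
end

# A StringIO stands in for a log file here.
report = StringIO.new
fast = instrument("cheap lookup", out: report) { 1 + 1 }
slow = instrument("report query", out: report, slow_after: 0.01) { sleep 0.05; :done }
```

Only the second call crosses its threshold, so only "report query" ends up in the report, which is the kind of list worth reading through on a Friday afternoon.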

Caching in

There are a number of caching techniques I’ve heard about and seen. Some have been effective, others have created more problems than they set out to resolve. First a cautionary tale.

Like scaling a database by putting the entire database into memory, caching can obscure underlying issues. On a recent project, caching logic was scattered through the code: where to cache, where to invalidate. This meant that when changing something, either through the application or even by changing code, we couldn’t be sure which caches would be affected. In one instance, our production environment was showing some strange behaviour that we could not replicate. After digging around, we found that the caches were only being invalidated by the last update of the object being cached, but we had changed the template within the cache, leaving some objects being presented with the old template and others with the newer one.

Phil Karlton’s quote “there are only two hard things in Computer Science: cache invalidation and naming things” springs to mind.  The lesson here is caching can increase performance significantly but can hide issues. By caching the result of slow running code, are you hiding code that could be improved?

Rails 4 has tried to solve some of these issues by suggesting that applications generally only cache the results of rendered views. It also takes away the cache invalidation part by using the object’s name, last updated time and the MD5 of the template being rendered as part of the keys. A change to either the object or the template produces a new key, so stale entries are simply never read again, and a caching system which automatically drops the least recently used entries should be sufficient to clean them up.
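The key-based approach can be sketched in plain Ruby. This is an assumption-laden illustration, not Rails's actual key format: `view_cache_key` and its arguments are hypothetical, and it only shows how the three ingredients combine.

```ruby
require "digest/md5"

# Hypothetical helper, not Rails's real key format: combines the object's
# name, its last updated time and an MD5 of the template into one key.
def view_cache_key(name, id, updated_at, template_source)
  template_digest = Digest::MD5.hexdigest(template_source)
  "#{name}/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}/#{template_digest}"
end

updated_at = Time.utc(2013, 3, 16, 12, 0, 0)
old_template = "<p><%= widget.name %></p>"
new_template = "<h2><%= widget.name %></h2>"

old_key     = view_cache_key("widgets", 1, updated_at, old_template)
new_key     = view_cache_key("widgets", 1, updated_at, new_template)      # template edited
touched_key = view_cache_key("widgets", 1, updated_at + 60, old_template) # object updated
```

All three keys differ, so a template change can no longer serve a fragment rendered with the old template, which was exactly the failure in the cautionary tale.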

Caching out

Sometimes the responsibility of caching can be handed to another part of an application’s infrastructure. This is exactly what we had done on some projects when I was working at the Guardian. The applications we were writing depended heavily on external services for data, and these services, being good citizens of the web, returned appropriate cache headers in their HTTP responses. For this given application, we didn’t want to model the data coming back; we merely wanted to transform the response and place it in an HTML template. Using an HTTP caching proxy like Squid, installed on the same server as the application making outbound calls, meant we could rely on Squid to do the caching. There was still the HTTP request out, but as this never left the server, it was a small hit.

201 Created

Donald Knuth said that “premature optimization is the root of all evil”, but there is a series of small optimisations that can be worthwhile. When making a request to a web server or external service, an application is either changing or reading state, sometimes both at the same time. When a request both changes state and reads it back, the application can respond more quickly if it performs only the work necessary for the response and puts the rest of the work in a background job of some kind. The HTTP response codes 201 and 202, defined in RFC 2616, were made for this sort of operation.

The response code 201 is useful for letting the client know a resource has been created. In one of my first projects in telecommunications, we would send a 201 to indicate a phone call had been started. The client would request a call be made between two phone numbers, but we didn’t want the request to be tied up during the actual phone call, maybe returning only when the call was finished. A 201 with a Location header for the client to get the status of a call was an ideal choice. A resource (the call) had been created and had an address which the client could use.

A more web-based example could be signing users up to a new website. If there are welcome emails to be sent and mailing lists to be joined, it’s not in the application’s interest to make the user wait while SMTP gateways respond and third-party services give their OK to a request. If this is spun out into a thread or background job, the application can return to the user and allow them to carry on. The less essential processes will still happen, but the delay that occurs is acceptable and the user gets a more performant experience.
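A minimal sketch of that sign-up flow, assuming a Rack-style `[status, headers, body]` response and a single in-process worker thread, might look like this. `SIGNUP_JOBS`, `DELIVERED`, `sign_up` and `deliver_welcome_email` are all hypothetical names; a real application would more likely use a dedicated background job system.

```ruby
# A minimal sketch, assuming a Rack-style [status, headers, body] response.
# All names here are hypothetical.
SIGNUP_JOBS = Queue.new
DELIVERED = Queue.new

def deliver_welcome_email(email)
  DELIVERED << email # stands in for a slow SMTP call
end

# Background worker: drains the queue outside the request cycle.
WORKER = Thread.new do
  while (email = SIGNUP_JOBS.pop)
    deliver_welcome_email(email)
  end
end

# Request handler: does only the essential work, queues the rest, and
# responds with 201 plus the address of the newly created resource.
def sign_up(user_id, email)
  SIGNUP_JOBS << email
  [201, { "Location" => "/users/#{user_id}" }, []]
end

status, headers, _body = sign_up(42, "new.user@example.com")
```

The handler returns as soon as the job is queued; the welcome email is delivered by the worker a moment later without the user waiting on it.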

Perceived performance

Steve Souders and the Performance Group at Yahoo have done great work with tools like YSlow and in highlighting the issues with perceived performance. When discussing these issues, Souders said “Optimize front-end performance first, that’s where 80% or more of the end-user response time is spent”. So much of the advice YSlow gives is so simple to implement that I recommend just setting it up at the beginning of the project and checking the advice every now and then. Some of the suggestions are one-off and can be done at the start of a project, others require some ongoing checking, but these techniques will give a faster experience for the end users.

An interesting example of where this has been taken further is the recent rewrite of the Guardian mobile website. The application focuses on the most important aspect: the content. Javascript has been stripped back to only what is required to load the content; there’s no jQuery in sight.

The main Guardian website with an empty cache downloaded 1.06 MB with 212 requests. It took 2.5 seconds for the DOM content load event to fire and 4.3 seconds before ‘onload’ was fired. The mobile version of the website downloaded 25 KB of data with 77 requests, with the DOM content load event happening after 260ms and ‘onload’ being fired after 950ms. That’s a pretty big difference, and sometimes that effort is warranted.

Another technique used by the new Guardian mobile website is conditional loading of content. Say on a sports team page, after the main content has been downloaded and the reader is celebrating or sobbing depending on their team news, asynchronous calls are made for extra content. In this case, that means fixture/schedule information, results and related content. This information isn’t required for the reader to achieve the main purpose of their visit to the page, but it might help keep them there. Using conditional loading, the page loads quickly and the reader can start reading the article without having to wait for a bigger DOM or synchronous loading of the extra content.

Real-time vs Near-time

One important lesson I’ve personally learnt is, when asked to build something in ‘real-time’, to ask the product owner to define real-time. Real-time can mean something very different to a developer than to a product owner or user. For the longest time I equated real-time with ‘immediately’. When I discovered the actual requirement was ‘within a reasonable time but not necessarily immediately’, this changed many assumptions I’d made about the application.

At a company that performs complex calculations to give a user a view of potential savings when switching to a different service provider, the product team had requested that the price update in real-time. The developers knew that crunching the numbers could take a non-trivial amount of time, so they put the processing in a background job and used the HTML meta refresh element to reload the page, holding the user in place, followed by a redirect to a page with the crunched numbers.

This may sound crude, but when considered against the time to build a better, more ‘real-time’ experience, it was an easy choice. Consider too that this pattern is often employed during a checkout experience, especially when booking something like a flight. After your credit card information is taken, you’re taken to a holding page until the payment is confirmed and a seat reserved.
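The holding-page pattern is small enough to sketch. This assumes a Rack-style `[status, headers, body]` response; the job hash, the paths and `holding_page_response` are hypothetical and only illustrate the shape of the technique.

```ruby
# A minimal sketch of the holding-page pattern; the job hash, the paths and
# holding_page_response are hypothetical.
def holding_page_response(job, results_path:, refresh_seconds: 3)
  if job[:finished]
    # The numbers are ready: send the browser on to the results.
    [302, { "Location" => results_path }, []]
  else
    # Still crunching: ask the browser to reload this page shortly.
    html = <<~HTML
      <html>
        <head><meta http-equiv="refresh" content="#{refresh_seconds}"></head>
        <body>Calculating your savings...</body>
      </html>
    HTML
    [200, { "Content-Type" => "text/html" }, [html]]
  end
end

pending  = holding_page_response({ finished: false }, results_path: "/quotes/7")
finished = holding_page_response({ finished: true }, results_path: "/quotes/7")
```

The browser keeps reloading the holding page until the background job reports that it has finished, at which point the same endpoint redirects to the results.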

Meanwhile, in another part of town, a team was content with itself after building a ‘real-time’ application that displayed government bond finances and gave 10 second updates. They relied on a message queue publish-subscribe architecture to get the very latest information to the users. When they went out to watch the application in the wild, they found that those people watching these bonds would take a look every few minutes in-between doing other tasks. It turns out that working on bonds is not the same as working on the stock exchange. The application had been over-engineered for a problem that didn’t exist. After realising this, they could correct their course and remove complexities from the application while remaining ‘near-time’ for their users.

Take a breath

We’ve looked at some simple changes that can bring big benefits and how user perception of performance is probably more important than server-side performance. We’ve also seen how asking the right questions can lead to simplicity in itself.

Step two to a successful business: know the product, embrace the tools that show weaknesses and learn where best to invest your time.
