Pivotal Labs

Main menu

Skip to primary content
Skip to secondary content
  • About
  • Case Studies
  • Team
    • Executives
    • Locations
      • San Francisco (HQ)
      • Boston
      • Boulder
      • Denver
      • London
      • Los Angeles
      • New York
  • Community
    • Blogs
    • Tech Talks
    • Events
  • Careers
    • Lifestyle
    • Principles & Practices
    • Benefits
    • FAQ
    • Apply
  • Contact
    • Press Room
    • Press Releases
    • In The News
    • Press Kit
  • All
  • Labs
  • Standup
  • Tracker

Better array assertions using collect

Pivotal Labs
Monday, October 27, 2008

You could do this:

Person.tall_people.length.should == 3
Person.tall_people[0].should == people(:linda)
Person.tall_people[1].should == people(:dwane)
Person.tall_people[2].should == people(:rick)

This is better, because it’s clearer, and because in one stroke you prove bounds, content, and order:

Person.tall_people.should == [
  people(:linda),
  people(:dwane),
  people(:rick)
]

I argue that this is best:

Person.tall_people.collect{|p|p.first_name}.should == [
  "Linda",
  "Dwane",
  "Rick"
]

Failures messages are a pleasure to read

  expected: ["Linda", "Dwane", "Rick"],
  got: ["Juliette", "Jeanne"] (using ==)

And code is clearer overall (the price is a “collect”)

Collecting away from the original object (and into primitives) is not only clearer, in most cases your aim is not to re-prove that the element objects are fully and properly configured. You’ve already done that elsewhere. You only want to prove, in the simplest possible terms, that the group of things you got is what you expected to get.

Let’s say you don’t intend to care about order. Sorting on primitives is a snap:

Person.tall_people.collect{|p|p.first_name}.sort.should == [
  "Dwane",
  "Linda",
  "Rick"
]

Slightly more controversial:

Person.short_people.collect{|p|p.first_name}.should == []

The collect seems silly at first glance, but you’re making present and future assertion failures much friendlier. You’ll be happy about brain cycles saved and sanity kept during big refactorings.

  • 0 Shares
  • Share on Facebook
  • Share on Twitter

15 Comments

  1. Pat Nakajima says:

    I couldn’t disagree more with this approach.

    First, my approach to testing the `short_people`:

    `Person.should have(3).short_people
    Person.short_people.should include(people(:linda), people(:dwane), people(:rick))`

    The reason I’m so adamantly against using `collect` in testing collections
    is that it obfuscates the intent of the test. If you map by `first_name`, for example,
    then how is the reader supposed to know that `first_name` has nothing to do
    with what you’re testing. How is the reader supposed to know that it’s irrelevant to
    the intent of the test? You’ll end up using *more* brain cycles trying to figure out
    why the collection is being mapped. Mapping the result is irrelevant, and can even lead to
    red herrings.

    Granted, ActiveRecord objects can lead to some gnarly test failure messages, but
    that should be no reason to change how you write your tests. Instead, overriding
    `ActiveRecord::Base#inspect` can clean up your failures messages, but still allow
    you to write tests that better convey their intent to the reader.

    Mapping a collection that is the result of a test not only obfuscates
    the test’s intent, but also leads to more brittle tests. What if, for
    some reason, you decided to do away with the `first_name` method on the
    objects in the collection. You’d be faced with an unrelated code change
    that broke tests.

    If tests were meant to be written with the purpose of creating pretty failure
    messages, I’d be absolutely onboard with you. I don’t think that’s the point though,
    and as a result, I think `collect` is a mistake to add to tests.

    October 28, 2008 at 6:42 am

  2. Pat Maddox says:

    *Pat #2 tags in*

    I’ve been thinking about this off and on for a while. Tonight I finally came up with a way I like:

    [1,2,3].should == unordered(3,1,2)

    What do you guys think? It’s just shoulda’s assert_same_elements. Code is at http://github.com/pat-maddox/rspec/tree/unordered

    October 28, 2008 at 7:16 am

  3. Pat Maddox says:

    > Granted, ActiveRecord objects can lead to some gnarly test failure messages, but that should be no reason to change how you write your tests

    I don’t agree with that. Steve makes a very interesting point about slightly sacrificing code clarity in favor of clarity in the failure messages. I think it should be taken a bit further though. Instead of choosing between gnarly failure messages and noisy examples, you should improve the given failure messages, or build a clear abstraction over the ugly code that provides pretty failures. Improving the generated messages is more robust, but also more work, because you’d have to create special formatters and handle several conditions. Abstracting the existing mechanism is simple to implement, but because the underlying code was not written with this abstraction in mind, it can evolve independently and no longer be sufficient for the abstracted expression. But I’d say that for stuff like RSpec’s operator matchers, which are stable, it’s an efficient technique for improving the test over all.

    October 28, 2008 at 7:36 am

  4. Brian Takita says:

    @Pat – Aww, you beat me to it. On Casebook, we added the include_all matcher.

    [1,2,3].should include_all(3,1,2)

    The code is not available publicly, at this time. We also made the error message more explicit.

    October 28, 2008 at 1:38 pm

  5. Karl says:

    You are missing the even more controversial:

    Person.short_people.collect{|p|p.reason_to_live}.should == []

    Sorry, I just couldn’t resist. Gotta love Randy Newman.

    October 28, 2008 at 4:28 pm

  6. Pat Maddox says:

    @Brian – The thing I don’t like about that is that I would expect

    [1,2,3,1].should include_all(3,1,2)

    to be true. But that’s not the behavior we’re going for in this case. unordered is the best I’ve been able to come up with, but David doesn’t seem to like the name.

    October 28, 2008 at 4:40 pm

  7. Steve says:

    “The reason I’m so adamantly against using collect in testing collections is that it obfuscates the intent of the test. If you map by first_name, for example, then how is the reader supposed to know that first_name has nothing to do with what you’re testing. How is the reader supposed to know that it’s irrelevant to the intent of the test? You’ll end up using more brain cycles trying to figure out why the collection is being mapped. Mapping the result is irrelevant, and can even lead to red herrings.”

    You’re right Pat, we fundamentally disagree.

    I expect you to look at a collect to first_name and think “hmm, all we care about is that Linda, Dwane, and Rick are in the array”. Or at least, after having had a conversation like this, you would find it unsurprising and maybe even reasonable.

    Or to put it another way, I claim that tests that involves needlessly complex objects, when primitives will do, are less optimal because they turn the reader’s attention away from the important point (in this case, the content of the array). I wouldn’t claim that using full objects is in any way bad, or that (god forbid) what I’m proposing is a “best practice”. Just that overall I find the benefits to exceed the cost of the collect statement. You may not.

    Also, the problem with overriding inspect is that – especially with model objects – what you want to see on inspect is too context-dependent. Consider an import test where I care about how each attribute of the model turns out, vs something like the above.

    “Granted, ActiveRecord objects can lead to some gnarly test failure messages, but that should be no reason to change how you write your tests.”

    I can think of few better reasons to change how I write my tests. Surely, even for you, when you evaluate how you will express something in the form of a test, the setup code, assertions, and error messages are all important factors, and part of a whole, where you try to get the best reading and execution experience by playing with all three. Right?

    October 28, 2008 at 6:02 pm

  8. Jeff Dean says:

    I generally prefer custom matcher to mapping because I find tests with less punctuation to be easier to read.

    include_all’s main intent is to give you a very detailed error message, including what’s missing and what’s there but shouldn’t be. It prints each member of the array on a separate line, so it’s easy to quickly identify the offending entries without having to horizontally scroll through the output.

    @pat maddox: I agree that include_all is not the best name. It’s worked very well at allowing us to quickly identify the differences in arrays of complex objects.

    I think to make array matchers better, we could:

    * Have the output of our matchers look for an `rspec_inspect` method that, if defined on that AR class from a spec_helper, would show you that result (so you could define `rspec_inspect` to return `:first_name` for people, if you wanted but keep it in one place and out of your code)
    * Add 2 new matchers, one that asserts that it contains exactly the same elements (unordered), and one that asserts that it contains at least some elements

    It might look like this:

    # in some spec helper
    Person.class_eval { def rspec_inspect; first_name; end }

    # the matchers
    Person.tall_people.should contain_exactly(
    people(:linda),
    people(:dwane),
    people(:rick)
    )

    Person.tall_people.should contain_at_least(
    people(:linda),
    people(:rick)
    )

    # the output
    expected
    ['Linda',
    'Dwane']
    to include
    ['Darrel']
    not to include
    ['Rick']

    Pat Maddox’s matcher file can be found [here][2], and I posted the _include_all_ matcher Brian referred to as a [gist][1] if you want to take a look.

    [1]: http://gist.github.com/20597

    [2]: http://github.com/pat-maddox/rspec/tree/unordered/lib/spec/matchers/unordered.rb

    October 29, 2008 at 3:18 am

  9. Adam Milligan says:

    I have to say I agree pretty strongly with Pat N on this. Not only does the collect change what the test communicates, but it also actually tests the wrong comparisons. Objects define their own meaning for comparison, which we should respect.

    Person.tall_people should return a collection containing Person objects. To verify that this contains the appropriate people, we should compare to an expected list of Person objects, using Person#== (or, perversely, Person#eql?). Comparing attributes of the expected objects, which have a different type and therefore potentially different comparison semantics, opens the door for incongruities.

    This is a subtle point, and one not likely to come up, but I think it mirrors the conceptual disconnect that this example creates.

    All of this assumes, of course, that you can guarantee that Person#first_name returns (and will always return) a unique and non-nil value for each individual Person instance.

    October 30, 2008 at 5:49 am

  10. Jeff Dean says:

    Hopefully better array matching will be a part of rspec soon – you can follow the progress at [the lighthouse ticket](http://rspec.lighthouseapp.com/projects/5645/tickets/591-added-contain_exactly-matcher)

    November 3, 2008 at 3:23 am

  11. Jim Kingdon says:

    Adam: my biggest problem with relying on == (or equals in java, or similar mechanisms) is that many applications have at least two interesting notions of object “equality”. (When working on mayfly this is particularly amusing as one of the kinds of equality, SQL =, isn’t even reflexive).

    Trying to have inspect (or toString in java, or the like) do the right thing in all contexts has somewhat similar problems, although perhaps not as severe (especially in ruby which has both to_s and inspect).

    The extra collect doesn’t really bother me, but I guess if there was another way of getting good failure messages I suppose I could live with that.

    November 3, 2008 at 3:21 pm

  12. Adam Milligan says:

    @Jim: Inconsistent object equality is a more fundamental problem than good or bad test failure messages; it’s a problem with your domain model. Solve the problem, not the symptom. I realize there are, as in all things, instances in which you just have to live with suboptimal code, but basing a general principle specific instances of brokenness leads to madness.

    However, on a vaguely related note, this brings up one of the, in my opinion, major flaws in Ruby: the difference between #equal?, #eql?, and #==. Sure, they’re spelled differently, but they’re all pronounced the same; two mean object equality, one means pointer equality (yes, Ruby uses pointer semantics, no matter what you call it). How many people know which is which?

    November 6, 2008 at 2:12 pm

  13. Steve Conover says:

    “Not only does the collect change what the test communicates, but it also actually tests the wrong comparisons. Objects define their own meaning for comparison, which we should respect.”

    I will disagree head-on with this statement. I don’t accept that a method we’ve come to call “equals” or “==” or whatever is superior to “first_name” in this situation.

    I care about establishing that the identity of the object is the same, and I want my failure messages to read nicely. I don’t want all equality to be based on first_name for obvious reasons, but local to this test I can assume first_name’s are unique.

    Or to put it another way, I wholeheartedly accept critiques on style (as many commenters have put forward). Critiques based on some presumed deeper normative rules are just taste preferences in disguise. There’s nothing about the effectiveness of the test that changes by using object equality. (I actually think this is an important point. A better test does not over-prove: it proves the intended outcomes as precisely as possible and tries not to fail for reasons unimportant in that context).

    November 6, 2008 at 9:38 pm

  14. Adam Milligan says:

    Good tests don’t over-prove, but nor do they under-prove. This code:

    Person.tall_people.collect{|p|p.first_name}.should == ["Linda"]

    is meaningless if your database contains ten people with the first name Linda, only one of which is tall. Each Person object is distinct, and will describe itself as distinct from others when asked appropriately, but the first_name comparison will fail to draw that distinction.

    This may seem over-cautious, but who hasn’t had fixture data change under their feet?

    November 11, 2008 at 11:15 pm

  15. Pat Nakajima says:

    Just for fun, a ridiculously overreaching yet minimally applicable solution:

    class Person
    class << self
    attr_accessor :tall_people
    end

    self.tall_people = []

    attr_reader :name

    def initialize(name=nil)
    @name = name
    end
    end

    require ‘rubygems’
    require ‘spec’

    describe Person do
    attr_reader :timmy, :linda

    before(:all) do
    Object.class_eval do
    def linda?
    name == ‘Linda’ rescue false
    end

    def has_linda?
    any? { |person| person.linda? }
    end
    end
    end

    before(:each) do
    @timmy = Person.new(“Pat”)
    @linda = Person.new(“Linda”)
    end

    it “knows if person is linda” do
    timmy.should_not be_linda
    linda.should be_linda
    end

    it “knows if linda is in collection” do
    Person.tall_people.should_not have_linda
    Person.tall_people << linda
    Person.tall_people.should have_linda
    end
    end

    November 12, 2008 at 8:37 am

Add New Comment Cancel reply

Your email address will not be published.

Pivotal Labs

Pivotal Labs

Recent Posts

  • Does the set of all sets contain itself?
  • Standup 3/8/2012
  • Standup 3/7/2012
Subscribe to Pivotal's Feed

Author Topics

riddles (1)
agile (167)
capistrano (2)
rails (26)
movember (1)
git (10)
railsdoc (1)
object-design (1)
bdd (3)
cucumber (3)
linkedin (1)
oauth (1)
ruby (17)
tdd (2)
lvh.me (1)
rails 3.1.1 (1)
selenium (6)
homebrew (1)
mysql (5)
rvm (1)
sproutcore (1)
paperclip (2)
pry (1)
amazon (1)
heroku (1)
rails3 (2)
jasmine (3)
design (3)
process (12)
productivity (8)
learning (1)
olin (1)
migrations (2)
mongodb (2)
devise (2)
javascript (13)
rubymine (4)
ipad (1)
whurl (1)
head.js (1)
pairing (2)
tools (4)
pair programming (1)
rspec (10)
rspec2 (1)
ruby19 (1)
incubation (3)
startup (5)
api (1)
presenter (1)
vanna (1)
pivotal tracker (5)
capybara (1)
fakeweb (1)
webmock (1)
intern (1)
ruby on rails (25)
meetup (1)
textmate (1)
testing (20)
solr (4)
nyc-standup (11)
community (1)
opensource (3)
activerecord (4)
chrome (1)
mp4 (1)
activeresource (1)
flash (3)
neo4j (1)
nginx (1)
rsoc (1)
meta programming (1)
agile standup (7)
government (3)
webos (4)
xss (1)
jquery (1)
bundler (2)
ci (3)
gems (5)
postgresql (1)
geminstaller (1)
gemcutter (1)
cloud (2)
rack (2)
refraction (1)
gem (5)
refactoring (1)
validations (1)
webrat (1)
engine-yard (1)
firefox (2)
jsunit (1)
mongrel (2)
thin (1)
unicorn (1)
facebook (1)
rubygems (5)
jruby (1)
actioncontroller (1)
rails 2.3 (1)
palmpre (1)
autotest (1)
mac (2)
hosting (1)
goruco (11)
database (3)
railsconf (11)
gogaruco (4)
deployment (4)
github (1)
ie (1)
ajax (1)
intellij (1)
json (1)
asset packaging (1)
polonium (1)
character encoding (1)
utf-8 (1)
test (3)
civics (1)
hpricot (1)
rake (3)
sms (1)
unicode (1)
iphone (1)
java (1)
safari (1)
memory leaks (1)
rr (3)
editor (1)
css (1)
nyc (3)
performance (5)
fun (5)
enterprise rails (1)
health (1)
new and cool (1)
general (2)
treetop (1)
errors (1)
stack (1)
trace (1)
cache (1)
cookies (1)
freesoftware (1)
conferences (1)
development (1)
driven (1)
proxy (1)
caching (1)
peertopatent (1)
languages (1)
rest (2)
rubyforge (1)
sake (1)
file (1)
upload (1)
constants (1)
osx (1)
terminal (1)
pairprogramming (2)
  • About
  • Case Studies
  • Team
  • Community
  • Careers
  • Contact
  • Labs
  • Events

Contact Us

contact@pivotallabs.com
+1 415-77-PIVOT
TwitterLinkedInFacebook

Pivotal Tracker

Tracker is the award-winning agile project management tool that enables real-time collaboration around a shared, prioritized backlog.
Visit pivotaltracker.com >