Steve ConoverSteve Conover
Better array assertions using collect
edit Posted by Steve Conover on Monday October 27, 2008 at 10:00AM

You could do this:

Person.tall_people.length.should == 3
Person.tall_people[0].should == people(:linda)
Person.tall_people[1].should == people(:dwane)
Person.tall_people[2].should == people(:rick)

This is better, because it's clearer, and because in one stroke you prove bounds, content, and order:

Person.tall_people.should == [
  people(:linda), 
  people(:dwane), 
  people(:rick)
]

I argue that this is best:

Person.tall_people.collect{|p|p.first_name}.should == [
  "Linda", 
  "Dwane", 
  "Rick"
]

Failures messages are a pleasure to read

  expected: ["Linda", "Dwane", "Rick"],
  got: ["Juliette", "Jeanne"] (using ==)

And code is clearer overall (the price is a "collect")

Collecting away from the original object (and into primitives) is not only clearer, in most cases your aim is not to re-prove that the element objects are fully and properly configured. You've already done that elsewhere. You only want to prove, in the simplest possible terms, that the group of things you got is what you expected to get.

Let's say you don't intend to care about order. Sorting on primitives is a snap:

Person.tall_people.collect{|p|p.first_name}.sort.should == [
  "Dwane", 
  "Linda", 
  "Rick"
]

Slightly more controversial:

Person.short_people.collect{|p|p.first_name}.should == []

The collect seems silly at first glance, but you're making present and future assertion failures much friendlier. You'll be happy about brain cycles saved and sanity kept during big refactorings.

Comments

  1. Pat Nakajima Pat Nakajima on October 28, 2008 at 06:42AM

    I couldn't disagree more with this approach.

    First, my approach to testing the short_people:

    Person.should have(3).short_people Person.short_people.should include(people(:linda), people(:dwane), people(:rick))

    The reason I'm so adamantly against using collect in testing collections is that it obfuscates the intent of the test. If you map by first_name, for example, then how is the reader supposed to know that first_name has nothing to do with what you're testing. How is the reader supposed to know that it's irrelevant to the intent of the test? You'll end up using more brain cycles trying to figure out why the collection is being mapped. Mapping the result is irrelevant, and can even lead to red herrings.

    Granted, ActiveRecord objects can lead to some gnarly test failure messages, but that should be no reason to change how you write your tests. Instead, overriding ActiveRecord::Base#inspect can clean up your failures messages, but still allow you to write tests that better convey their intent to the reader.

    Mapping a collection that is the result of a test not only obfuscates the test's intent, but also leads to more brittle tests. What if, for some reason, you decided to do away with the first_name method on the objects in the collection. You'd be faced with an unrelated code change that broke tests.

    If tests were meant to be written with the purpose of creating pretty failure messages, I'd be absolutely onboard with you. I don't think that's the point though, and as a result, I think collect is a mistake to add to tests.

  2. Pat Maddox Pat Maddox on October 28, 2008 at 07:16AM

    Pat #2 tags in

    I've been thinking about this off and on for a while. Tonight I finally came up with a way I like:

    [1,2,3].should == unordered(3,1,2)
    

    What do you guys think? It's just shoulda's assert_same_elements. Code is at http://github.com/pat-maddox/rspec/tree/unordered

  3. Pat Maddox Pat Maddox on October 28, 2008 at 07:36AM

    Granted, ActiveRecord objects can lead to some gnarly test failure messages, but that should be no reason to change how you write your tests

    I don't agree with that. Steve makes a very interesting point about slightly sacrificing code clarity in favor of clarity in the failure messages. I think it should be taken a bit further though. Instead of choosing between gnarly failure messages and noisy examples, you should improve the given failure messages, or build a clear abstraction over the ugly code that provides pretty failures. Improving the generated messages is more robust, but also more work, because you'd have to create special formatters and handle several conditions. Abstracting the existing mechanism is simple to implement, but because the underlying code was not written with this abstraction in mind, it can evolve independently and no longer be sufficient for the abstracted expression. But I'd say that for stuff like RSpec's operator matchers, which are stable, it's an efficient technique for improving the test over all.

  4. Brian Takita Brian Takita on October 28, 2008 at 01:38PM

    @Pat - Aww, you beat me to it. On Casebook, we added the include_all matcher.

    [1,2,3].should include_all(3,1,2)
    

    The code is not available publicly, at this time. We also made the error message more explicit.

  5. Karl Karl on October 28, 2008 at 04:28PM

    You are missing the even more controversial:

    Person.short_people.collect{|p|p.reason_to_live}.should == []

    Sorry, I just couldn't resist. Gotta love Randy Newman.

  6. Pat Maddox Pat Maddox on October 28, 2008 at 04:40PM

    @Brian - The thing I don't like about that is that I would expect

    [1,2,3,1].should include_all(3,1,2)
    

    to be true. But that's not the behavior we're going for in this case. unordered is the best I've been able to come up with, but David doesn't seem to like the name.

  7. Steve Steve on October 28, 2008 at 06:02PM

    "The reason I'm so adamantly against using collect in testing collections is that it obfuscates the intent of the test. If you map by first_name, for example, then how is the reader supposed to know that first_name has nothing to do with what you're testing. How is the reader supposed to know that it's irrelevant to the intent of the test? You'll end up using more brain cycles trying to figure out why the collection is being mapped. Mapping the result is irrelevant, and can even lead to red herrings."

    You're right Pat, we fundamentally disagree.

    I expect you to look at a collect to first_name and think "hmm, all we care about is that Linda, Dwane, and Rick are in the array". Or at least, after having had a conversation like this, you would find it unsurprising and maybe even reasonable.

    Or to put it another way, I claim that tests that involves needlessly complex objects, when primitives will do, are less optimal because they turn the reader's attention away from the important point (in this case, the content of the array). I wouldn't claim that using full objects is in any way bad, or that (god forbid) what I'm proposing is a "best practice". Just that overall I find the benefits to exceed the cost of the collect statement. You may not.

    Also, the problem with overriding inspect is that - especially with model objects - what you want to see on inspect is too context-dependent. Consider an import test where I care about how each attribute of the model turns out, vs something like the above.

    "Granted, ActiveRecord objects can lead to some gnarly test failure messages, but that should be no reason to change how you write your tests."

    I can think of few better reasons to change how I write my tests. Surely, even for you, when you evaluate how you will express something in the form of a test, the setup code, assertions, and error messages are all important factors, and part of a whole, where you try to get the best reading and execution experience by playing with all three. Right?

  8. Jeff Dean Jeff Dean on October 29, 2008 at 03:18AM

    I generally prefer custom matcher to mapping because I find tests with less punctuation to be easier to read.

    include_all's main intent is to give you a very detailed error message, including what's missing and what's there but shouldn't be. It prints each member of the array on a separate line, so it's easy to quickly identify the offending entries without having to horizontally scroll through the output.

    @pat maddox: I agree that include_all is not the best name. It's worked very well at allowing us to quickly identify the differences in arrays of complex objects.

    I think to make array matchers better, we could:

    • Have the output of our matchers look for an rspec_inspect method that, if defined on that AR class from a spec_helper, would show you that result (so you could define rspec_inspect to return :first_name for people, if you wanted but keep it in one place and out of your code)
    • Add 2 new matchers, one that asserts that it contains exactly the same elements (unordered), and one that asserts that it contains at least some elements

    It might look like this:

    # in some spec helper
    Person.class_eval { def rspec_inspect; first_name; end }
    
    # the matchers
    Person.tall_people.should contain_exactly(
      people(:linda), 
      people(:dwane), 
      people(:rick)
    )
    
    Person.tall_people.should contain_at_least(
      people(:linda), 
      people(:rick)
    )
    
    # the output
    expected 
      ['Linda', 
       'Dwane'] 
    to include 
      ['Darrel']
    not to include 
      ['Rick']
    

    Pat Maddox's matcher file can be found here, and I posted the include_all matcher Brian referred to as a gist if you want to take a look.

  9. Adam Milligan Adam Milligan on October 30, 2008 at 05:49AM

    I have to say I agree pretty strongly with Pat N on this. Not only does the collect change what the test communicates, but it also actually tests the wrong comparisons. Objects define their own meaning for comparison, which we should respect.

    Person.tall_people should return a collection containing Person objects. To verify that this contains the appropriate people, we should compare to an expected list of Person objects, using Person#== (or, perversely, Person#eql?). Comparing attributes of the expected objects, which have a different type and therefore potentially different comparison semantics, opens the door for incongruities.

    This is a subtle point, and one not likely to come up, but I think it mirrors the conceptual disconnect that this example creates.

    All of this assumes, of course, that you can guarantee that Person#first_name returns (and will always return) a unique and non-nil value for each individual Person instance.

  10. Jeff Dean Jeff Dean on November 03, 2008 at 03:23AM

    Hopefully better array matching will be a part of rspec soon - you can follow the progress at the lighthouse ticket

  11. Jim Kingdon Jim Kingdon on November 03, 2008 at 03:21PM

    Adam: my biggest problem with relying on == (or equals in java, or similar mechanisms) is that many applications have at least two interesting notions of object "equality". (When working on mayfly this is particularly amusing as one of the kinds of equality, SQL =, isn't even reflexive).

    Trying to have inspect (or toString in java, or the like) do the right thing in all contexts has somewhat similar problems, although perhaps not as severe (especially in ruby which has both to_s and inspect).

    The extra collect doesn't really bother me, but I guess if there was another way of getting good failure messages I suppose I could live with that.

  12. Adam Milligan Adam Milligan on November 06, 2008 at 02:12PM

    @Jim: Inconsistent object equality is a more fundamental problem than good or bad test failure messages; it's a problem with your domain model. Solve the problem, not the symptom. I realize there are, as in all things, instances in which you just have to live with suboptimal code, but basing a general principle specific instances of brokenness leads to madness.

    However, on a vaguely related note, this brings up one of the, in my opinion, major flaws in Ruby: the difference between #equal?, #eql?, and #==. Sure, they're spelled differently, but they're all pronounced the same; two mean object equality, one means pointer equality (yes, Ruby uses pointer semantics, no matter what you call it). How many people know which is which?

  13. Steve Conover Steve Conover on November 06, 2008 at 09:38PM

    "Not only does the collect change what the test communicates, but it also actually tests the wrong comparisons. Objects define their own meaning for comparison, which we should respect."

    I will disagree head-on with this statement. I don't accept that a method we've come to call "equals" or "==" or whatever is superior to "first_name" in this situation.

    I care about establishing that the identity of the object is the same, and I want my failure messages to read nicely. I don't want all equality to be based on first_name for obvious reasons, but local to this test I can assume first_name's are unique.

    Or to put it another way, I wholeheartedly accept critiques on style (as many commenters have put forward). Critiques based on some presumed deeper normative rules are just taste preferences in disguise. There's nothing about the effectiveness of the test that changes by using object equality. (I actually think this is an important point. A better test does not over-prove: it proves the intended outcomes as precisely as possible and tries not to fail for reasons unimportant in that context).

  14. Adam Milligan Adam Milligan on November 11, 2008 at 11:15PM

    Good tests don't over-prove, but nor do they under-prove. This code:

    Person.tall_people.collect{|p|p.first_name}.should == ["Linda"]
    

    is meaningless if your database contains ten people with the first name Linda, only one of which is tall. Each Person object is distinct, and will describe itself as distinct from others when asked appropriately, but the first_name comparison will fail to draw that distinction.

    This may seem over-cautious, but who hasn't had fixture data change under their feet?

  15. Pat Nakajima Pat Nakajima on November 12, 2008 at 08:37AM

    Just for fun, a ridiculously overreaching yet minimally applicable solution:

    class Person
      class << self
        attr_accessor :tall_people
      end
    
      self.tall_people = []
    
      attr_reader :name
    
      def initialize(name=nil)
        @name = name
      end
    end
    
    require 'rubygems'
    require 'spec'
    
    describe Person do
      attr_reader :timmy, :linda
    
      before(:all) do
        Object.class_eval do
          def linda?
            name == 'Linda' rescue false
          end
    
          def has_linda?
            any? { |person| person.linda? }
          end
        end
      end
    
      before(:each) do
        @timmy = Person.new("Pat")
        @linda = Person.new("Linda")
      end
    
      it "knows if person is linda" do
        timmy.should_not be_linda
        linda.should be_linda
      end
    
      it "knows if linda is in collection" do
        Person.tall_people.should_not have_linda
        Person.tall_people << linda
        Person.tall_people.should have_linda
      end
    end
    

Add a Comment (MarkDown available)