On a few different Ruby projects, I have seen an antipattern emerge involving nested hashes.
For example, consider this simple question-and-answer command-line script:
QUIZ = {
question_1: {
prompt: "What is your name?"
},
question_2: {
prompt: "What is your favorite color?"
}
}
answers = {}
QUIZ.each do |name, question_hash|
print question_hash[:prompt], " "
answers[name] = gets.chomp
end
p answers
The QUIZ constant isn’t too hard to understand. The keys (question_1 and question_2) each point to a value that is itself another hash. Let’s call that a question_hash. For now a question_hash has only one key-value pair, the prompt.
The answers hash simply accumulates answers. The key is the key from the QUIZ constant, and the value is the answer the user typed in.
At the end of the script, the program calls Kernel#p to print out the inspect string for the answers hash.
Sure enough, the program works.
$ ruby quiz.rb
What is your name? Grant
What is your favorite color? orange
{:question_1=>"Grant", :question_2=>"orange"}Take a look at the QUIZ.each block up there. Notice that the inside of the block assumes that the question_hash will have a :prompt key.
That’s not too complicated. But let’s add some more features to our script. Now we will add a multiple-choice question to the QUIZ hash:
QUIZ = {
question_1: {
prompt: "What is your name?"
},
question_2: {
prompt: "What is your favorite color?"
},
question_3: {
prompt: "What is your favorite integer between 1 and 5?",
choices: %w[1 2 3 4 5]
}
}
Now we want to treat a question_hash differently if it has a choices key. One way to do this is to write two different methods that take the question_hash as an argument and ask the question. The methods will return the answer that the user inputs.
def simple_answer(question_hash)
print question_hash[:prompt], " "
gets.chomp
end
def multiple_choice_answer(question_hash)
choices = question_hash[:choices]
answer = nil
until choices.include?(answer)
print question_hash[:prompt], " "
answer = gets.chomp
puts "You must pick an answer from #{choices}" unless choices.include?(answer)
end
answer
end
Now we simply update our block to call the appropriate method for each question_hash.
QUIZ.each do |name, question_hash|
answer = if question_hash[:choices]
multiple_choice_answer(question_hash)
else
simple_answer(question_hash)
end
answers[name] = answer
end
p answers
This all still works fine.
$ ruby quiz.rb
What is your name? Grant
What is your favorite color? orange
What is your favorite integer between 1 and 5? 7
You must pick an answer from ["1", "2", "3", "4", "5"]
What is your favorite integer between 1 and 5? 3
{:question_1=>"Grant", :question_2=>"orange", :question_3=>"3"}But the code is very quickly getting more complex. Before long, we will have questions with sub-questions, conditional follow-up questions, and even more complicated features. Here’s an example that supports a Proc for formatting the user’s answer.
QUIZ = {
question_1: {
prompt: "What is your name?"
},
question_2: {
prompt: "What is your favorite color?",
display_proc: ->(color) { "The color is #{color}." }
},
question_3: {
prompt: "What is your favorite integer between 1 and 5?",
choices: %w[1 2 3 4 5]
}
}
answers = {}
def simple_answer(question_hash)
print question_hash[:prompt], " "
gets.chomp
end
def multiple_choice_answer(question_hash)
choices = question_hash[:choices]
answer = nil
until choices.include?(answer)
print question_hash[:prompt], " "
answer = gets.chomp
puts "You must pick an answer from #{choices}" unless choices.include?(answer)
end
answer
end
QUIZ.each do |name, question_hash|
answer = if question_hash[:choices]
multiple_choice_answer(question_hash)
else
simple_answer(question_hash)
end
answers[name] = answer
end
display_answers = Hash[
answers.map do |name, answer|
if display_proc = QUIZ[name][:display_proc]
[name, display_proc[answer]]
else
[name, answer]
end
end
]
p display_answers
The output now looks like this
$ ruby quiz.rb
What is your name? Grant
What is your favorite color? orange
What is your favorite integer between 1 and 5? 6
You must pick an answer from ["1", "2", "3", "4", "5"]
What is your favorite integer between 1 and 5? 3
{:question_1=>"Grant", :question_2=>"The color is orange.", :question_3=>"3"}And the code will keep growing and growing, making it harder for the next developer to understand exactly how the big Hash will translate into the user’s experience.
Take a look at those answer methods. Whenever I see a bunch of methods that accept the same argument, I suspect that there is an object missing in my system. Let’s create a new Question object. Imagine that we have it in that QUIZ.each block, and that we can simply ask the Question for its answer directly.
class Question
def initialize(question_hash)
@question_hash = question_hash
end
def answer
if @question_hash[:choices]
multiple_choice_answer(@question_hash)
else
simple_answer(@question_hash)
end
end
private
def simple_answer(question_hash)
print question_hash[:prompt], " "
gets.chomp
end
def multiple_choice_answer(question_hash)
choices = question_hash[:choices]
answer = nil
until choices.include?(answer)
print question_hash[:prompt], " "
answer = gets.chomp
puts "You must pick an answer from #{choices}" unless choices.include?(answer)
end
answer
end
end
QUIZ.each do |name, question_hash|
question = Question.new(question_hash)
answers[name] = question.answer
end
Notice that simple_answer and multiple_choice_answer are now completely private. That’s good, because it hides away the internal details of how a Question works. We are now free to start doing some refactoring, without worrying about the rest of the code in the system breaking.
Our first refactor will be to remove the arguments from the private methods, since it’s now possible to get at the @question_hash directly from within the Question object.
class Question
def initialize(question_hash)
@question_hash = question_hash
end
def answer
if @question_hash[:choices]
multiple_choice_answer
else
simple_answer
end
end
private
def simple_answer
print @question_hash[:prompt], " "
gets.chomp
end
def multiple_choice_answer
choices = @question_hash[:choices]
answer = nil
until choices.include?(answer)
print @question_hash[:prompt], " "
answer = gets.chomp
puts "You must pick an answer from #{choices}" unless choices.include?(answer)
end
answer
end
end
Next, notice that the only reason we are passing the question_hash into the Question is to get out the prompt and the choices. When a constructor takes a hash with expected keys, you can swap the hash out for multiple arguments instead.
class Question
def initialize(prompt, choices)
@prompt = prompt
@choices = choices
end
def answer
if @choices
multiple_choice_answer
else
simple_answer
end
end
private
def simple_answer
print @prompt, " "
gets.chomp
end
def multiple_choice_answer
answer = nil
until @choices.include?(answer)
print @prompt, " "
answer = gets.chomp
puts "You must pick an answer from #{@choices}" unless @choices.include?(answer)
end
answer
end
end
Now the object is easier to understand. But there’s still a problem. To use this new object, the each block needs to be updated:
QUIZ.each do |name, question_hash|
prompt = question_hash[:prompt]
choices = question_hash[:choices]
question = Question.new(prompt, choices)
answers[name] = question.answer
end
This is uglier than what we had before. But there’s a quick fix that makes our code much nicer. Let’s change the QUIZ hash so that its values are already Question objects:
QUIZ = {
question_1: Question.new("What is your name?", nil),
question_2: Question.new("What is your favorite color?", nil),
question_3: Question.new("What is your favorite integer between 1 and 5?",
%w[1 2 3 4 5])
}
This makes our each block much nicer!
QUIZ.each do |name, question|
answers[name] = question.answer
end
But there’s still one problem. Our QUIZ no longer holds the display_proc, so this code from before fails:
display_answers = Hash[
answers.map do |name, answer|
if display_proc = QUIZ[name][:display_proc]
[name, display_proc[answer]]
else
[name, answer]
end
end
]
So let’s put that proc into the Question object and make it available via an attr_reader. While we’re at it, we can give default values of nil for both choices and nil.
class Question
attr_reader :display_proc
def initialize(prompt, choices = nil, display_proc = nil)
@prompt = prompt
@choices = choices
@display_proc = display_proc
end
# ...
Now our QUIZ looks like this:
QUIZ = {
question_1: Question.new("What is your name?"),
question_2: Question.new("What is your favorite color?",
nil,
->(color) { "The color is #{color}." }),
question_3: Question.new("What is your favorite integer between 1 and 5?",
%w[1 2 3 4 5])
}
And our display_answers code looks like this:
display_answers = Hash[
answers.map do |name, answer|
if display_proc = QUIZ[name].display_proc
[name, display_proc[answer]]
else
[name, answer]
end
end
]
Now let’s move the logic for deciding how an answer should be formatted into the Question object. (We can remove the attr_reader for display_proc at this time as well, to keep things nicely hidden away.)
class Question
# ...
def display_answer(answer)
if @display_proc
@display_proc[answer]
else
answer
end
end
# ...
Which gives us:
display_answers = Hash[
answers.map do |name, answer|
question = QUIZ[name]
[name, question.display_answer(answer)]
end
]
At this point, it seems like the question name could also easily be put into the Question object as a new first argument. Let’s replace the QUIZ constant with a QUESTIONS constant that holds an array of Question objects.
QUESTIONS = [
Question.new(:question_1,
"What is your name?"),
Question.new(:question_2,
"What is your favorite color?",
nil,
->(color) { "The color is #{color}." }),
Question.new(:question_3,
"What is your favorite integer between 1 and 5?",
%w[1 2 3 4 5])
]
answers = {}
QUESTIONS.each do |question|
answers[question.name] = question.answer
end
display_answers = Hash[
QUESTIONS.map do |question|
answer = answers[question.name]
[question.name, question.display_answer(answer)]
end
]
p display_answers
Now the code is much cleaner. There are definitely more places the code could be refactored. choices and display_proc could be moved into an options hash argument since they are truly optional. Maybe there could be a SimpleQuestion superclass and a subclass called MultipleChoiceQuestion with all the choices-related code.
The most important thing to realize is that now the conversation that future developers might have about this code has shifted.
Before, the conversation would probably have centered around how complicated the code looked and confusion about how the structure of the QUIZ hash affects the control flow of the rest of the program. What is the easiest way to make a small change to add a new feature without breaking the whole big complex structure?
Now, the conversation can be about whether the current object model makes sense or should be tweaked in some small way. And with the full power of a Ruby class instead of the limited behavior of a Hash, the developers will have more options.
This code example is available on GitHub with a full Git history of the working code at each stage. Here’s the final code:
class Question
attr_reader :name
def initialize(name, prompt, choices = nil, display_proc = nil)
@name = name
@prompt = prompt
@choices = choices
@display_proc = display_proc
end
def answer
if @choices
multiple_choice_answer
else
simple_answer
end
end
def display_answer(answer)
if @display_proc
@display_proc[answer]
else
answer
end
end
private
def simple_answer
print @prompt, " "
gets.chomp
end
def multiple_choice_answer
answer = nil
until @choices.include?(answer)
print @prompt, " "
answer = gets.chomp
puts "You must pick an answer from #{@choices}" unless @choices.include?(answer)
end
answer
end
end
QUESTIONS = [
Question.new(:question_1,
"What is your name?"),
Question.new(:question_2,
"What is your favorite color?",
nil,
->(color) { "The color is #{color}." }),
Question.new(:question_3,
"What is your favorite integer between 1 and 5?",
%w[1 2 3 4 5])
]
answers = {}
QUESTIONS.each do |question|
answers[question.name] = question.answer
end
display_answers = Hash[
QUESTIONS.map do |question|
answer = answers[question.name]
[question.name, question.display_answer(answer)]
end
]
p display_answers
Nice post, this is a problem I’ve stumbled into myself. It gets particularly awkward when you start loading these things from YAML files. Suddenly you have to fit your model, however complex, on a hash-like structure, and then code your way out of the deserialization problem.
If creating questions is a task that is done multiple times, I would also consider an `instance_eval`-ing DSL for that task. Ruby lets us create DSLs that look almost natural language, to the point that it should be feasible to a non-technical team member to use them directly.
January 31, 2013 at 9:52 am
Great explanation!
January 31, 2013 at 10:03 am
It seems you are solving a problem that doesn’t exist, and I’m surprised PL have published this. What kind of mindless developer throws such complexity into a hash, when working with a fully OBJECT-oriented language such as Ruby?
Creating an object is the first thing you’d do.
February 1, 2013 at 1:22 pm
David, oftentimes you will work on teams with developers who are not as experienced with good object design principles. When a team or codebase is large, it may be tempting for even experienced developers to cut corners and bring on technical debt so that a feature makes it out the door more quickly. So believe me, in my years of working on legacy Rails applications, I have run into this antipattern many times, always from well-meaning teams. I’ll even admit to having written a few of them. The most important thing is to recognize when it’s happening, and to refactor the problem away as soon as it starts developing, because less-experienced developers will tend to follow what they see and cause the Hash to grow.
February 2, 2013 at 8:17 am
This is an excellent deconstruction of an insidious anti-pattern.
Interestingly enough, I too have come to a similar conclusion, but I found myself wanting a more flexible way of creating the attribute variables on the object.
Enter Ostend: https://github.com/jgandt/ostend
Everything is in the documentation, but basically I wanted a hash deconstrution/Object translation tool.
Could this gem helpful? Or would it be harmful to your deconstruction above?
I’m looking for feedback in general too. Let me know if you have any thoughts.
Feel free to call the idea not-smart/bad ruby…
.jpg
February 4, 2013 at 11:51 pm