each_with_object vs. reduce/inject


At first glance, each_with_object and reduce/inject may appear somewhat redundant, but once you embrace the difference, you'll condense and clean up your code. While reduce and inject are literally aliases for the same method, each_with_object is its own beast and serves a different purpose. This article explains both methods and when to use each.

TL;DR

Prefer to use:

  • reduce: use when returning simple objects such as Strings and Numbers
  • each_with_object: use when returning complex objects such as Hashes and Arrays

Differences:

  1. the block variables are reversed

    (1..10).reduce({}) { |obj, num| obj[num.to_s] = num; obj }
    (1..10).each_with_object({}) { |num, obj| obj[num.to_s] = num }

  2. source of the reduction value

    reduce: gets the reduction value from the return value of the previous loop
    each_with_object: tracks and supplies the object automatically

    (1..10).reduce({}) { |obj, num| obj[num.to_s] = num; obj }
    # the hash is explicitly returned and assigned to obj for the next loop
    (1..10).each_with_object({}) { |num, obj| obj[num.to_s] = num }
    # hash is updated and supplied to the next loop.

Gotchas:

  • reduce: the last executed line of the block must return the reduction value for the next loop.

Using reduce/inject:

It's important to note that reduce and inject are aliases for the same method. They both iterate over all the values in an enumerable object and return a new object. I prefer calling it reduce, but if you love inject, go for it. Let's look at a code example we could easily optimize with reduce.

total = 0
(1..100).each do |num|
  total += num
end
# total = 5050

The above code is fine, but using reduce, we can do the following.

total = (1..100).reduce(:+) # total = 5050

When I first learned Ruby, details like this blew my mind. reduce can take a method name in symbol form as an argument and call it on each item in the list to reduce it. We're adding up the items, so we refer to the + method as a symbol by prefixing it with a colon (:+). The above code is shorthand for the following.

total = (1..100).reduce do |total, num|
  total + num # calls the + method on total; total.+(num)
end # total = 5050
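
The symbol shorthand is not limited to +. As a small sketch (the sample data below is made up for illustration and is not from the examples above), any binary method can be passed as a symbol, and the block form remains available when the combining step needs more logic:

```ruby
# Hypothetical sample data, chosen only for illustration.
numbers = [1, 2, 3, 4, 5]
words = %w[kiwi banana fig]

# Symbol form: multiplies the running value by each item.
product_symbol = numbers.reduce(:*)

# Equivalent block form: the block's return value becomes memo on the next loop.
product_block = numbers.reduce { |memo, n| memo * n }

# Block form with custom logic: keep whichever word is longer.
longest = words.reduce { |memo, word| word.length > memo.length ? word : memo }

puts product_symbol # 120
puts product_block  # 120
puts longest        # banana
```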

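Also, reduce needs a starting value. If you don't give it one, it uses the first element of the collection as the starting value and begins iterating from the second element. For a sum, that gives the same result as starting from 0, so the following two snippets have identical results.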

sum = (1..100).reduce(0, :+)
sum = (1..100).reduce(:+)
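
A quick sketch of how the starting value behaves when it is supplied and when it is omitted. The empty array and the string array are illustrative inputs added here, not part of the original examples. When no starting value is given, Ruby seeds the reduction with the first element, which is why an empty collection yields nil rather than 0:

```ruby
with_initial    = (1..100).reduce(0, :+) # starts from 0
without_initial = (1..100).reduce(:+)    # starts from the first element, 1

# With an empty collection the difference becomes visible.
empty_with_initial    = [].reduce(0, :+) # the starting value is returned as-is
empty_without_initial = [].reduce(:+)    # nothing to seed from, so nil

# The seed is the first element, whatever its type, so strings concatenate.
joined = %w[a b c].reduce(:+)

puts with_initial                  # 5050
puts without_initial               # 5050
puts empty_with_initial.inspect    # 0
puts empty_without_initial.inspect # nil
puts joined                        # abc
```

Supplying an explicit starting value is the safer choice whenever the collection might be empty.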

Here's another example. We are zipping together two arrays to make a hash. If you're not familiar with zip, it creates a new array by pairing the elements at matching indices of two arrays into sub-arrays. So [1,2,3].zip(["a","b","c"]) returns [[1,"a"],[2,"b"],[3,"c"]]. Let's look at the example.

kids_ages = {} # we want this to be {"Steve" => 14, "John"=> 12 ...}
kids = ["Steve", "John", "Kim", "Gloria", "Sam"]
ages = [14, 12, 2, 23, 4]
kids.zip(ages).each do |pair|
  kids_ages[pair[0]] = pair[1]
end

Let's use reduce to streamline our code.

kids = ["Steve", "John", "Kim", "Gloria", "Sam"]
ages = [14, 12, 2, 23, 4]
kids_ages = kids.zip(ages).reduce({}) do |hsh, pair|
  hsh[pair[0]] = pair[1]
  hsh # why is this here?
end

That's an improvement. Notice, however, that we explicitly return hsh so it is available on the next loop. Why did we have to do that here and not in the first example? Let's extract and simplify the internal code from each example's reduce block to investigate.

total + num # block from example 1
hsh["Steve"] = 14 # block from example 2

Assuming the needed variables are all defined, here are the return values

total + num # block from example 1
=> 3 # returns the sum, which is what we need for the next loop
hsh["Steve"] = 14 # block from example 2
=> 14 # returns the added value, does NOT return the needed hash

Adding a value to a hash in Ruby returns the added value, not the hash. If the last line of your reduce block inserts a value, you'll accidentally assign that value to your reducer variable. For example:

kids = ["Steve", "John", "Kim", "Gloria", "Sam"]
ages = [14, 12, 2, 23, 4]
kids.zip(ages).reduce({}) do |hsh, pair|
  hsh[pair[0]] = pair[1] # no explicit return of hsh
end
=> NoMethodError: undefined method `[]=' for 14:Integer.
   Did you mean?  []

The first loop through the array will return 14. So, on our second loop hsh = 14 and our block attempts to perform the following

14["John"] = 12 # causes the undefined method error for []=
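
To make the failure concrete, here is a small sketch that triggers the error and rescues it, followed by one possible way to keep using reduce without a trailing hsh line. The merge-based version is an alternative shown for illustration, not something the examples above use. It works because Hash#merge returns a hash, at the cost of building a new hash on every loop:

```ruby
kids = ["Steve", "John", "Kim", "Gloria", "Sam"]
ages = [14, 12, 2, 23, 4]

# Broken: the block's last expression is the assignment, which evaluates
# to the age, so the second loop receives an Integer instead of a Hash.
failure = begin
  kids.zip(ages).reduce({}) { |hsh, (name, age)| hsh[name] = age }
rescue NoMethodError => e
  e.class
end

# Works: merge returns a new Hash, so the block's value is always a Hash.
kids_ages = kids.zip(ages).reduce({}) { |hsh, (name, age)| hsh.merge(name => age) }

puts failure           # NoMethodError
puts kids_ages.inspect
```

The (name, age) in the block parameters destructures each pair, which avoids the pair[0] and pair[1] indexing.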

Let's see how each_with_object can clean up and clarify the code.

Using each_with_object:

each_with_object was designed to gracefully handle the specific problem of working with complex objects in reduce. It iterates over the items in an enumerable object, with a new object you supply, and returns the supplied object. Using each_with_object with our previous example, we get the following code.

kids = ["Steve", "John", "Kim", "Gloria", "Sam"]
ages = [14, 12, 2, 23, 4]
kids_ages = kids.zip(ages).each_with_object({}) do |pair, hsh| # {} assigned to hsh
  hsh[pair[0]] = pair[1] # hsh automatically accessed in loops
end

Using each_with_object instead of reduce gives two improvements to our code. First, we can drop the explicit return value, because each_with_object tracks the object for us. Think of each_with_object as a smarter version of reduce with a persistent memory of the reduce value. A second, minor improvement is a more logical ordering of the block variables. The method itself states the order. It's each_with_object, not object_with_each, so we know the variable for the item being enumerated goes first, and the object variable goes second.
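
As another sketch of the same pattern, here is each_with_object building a hash of arrays. The word list and the grouping-by-length task are hypothetical, picked only to show that the block's return value is ignored and the supplied object is what comes back:

```ruby
# Hypothetical sample data, chosen only for illustration.
words = %w[apple fig kiwi plum banana]

# The Hash's default block creates an empty array the first time a key is seen.
by_length = words.each_with_object(Hash.new { |h, k| h[k] = [] }) do |word, groups|
  groups[word.length] << word # returns the array, but that value is ignored
end

puts by_length.inspect
# {5=>["apple"], 3=>["fig"], 4=>["kiwi", "plum"], 6=>["banana"]}
# (newer Ruby versions print this with spaces around =>)
```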

The lack of an explicit return in each_with_object may seem like a minor improvement, but a common error (in multiple different languages) is not giving reduce an explicit return value when handling complex reducer values. I've made this error more times than I care to remember in JavaScript (not hating, just a statement). In Ruby, I have my trusty friend each_with_object to handle it.

Which to use?

If each_with_object is so awesome, when should we use reduce? While almost everything in Ruby is an object, I reserve using reduce for the simpler ones, such as String and Integer. If I'm using an Array or Hash, I use each_with_object. While each_with_object gets a great deal more use in my code, reduce is a powerful tool not to be forgotten. Anything you implement with each_with_object can also be implemented with reduce, but using each_with_object at certain times can reduce errors and clarify code.
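
One practical reason behind this split can be sketched as follows. each_with_object always hands back the same object it was given, so it only helps when that object is mutated in place. Integers cannot be mutated, so reassigning the block variable has no effect on the result. The variable names below are illustrative:

```ruby
# reduce: the block's return value carries the running sum forward.
sum_with_reduce = (1..10).reduce(0) { |sum, n| sum + n }

# each_with_object: += only rebinds the block-local variable, so the
# original 0 is what gets returned.
sum_with_object = (1..10).each_with_object(0) { |n, sum| sum += n }

# A mutable object works, because << changes the String in place.
joined = %w[a b c].each_with_object(String.new) { |ch, str| str << ch }

puts sum_with_reduce # 55
puts sum_with_object # 0
puts joined          # abc
```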