Every Ruby enumerator supports count. It’s a method that iterates over every item and returns the total count.
irb> enum = Enumerator.new { |yielder|
  (1..100).each do |i|
    puts "counting item: #{i}"
    yielder << i
  end
}
irb> enum.count
counting item: 1
counting item: 2
…
counting item: 100
=> 100
However, Enumerator also has size. By default, though, it just returns nil.
irb> enum.size
=> nil
A little-known feature in Ruby is that you can pass an argument to Enumerator.new to give it a shortcut “answer” to the size question.
irb> enum = Enumerator.new(100) { |yielder|
  (1..100).each do |i|
    puts "counting item: #{i}"
    yielder << i
  end
}
irb> enum.size
=> 100
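One caveat is worth a quick sketch: the size you pass is taken on trust and is never checked against what the block actually yields. The numbers below are made up purely to illustrate a mismatch.

```ruby
# Illustrative only: the size hint (5) deliberately disagrees
# with the number of items the block yields (3).
enum = Enumerator.new(5) do |yielder|
  3.times { |i| yielder << i }
end

size_hint = enum.size     # returns the hint without iterating
actual_count = enum.count # iterates and counts the yielded items

puts "size: #{size_hint}, count: #{actual_count}"
```

So it is up to you to keep the hint honest.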
No more iterating to get the count. However, there’s an even lesser-known feature: you can pass a lambda that computes the size lazily, only when size is called, which is still faster than iterating. Let’s say that you’re enumerating over products in some kind of ecommerce API.
irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.count
fetching product: 0
fetching product: 1
…
fetching product: 235
=> 236
Let’s say our API has a more efficient way of obtaining the count: a total_count endpoint.
irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new(api.products.total_count) { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.size
=> 236
We no longer have to iterate over products to get the total count, but notice a new problem: we now always run total_count when the enumerator is created, even if the user of our enum never calls size. Seems like a waste. Moreover, if products are later added to the API, our size will not change. A lambda lets us run the API call only when size is requested, and always get a fresh count.
irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new(-> { api.products.total_count }) { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.size # Calls -> { api.products.total_count } lambda.
=> 236
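Since EcommerceApi is not something you can run at home, here is a minimal self-contained sketch of the same idea. FakeProducts is a hypothetical in-memory stand-in for the API, and its count_calls counter exists only to show when the lambda fires.

```ruby
# Hypothetical stand-in for the article's API client (not a real library).
class FakeProducts
  attr_reader :count_calls

  def initialize(names)
    @names = names
    @count_calls = 0
  end

  def add(name)
    @names << name
  end

  def each(&block)
    @names.each(&block)
  end

  # Pretend this is the cheap "total_count" endpoint.
  def total_count
    @count_calls += 1
    @names.size
  end
end

products = FakeProducts.new(%w[shirt mug])

enum = Enumerator.new(-> { products.total_count }) do |yielder|
  products.each { |product| yielder << product }
end

calls_before_size = products.count_calls # lambda has not run yet
first_size = enum.size                   # runs the lambda

products.add("poster")
second_size = enum.size                  # runs the lambda again

puts "calls before size: #{calls_before_size}"
puts "sizes: #{first_size}, then #{second_size}"
```

The lambda is not cached: every call to size runs it again, which is what keeps the count fresh.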
This feature also exists when using enum_for/to_enum to create the enumerator. There, the size is the return value of the block passed to enum_for. The block receives any additional arguments passed to enum_for.
irb> def each_number(max = 100)
  return enum_for(__method__, max) { |max| max } unless block_given?
  (1..max).each { |n| yield n }
end
irb> each_number(200).size
=> 200
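The same pattern fits a custom collection's each method. The sketch below uses a hypothetical Playlist class; the size block is a closure created inside the method, so it can read the object's instance variables.

```ruby
# Hypothetical collection class, used only for illustration.
class Playlist
  include Enumerable

  def initialize(tracks)
    @tracks = tracks
  end

  def each
    # Without a block, return an enumerator whose size comes from the
    # block below instead of from iterating.
    return to_enum(:each) { @tracks.size } unless block_given?

    @tracks.each { |track| yield track }
    self
  end
end

playlist = Playlist.new(%w[intro verse chorus])
enum = playlist.each

playlist_size = enum.size # answered by the size block, no iteration
playlist_items = enum.to_a # iterates as usual

puts "size: #{playlist_size}"
puts "items: #{playlist_items.inspect}"
```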
P.S. I often forget how this Ruby feature works, and searching never brings up quick examples, so hopefully this article will help when in need of a quick reminder.