2009-11-11

enumerabler.rb: Enumerable lazy-version method library

When you want to gather first ten even numbers from an integer array, what do you do?

ary.select {|x| x.even? }.take(10)

The above program is very simple. But it may cause a lot of unnecessary checks when array is big. Also, `select' creates intermediate array.

ret = []
ary.each do |x|
  ret << x if x.even?
  break if ret.size == 10
end

This will stop the loop when enough even numbers is found, and create no unnecessary array. But, it has trouble with readability and maintainability because it is very dirty, cumbersome, tricky, too long and too primitive. We live in the age of Ruby 1.9. We need more elegance.

"We can use Enumerator and `to_enum' in Ruby 1.9!"

It's a good idea, but you are unfortunate.

(1..50).to_enum(:select) {|x| x.even? }.take(10)
  #=> [1,2,3,4,5,6,7,8,9,10]  # never selected!

Please think yourself the reason why it does not work. `to_enum' is too difficult for me, to explain in English :-p

No matter how you combine `to_enum' and `select', it will never work, I guess. You can (must) use Enumerator.new without using Enumearble#select:

module Enumerable
  def selector(&blk)
    Enumerator.new do |e|
      each do |x|
        e << x if blk[x]
      end
    end
  end
end

Also troublesome...

proposed `enumerabler.rb'

I wrote a library, named enumerabler.rb, which is a collection of the above method definitions.

require "enumerabler"

ary.selector {|x| x.even? }.take(10)

You use `selector' instead of `select.' While `select' returns an array, `selector' returns an Enumerator corresponding the array. Because of Enumerator, the evaluation is delayed. So, the above program using `selector' is not only simple but also effective, which will stop its loop as soon as ten even numbers is found. No intermediate array.

enumerabler.rb defines:

  • Enumerable#grepper
  • Enumerable#finder_all (alias: all_finder, selector)
  • Enumerable#rejecter
  • Enumerable#collector (alias: mapper)
  • Enumerable#flat_mapper (alias: concat_collector)
  • Enumerable#zipper
  • Enumerable#taker
  • Enumerable#taker_while
  • Enumerable#dropper
  • Enumerable#dropper_while

All of them returns corresponding Enumerator. In an aside, I cannot distinguish -er and -or without a dictionary.

"Does it use a Fiber? Isn't it slow?"

No, enumerabler.rb does neither generate nor yield a Fiber. Trust me. If you worry, please read and check the source code of Ruby!

However, it is actually slow because it calls Proc frequently. `selector' is about two or three times slower than the original `select', but it is faster than a Fiber, and actually reduces memory usage.

Examples

p ary.selector {|x| x.even? }.taker(100).take_while {|x| x >= 0 }

gathers first 100 even numbers, however terminates before the negative element, without unnecessary check and intermediate array.

require "enumerabler"

def sieve(e, &blk)
  yield n = e.first
  sieve(e.rejecter {|x| x % n == 0 }, &blk)
end

sieve(2..(1.0/0.0)) {|x| p x } #=> 2, 3, 5, 7, 11, 13, 17, 19, 23, ...

Sieve of Eratosthenes using `rejecter.'

require "enumerabler"

def fibonacci
  Enumerator.new do |e|
    e << 1
    e << 1
    fibonacci.zip(fibonacci.dropper(1)) {|x, y| e << x + y }
  end
end

fibonacci.each {|x| p x } #=> 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Fibonacci numbers. `dropper' can drop an infinite sequence. Note that this is very slow because of call-by-name evaluation and because `zip' internally uses a Fiber.

how to install

$ gem install enumerabler

http://github.com/mame/enumerabler

translated from mamememo in Japanese (2009-11-11).