Re: map_if, collect_if?


#1

module Enumerable
  def map_if( &block )
    find_all(&block).map(&block)
  end
end
[…]
I will file away the find_all {}.map {} idiom for future use
[…]

I would change that to:

map(&block).select{|e|e}

there can be surprising side effects if the user doesn’t
expect the block to be called twice for each element.
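
To make the concern concrete, here is a minimal sketch that counts how often the block runs in each variant. The method names map_if_twice and map_if_once, the counter, and the sample data are made up for this illustration and do not appear in the thread.

```ruby
module Enumerable
  # Original proposal: the block runs once in find_all and again in map
  # for every element that passed the filter.
  def map_if_twice(&block)
    find_all(&block).map(&block)
  end

  # Suggested change: the block runs exactly once per element.
  def map_if_once(&block)
    map(&block).select { |e| e }
  end
end

calls = 0
twice = [1, 2, 3, 4].map_if_twice { |x| calls += 1; x.even? && x * 10 }
calls_twice = calls

calls = 0
once = [1, 2, 3, 4].map_if_once { |x| calls += 1; x.even? && x * 10 }
calls_once = calls

puts "find_all + map: #{twice.inspect}, block calls: #{calls_twice}"
puts "map + select:   #{once.inspect}, block calls: #{calls_once}"
```

Both variants return the same array here, but the first one invokes the block two extra times (once more for each element that survived the filter). With a block that writes to a log, pops from a queue, or hits a database, those extra calls are the surprising side effects.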

cheers

Simon


#2

On 4/13/06, Kroeger, Simon (ext) removed_email_address@domain.invalid wrote:

I would change that to:

map(&block).select{|e|e}

there can be surprising side effects if the user doesn’t
expect the block to be called twice for each element.

cheers

Simon

Good point.
This block-twice thing felt elegant and wrong at the same time.
I thought that performance should not be a design issue, but Bruce's point was well taken.
The side-effect problem might not strike often, but if it strikes, boy, I would not want to debug that baby.

Nice discussion.

Robert


Two things are infinite: the universe and human stupidity; and as for
the universe, I have not yet acquired absolute certainty.

  • Albert Einstein

#3

On 4/13/06, Robert D. removed_email_address@domain.invalid wrote:

Ooops

module Enumerable
  def map_if
    map{ |x| (r = yield(x)) ? r : nil}.compact
  end
end

That one might do as a compromise between speed, elegance and sanity?



#4

On Thu, Apr 13, 2006 at 10:53:24PM +0900, Robert D. wrote:

On 4/13/06, Robert D. removed_email_address@domain.invalid wrote:

module Enumerable
  def map_if
    map{ |x| (r = yield(x)) ? r : nil}.compact
  end
end

That one might do as a compromise between speed, elegance and sanity?

It’s quite slow (two blocks in use…).

If you care a bit about performance (but not enough to use C),

def map_if(&b)
  a = map(&b)
  a.delete(false) # to get the same semantics as select{|e| e}
  a.compact!
  a
end

requires half as much memory in the worst case as map(&b).select{|e| e},
and runs ~60% faster:

RUBY_VERSION # => "1.8.4"
module Enumerable
  def map_if(&b)
    a = map(&b)
    a.delete(false) # to get the same semantics as select{|e| e}
    a.compact!
    a
  end

  def map_if2(&b)
    map(&b).select{|e| e}
  end

  def map_if3
    inject([]){|s,x| (v = yield(x)) ? s << v : s }
  end

  def map_if4
    map{|x| (r = yield(x)) ? r : nil}.compact
  end
end

require 'benchmark'

TIMES = 100
Benchmark.bmbm(10) do |bm|
  arr = (1...10000).to_a
  bm.report("compact!"){ TIMES.times{ arr.map_if{true} } }
  bm.report("select"){ TIMES.times{ arr.map_if2{ true} } }
  bm.report("inject"){ TIMES.times{ arr.map_if3{ true} } }
  bm.report("? : test"){ TIMES.times{ arr.map_if4{ true} } }

  # the block wouldn't let us measure the actual performance:
  #bm.report("compact!"){ TIMES.times{ arr.map_if{|x| x % 7 == 0 and x} } }
  #bm.report("select"){ TIMES.times{ arr.map_if2{|x| x % 7 == 0 and x} } }
  #bm.report("inject"){ TIMES.times{ arr.map_if3{|x| x % 7 == 0 and x} } }
  #bm.report("? : test"){ TIMES.times{ arr.map_if4{|x| x % 7 == 0 and x} } }
end

>> Rehearsal ---------------------------------------------
>> compact!  0.380000  0.000000  0.380000 (0.431046)
>> select    0.650000  0.000000  0.650000 (0.725197)
>> inject    4.300000  0.020000  4.320000 (4.647794)
>> ? : test  2.400000  0.010000  2.410000 (2.551757)
>> ------------------------------------ total: 7.760000sec
>>
>>               user    system     total       real
>> compact!  0.390000  0.000000  0.390000 (0.408975)
>> select    0.640000  0.010000  0.650000 (0.747967)
>> inject    3.880000  0.010000  3.890000 (4.211830)
>> ? : test  2.070000  0.000000  2.070000 (2.199076)
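
Since the benchmark only times the variants, a quick sanity check that they agree on their results may be useful. The sketch below reuses the four method bodies from the post under descriptive names (map_if_compact, map_if_select, map_if_inject, map_if_ternary are hypothetical names chosen here), and the input array and block are made up so that the block returns both nil and false.

```ruby
module Enumerable
  def map_if_compact(&b)
    a = map(&b)
    a.delete(false) # drop false so the result matches select{|e| e}
    a.compact!      # drop nil in place
    a
  end

  def map_if_select(&b)
    map(&b).select { |e| e }
  end

  def map_if_inject
    inject([]) { |s, x| (v = yield(x)) ? s << v : s }
  end

  def map_if_ternary
    map { |x| (r = yield(x)) ? r : nil }.compact
  end
end

# Returns nil for nil input and for non-multiples of 7, false for false
# input, and the element itself for multiples of 7.
block = proc { |x| x && (x % 7 == 0 ? x : nil) }
input = [1, nil, 7, 14, false, 21]

variants = [:map_if_compact, :map_if_select, :map_if_inject, :map_if_ternary]
results  = variants.map { |name| input.send(name, &block) }

variants.zip(results).each { |name, r| puts "#{name}: #{r.inspect}" }
```

All four variants drop both nil and false, which is why a.delete(false) is needed in the first one: compact! on its own removes only nil.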


#5

On 4/13/06, Mauricio F. removed_email_address@domain.invalid wrote:

[…]


Mauricio F. - http://eigenclass.org - singular Ruby

Very nice work. I just added my original plan to the benchmark, the one
with the side effects:

def map_if5(&block)
  find_all(&block).map(&block)
end

It is labeled "2 yields" below. I will spare you the rest of the code,
which I just stole from Mauricio, thanks ;)
And look at that, it is much faster than I thought.

However, the difference in the relative values also shows that these
benchmarks are not very accurate. I love them anyway, as they give a
general idea.

              user    system     total       real
compact!  0.330000  0.000000  0.330000 (0.328067)
select    0.510000  0.000000  0.510000 (0.525555)
inject    3.160000  0.020000  3.180000 (3.461044)
? : test  1.590000  0.020000  1.610000 (1.707898)
2 yields  0.880000  0.000000  0.880000 (0.897847)
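
For readers on a current Ruby: assuming Ruby 2.7 or newer, the core library provides Enumerable#filter_map, which calls the block once per element and keeps only the truthy results, so it behaves like the single-yield variants in the table above. A minimal sketch follows; the sample range and the call counter are made up for illustration, and no timing comparison with the numbers above is implied.

```ruby
# Assumes Ruby >= 2.7, where Enumerable#filter_map is available.
arr = (1..30).to_a

calls = 0
multiples = arr.filter_map { |x| calls += 1; x % 7 == 0 && x }

puts multiples.inspect
puts "block calls: #{calls}"
```

The block runs once per element, so the side-effect problem of the "2 yields" version does not arise.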

