On Wed, Jan 31, 2007 at 06:37:14PM +0900, Shot (Piotr S.) wrote:

Eric H.:

You may want to define #eql? in terms of #to_a as well…

Thanks, but (considering Mauricio F.â€™s fix) what would be the

benefit of defining #eql? in terms of Set#sort instead of Set#==?

“None”. You cannot just

def eql?(o); sort == o.sort end

either, since there’s no guarantee there’s a total order associated to

the

set (it seems my reply to Pit where I pointed this out was dropped too):

require ‘set’

Set.new([‘a’, 1]).sort # =>

# ~> -:2:in `sort’: comparison of String with 1 failed (ArgumentError)

# ~> from -:2

This is why the hash implementation I gave you does

map{|x| x.hash}.sort.hash # sorting hash values, i.e. Integers

instead of

sort.map{|x| x.hash}.hash

Even when the elements can be ordered, sort would be O(n ln n) vs. #=='s

O(n)

Of course, the former is a “fast NlnN”, being written in C, so it would

seem

that it can keep the speed advantage up to large values of N.

However, #sort will process the whole array before it can start

comparing

values, and #== can begin right away. If there are many differing

elements,

the probability of finding one of them in the first comparisons is very

high.

Somewhat surprisingly, #== can beat #sort even in the most favorable

case for

the latter (only one differing element, the first in the sorted array

but not

in Hash#to_a, small sets):

require ‘benchmark’

require ‘set’

size = 64

elm = -10

elm2 = -2

s1 = Set.new((0…size).to_a << -1)

s2 = Set.new((0…size).to_a << elm)

GC.disable # don’t let this interfere

def bm(s1, s2, elm, iterations = 5000)

puts “#== will stop after #{1 + s2.to_a.index(elm)} .include?

tests”

Benchmark.bmbm(10) do |bm|

bm.report(“Set#sort”) { iterations.times{ s1.sort == s2.sort } }

bm.report(“Set#==”) { iterations.times{ s1 == s2 } }

end

end

bm(s1, s2, elm)

puts “-” * 40

bm(s1, Set.new((0…size).to_a << elm2), elm2)

# >> #== will stop after 43 .include? tests

# >> Rehearsal ---------------------------------------------

# >> Set#sort 0.700000 0.050000 0.750000 ( 0.768818)

# >> Set#== 0.370000 0.010000 0.380000 ( 0.381117)

# >> ------------------------------------ total: 1.130000sec

# >>

# >> user system total real

# >> Set#sort 0.680000 0.060000 0.740000 ( 0.736052)

# >> Set#== 0.380000 0.010000 0.390000 ( 0.380995)

# >> ----------------------------------------

# >> #== will stop after 7 .include? tests

# >> Rehearsal ---------------------------------------------

# >> Set#sort 0.620000 0.010000 0.630000 ( 0.637358)

# >> Set#== 0.170000 0.000000 0.170000 ( 0.175028)

# >> ------------------------------------ total: 0.800000sec

# >>

# >> user system total real

# >> Set#sort 0.750000 0.050000 0.800000 ( 0.807318)

# >> Set#== 0.170000 0.010000 0.180000 ( 0.176633)