Q: How to use non-builtin class as hash key?

Jia_Pu · September 1, 2006, 5:09am

I have a class defined as:

###########################
class MyClass
def initialize(s1, s2, s3)
@string1 = s1
@string2 = s2
@string3 = s3
end

def hash
    return (@string1 + @string2 + @string3).hash
end

end
###########################

Then I create two instances of this class:
###########################
obj1 = MyClass.new(‘a’, ‘b’, ‘c’)
obj2 = MyClass.new(‘a’, ‘b’, ‘c’)
###########################

If verify obj1 and obj2 do have the same hash value, I check that
‘obj1.hash == obj2.hash’
returns true

Now i try to use those objects as hash key:
h[obj1] = ‘abc’
h[obj2] = ‘def’

I expected h contains only one item with value ‘def’, since obj1 and
obj2
have the same hash value. But it turns out it has 2 items instead.

Did I miss something?

Jia_Pu · September 1, 2006, 5:29am

Jia Pu wrote:

def hash

Did I miss something?

You need to define #eql? in your class, since that is how Hash compares
objects that hash the same:

class MyClass
def contents
[@string1, @string2, @string3]
end

def eql?(other)
    other.class == self.class and other.contents == self.contents
end

end

Jia_Pu · September 1, 2006, 6:09am

Thanks, that works.
I am wondering the requirement of #eql? is necessary. If the programmer
wants to have non-equal objects to have same hash value, it is totally
up to
the programmer.

Jia_Pu · September 1, 2006, 6:56am

Jia Pu wrote:

Thanks, that works.
I am wondering the requirement of #eql? is necessary. If the programmer
wants to have non-equal objects to have same hash value, it is totally
up to
the programmer.

I’m not sure what you are asking, but you can define #== and #eql?
differently. In fact, ruby already does this in some cases:

irb(main):001:0> 1.0.eql? 1
=> false
irb(main):002:0> 1.0 == 1
=> true

Jia_Pu · September 1, 2006, 2:32pm

On 9/1/06, Jia Pu [email protected] wrote:

Thanks, that works.
I am wondering the requirement of #eql? is necessary. If the programmer
wants to have non-equal objects to have same hash value, it is totally up to
the programmer.
$ri Object#hash
------------------------------------------------------------ Object#hash
obj.hash => fixnum

 Generates a +Fixnum+ hash value for this object. This function must
 have the property that +a.eql?(b)+ implies +a.hash == b.hash+. The
 hash value is used by class +Hash+. Any hash value that exceeds the
 capacity of a +Fixnum+ will be truncated before being used.

Note that the requirement is that if a.eql?(b) then a.hash must be ==
b.hash

This is because hash is used as a test to avoid a potentially
expensive .eql? test, if two object’s DON’T have the same hash value
they are assumed to be not equal.

A Hash (I’m assuming here based on how hash algorithms work in
general) uses the hash of the key to compute an initial slot to store
a reference to an object when you put the object, at a particular key
in the hash with []=. If that slot is empty that’s where it goes. If
it’s not, meaning that there’s a key there with the same hash, then it
uses eql? to see if it’s the right slot, and if not it’s called a hash
collision, and the hash calculates a new slot and repeats the process.
Finding the object with a given key works basically the same way.

So the net of all this is, if you want hash to treat two different
instances of your class as the same then you must redefine eql?, and
whenever you redefine .eql? then you need to redefine hash to make
sure that the implication in the specification of Object#hash holds.

Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Jia_Pu · September 1, 2006, 4:08pm

“Jia Pu” [email protected] writes:

Thanks, that works.
I am wondering the requirement of #eql? is necessary. If the programmer
wants to have non-equal objects to have same hash value, it is totally up to
the programmer.

I assume you are asking why defining .eql? is necessary. (You seem to
have dropped the word “why” in your question)

If two objects have the same hash value, this does not mean that they
should overwrite each other when used as keys! In fact, that would be
disasterous as hash collisions are a fact of life - has must be a
Fixnum, and there are only so many Fixnums to go around.

For example:

irb(main):001:0> h = Hash.new
=> {}
irb(main):002:0> h[1] = ‘Value for 1’
=> “Value for 1”
irb(main):003:0> h[4294967298] = ‘Value for a really big number’
=> “Value for a really big number”
irb(main):004:0> h[1]
=> “Value for 1”
irb(main):005:0> 1.hash
=> 3
irb(main):006:0> 4294967298.hash
=> 3

Fortunately, 1.eql?(4294967298) is false, so I can store different
values in my hash.

Note that the definition of .eql? and == do not need to have anything
to do with each other, or with the definition of ===. However, it is
generally very good practice to make sure that the following
implication chain holds:

(a.eql?(b)) implies (a == b) implies (a === b)

Also, eql? and == should be symmetric (that is, it should make no
difference switching a and b for those two methods).

Q: How to use non-builtin class as hash key?

So the net of all this is, if you want hash to treat two different instances of your class as the same then you must redefine eql?, and whenever you redefine .eql? then you need to redefine hash to make sure that the implication in the specification of Object#hash holds.

So the net of all this is, if you want hash to treat two different
instances of your class as the same then you must redefine eql?, and
whenever you redefine .eql? then you need to redefine hash to make
sure that the implication in the specification of Object#hash holds.