Re: Ruby and Java equality usage

MolitorS_Stephen_L · June 27, 2006, 9:01pm

Jacob F. wrote:

Is it true then that a.eql?(b) only if a.hash == b.hash?

Yes, but not necessarily vice versa. If a.eql?(b) is true, then
a.hash == b.hash must also be true. And if a.hash != b.hash, then
a.eql?(b) is necessarily false. However, if a.hash == b.hash, then
a.eql?(b) may be true or false. (All of the above assumes proper
implementations of hash and eql?.)
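
As a minimal sketch of what a "proper" implementation might look like (the Point class and its fields are made up for illustration, not taken from any library), eql? and hash are both derived from the same fields, so two eql? objects always report the same hash and act as the same Hash key:

```ruby
# Hypothetical value class: eql? and hash are built from the same fields,
# so a.eql?(b) implies a.hash == b.hash.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Same class and same coordinates means "same key".
  def eql?(other)
    other.is_a?(Point) && x.eql?(other.x) && y.eql?(other.y)
  end

  # Delegate to Array#hash so equal coordinates give equal hash values.
  def hash
    [x, y].hash
  end
end

p1 = Point.new(1, 2)
p2 = Point.new(1, 2)          # distinct object, same coordinates
lookup = { p1 => "first" }

puts p1.equal?(p2)            # false: two separate objects
puts p1.eql?(p2)              # true
puts p1.hash == p2.hash       # true
puts lookup[p2].inspect       # "first": p2 finds the entry stored under p1
```

The converse is not enforced by anything here: two Points with different coordinates could still collide on hash, and eql? is what tells them apart.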

What’s the damage in a.eql?(b) returning true when in different
buckets?

That will never happen. If a and b have different hashes then they go
in different buckets. And if their hashes are different they are not
equivalent (not eql?).

Why doesn’t the default implementation of Object#eql? just use #==
internally?

Good question! Or, to put it another way, why do we need separate == and
eql? methods? The only rationale I can think of is to completely
separate the hash-related methods (eql? and hash) from ==. So I can
override == knowing that I won’t be changing hash behavior at all.
Except that if I’m overriding ==, I would usually want to change the
hash-related behaviors to be consistent with my new concept of equivalence!

Steve

MolitorS_Stephen_L · June 27, 2006, 10:50pm

On 6/27/06, Molitor, Stephen L wrote:

Jacob F. wrote:

Is it true then that a.eql?(b) only if a.hash == b.hash?

Yes, but not necessarily vice versa. If a.eql?(b) is true, then
a.hash == b.hash must also be true. And if a.hash != b.hash, then
a.eql?(b) is necessarily false. However, if a.hash == b.hash, then
a.eql?(b) may be true or false. (All of the above assumes proper
implementations of hash and eql?.)

I understand the “not vice versa” (that’s why I used “only if” in the
clause above :). I’m still not convinced that a.eql?(b) must imply
a.hash == b.hash. I’m willing to be convinced, but I haven’t yet heard
an argument as to why – only arguments along the lines of “that’s
the way it is”. The way I see it, from looking through hash.c and st.c
in the core, there is no contract on the value of a.eql?(b) when
a.hash != b.hash. As far as I can tell, implementing MyClass#eql? as:

class MyClass
  def eql?(other)
    raise "Oops!" unless self.hash == other.hash
    return false
  end
end

will never raise the exception in normal use of the Hash class (I’m
assuming that people follow the rules and aren’t using eql?
explicitly). That’s why I asked the following:

What’s the damage in a.eql?(b) returning true when in different
buckets?

That will never happen. If a and b have different hashes then they go
in different buckets. And if their hashes are different they are not
equivalent (not eql?).

But it will happen, if I don’t follow your definition of a “proper”
implementation of eql?. I can understand the pushback in your second
sentence. Semantically, it doesn’t make sense for two objects that
claim to be eql? to be hashed into different buckets. But given the
exclusive manner in which eql? is called (only on keys that have
already landed in the same bucket), that shouldn’t matter.

It’s the same as implementing a function (math) with a limited
domain. What’s the output of the function (code) when I give it an
invalid input? If I can guarantee that an invalid input will never be
given, it doesn’t matter.

Why doesn’t the default implementation of Object#eql? just use #==
internally?

Good question! Or, to put it another way, why do we need separate == and
eql? methods? The only rationale I can think of is to completely
separate the hash-related methods (eql? and hash) from ==. So I can
override == knowing that I won’t be changing hash behavior at all.
Except that if I’m overriding ==, I would usually want to change the
hash-related behaviors to be consistent with my new concept of equivalence!

I disagree that questioning the existence of eql? is the same as what
I’m suggesting above. I understand perfectly well why eql? exists
separate from ==. eql? serves exactly one purpose: determining whether
two objects that get hashed into the same bucket should be considered
the same key or a different key. I can understand as well that there
are times when that determination will be separate from the
determination of whether the objects are “equal” (==). So I do believe
that the eql? method is necessary.

However, I don’t see why – apart from purity in the insistence that
a.eql?(b) only if a.hash == b.hash – the default behavior of eql?
can’t be the same as ==. And since I only see that one reason –
purity – this is my question:

Would it introduce bugs if a.eql?(b) were to return true when a.hash !=
b.hash?
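
One way to probe that question is a small experiment. This is only a sketch: LooseKey and its bucket attribute are hypothetical names invented for the example, and the hash values 1 and 2 are chosen by hand so that the two keys clearly disagree on hash while agreeing on eql?.

```ruby
# Hypothetical key class that deliberately breaks the usual contract:
# eql? looks only at the name, while hash returns an unrelated number.
class LooseKey
  attr_reader :name, :bucket

  def initialize(name, bucket)
    @name = name
    @bucket = bucket
  end

  def eql?(other)
    other.is_a?(LooseKey) && name == other.name
  end

  def hash
    bucket
  end
end

a = LooseKey.new("x", 1)
b = LooseKey.new("x", 2)      # eql? to a, but reports a different hash
table = { a => :found }

puts a.eql?(b)                # true
puts a.hash == b.hash         # false
puts table[a].inspect         # :found
puts table[b].inspect         # nil: the lookup never reaches a's entry
```

So Hash itself does not misbehave; it simply treats a and b as different keys. Whether that counts as a bug depends on whether any other code calls eql? directly and expects its answer to agree with Hash lookups.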

Jacob F.

MolitorS_Stephen_L · June 27, 2006, 11:10pm

On Jun 27, 2006, at 2:59 PM, Molitor, Stephen L wrote:

Why doesn’t the default implementation of Object#eql? just use #==
internally?

Good question! Or, to put it another way, why do we need separate == and
eql? methods? The only rationale I can think of is to completely
separate the hash-related methods (eql? and hash) from ==.

Here is my take on the equality methods:

a.equal?(b) # identity: do a and b reference the same object?
a.eql?(b)   # representation: do a and b reference objects with the
            # same value and same representation?
a.==(b)     # equivalence: do a and b reference objects with the same
            # value regardless of representation?
g.===(b)    # membership: does b belong to the ‘group’ g?

3.eql?(3.0)        is false because two different representations are
                   being compared
3 == 3.0           is true because the same value is represented by
                   both objects
0 == Complex(0,0)  is true: same value, different representation

String === "s"     is true because "s" is a member of the collection of
                   all Strings
(0...10) === 4     is true because 4 is a ‘member’ of the range
/[ab]/ === 'a'     is true because 'a' is a ‘member’ of the strings
                   matched by the regexp
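
The four methods can be lined up in a short script. This is just a sketch of the cases listed above plus an identity check; the variable names are arbitrary, and dup is used so that the two strings are guaranteed to be separate objects.

```ruby
a = "ruby"
b = a.dup   # same characters, but a different object

checks = {
  "a.equal?(a)"       => a.equal?(a),          # identity
  "a.equal?(b)"       => a.equal?(b),
  "a.eql?(b)"         => a.eql?(b),            # same value, same representation
  "3.eql?(3.0)"       => 3.eql?(3.0),
  "3 == 3.0"          => (3 == 3.0),           # same value, any representation
  "0 == Complex(0,0)" => (0 == Complex(0, 0)),
  "String === 's'"    => (String === "s"),     # membership
  "(0...10) === 4"    => ((0...10) === 4),
  "/[ab]/ === 'a'"    => (/[ab]/ === "a")
}

checks.each do |expression, result|
  puts "#{expression.ljust(20)} #{result}"
end
```

Only a.equal?(b) and 3.eql?(3.0) come out false; every other line prints true. Note that the range needs parentheses, since 0...10 === 4 would otherwise be parsed as 0...(10 === 4).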

There is also include?, which is defined for quite a few classes and
which is very similar to === in many cases.

Don’t forget the ‘=~’ operator, which is like === but returns an index
within the group, which is reminiscent of Array#index, for example.

It seems to me that with a little work, ===, =~, index, and include?
semantics could be made a bit more uniform. For example:

s = Set.new [1,2,3]
s.include? 1		# true
s === 1			# false, why not behave like include? here

a = [:apple, :banana, :cherry]
a.index(:banana)	# 1
a =~ :banana		# false  (why not return 1?)

a.include?(:banana)	# true
a === :banana		# false  (why not return true?)

I suspect it might be difficult to avoid breaking existing code if
these changes were made.
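
The more uniform semantics suggested above could be tried out without touching the core classes. The following is only a sketch: Basket is a hypothetical Array subclass invented for this example, with === delegating to include? and =~ delegating to index.

```ruby
# Hypothetical subclass used to experiment with uniform membership semantics.
class Basket < Array
  # Membership test, so a Basket can be used directly in case/when.
  def ===(item)
    include?(item)
  end

  # Position of the item, or nil when it is absent (mirrors Array#index).
  def =~(item)
    index(item)
  end
end

fruits = Basket.new([:apple, :banana, :cherry])

puts fruits === :banana           # true
puts((fruits =~ :banana).inspect) # 1
puts((fruits =~ :durian).inspect) # nil

verdict = case :cherry
          when fruits then "in the basket"
          else "not in the basket"
          end
puts verdict                      # in the basket
```

Keeping the change in a subclass sidesteps the compatibility concern, at the cost of having to opt in wherever the new behavior is wanted.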

Gary W.