Hash Surprises with Fixnum, #hash, and #eql?

luislavena · April 7, 2011, 6:05am

Folk,

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be
subclassed).
It mostly works… but falls foul of Ruby’s various hacks, errors, and
internal optimisations in the Fixnum and Hash classes.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It’s documented to use only #hash and #eql?,
but that’s not always true (sometimes these have hard-wired
optimisations).

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).
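To make the asymmetry concern concrete, here is a minimal sketch. The `IntBox` class is hypothetical (it is not the code from the gist) and the example uses `Integer` rather than `Fixnum`. It only shows that the two directions of #eql? can disagree; which direction a given Hash implementation calls is exactly the open question.

```ruby
# Hypothetical wrapper for illustration; not the class from the gist.
class IntBox
  def initialize(i)
    @i = Integer(i)
  end

  def to_i
    @i
  end

  # Hash like the wrapped integer so both keys land in the same bucket.
  def hash
    @i.hash
  end

  # Accept plain integers and other boxes holding the same value.
  def eql?(other)
    (other.is_a?(IntBox) || other.is_a?(Integer)) && @i.eql?(other.to_i)
  end
end

box = IntBox.new(1)

# Our own method runs here, so the comparison succeeds.
puts "box.eql?(1)  => #{box.eql?(1)}"

# Integer#eql? runs here and knows nothing about IntBox.
puts "1.eql?(box)  => #{1.eql?(box)}"

# The hash values agree, so only #eql? decides whether a lookup matches.
puts "same hash    => #{box.hash == 1.hash}"
```

Unless the core class is patched as well, the relation stays one-sided, and a lookup can succeed or fail depending on which operand the implementation uses as the receiver.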

Please peruse this code: https://gist.github.com/906998, try it on the
various Ruby versions, and also try it with the Fixnum monkey-patches
removed.

You’ll see that the behaviour is very unpredictable.

Clifford H…

Clifford_H · April 7, 2011, 11:20am

On Thu, Apr 7, 2011 at 6:05 AM, Clifford H. wrote:

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be subclassed).

There’s still a lot missing for a number replacement. Please see
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

It mostly works… but falls foul of Ruby’s various hacks, errors, and
internal optimisations in the Fixnum and Hash classes.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It’s documented to use only #hash and #eql?,
but that’s not always true (sometimes these have hard-wired optimisations).

When you violate contracts you cannot expect code to work properly.

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).

But that is the contract as far as I can see. Having different
results for both violates the equivalence relation which means all
bets are off.

Please peruse this code: https://gist.github.com/906998, try it on the
various Ruby versions, and also try it with the Fixnum monkey-patches
removed.

You’ll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract. You have there a nice
demonstration why it is a bad idea most of the time to fiddle with
core class method implementations. They are used everywhere and you
cannot foresee the effects of changing their implementation on other
code.

Kind regards

robert

Clifford_H · April 7, 2011, 1:21pm

On Thu, Apr 7, 2011 at 11:19 AM, Robert K. wrote:

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

Complex numbers come to mind:

3 - 2j [+|*] 45^(j * e * 44).

Very different semantics for addition and multiplication of those than
for your normal space numbers, including conversion from Cartesian to
polar form. Bit of a textbook case for the benefits of inheritance and
function overloading.

It’d be better to have those be a sub-class of Float, though.

–
Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

Clifford_H · April 7, 2011, 1:54pm

On Thu, Apr 7, 2011 at 1:20 PM, Phillip G. wrote:

On Thu, Apr 7, 2011 at 11:19 AM, Robert K. wrote:

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

Complex numbers come to mind:

Why bother, it has been done already.

irb(main):006:0> x = Complex(0,-1)
=> (0-1i)
irb(main):007:0> x * x
=> (-1+0i)
irb(main):008:0> (x * x)+0
=> (-1+0i)
irb(main):009:0> (x * x).to_int
=> -1

And they do play nicely as ints - as long as it’s possible:

irb(main):011:0> %w{foo bar baz}[x*x]
=> "baz"
irb(main):012:0> %w{foo bar baz}[x]
RangeError: can't convert 0-1i into Integer
        from (irb):12:in `to_i'
        from (irb):12:in `to_int'
        from (irb):12:in `[]'
        from (irb):12
        from /opt/bin/irb19:12:in `<main>'

3 - 2j [+|*] 45^(j * e * 44).

Very different semantics for addition and multiplication of those than
for your normal space numbers, including conversion from Cartesian to
polar form. Bit of a textbook case for the benefits of inheritance and
function overloading.

It’d be better to have those be a sub-class of Float, though.

Actually it’s Numeric which is correct because not every Complex is a
Float!

irb(main):010:0> Complex.ancestors
=> [Complex, Numeric, Comparable, Object, Kernel, BasicObject]

Cheers

robert

Clifford_H · April 7, 2011, 3:21pm

On Thu, Apr 7, 2011 at 1:53 PM, Robert K. wrote:

Actually it’s Numeric which is correct because not every Complex is a Float!

Unless you convert from polar form* to Cartesian form:

Let’s take “45e^(j * 44)”:

32.37029101523930127102246035055203587038080515238845970760… +
31.25962667065487789953828347402088034639160866937838052053… i

From http://www.wolframalpha.com/input/?i=45*e^(44°i).

Which you are much more likely to encounter than the Cartesian form,
considering complex numbers are most useful when dealing with wave
forms.
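The conversion quoted above can be reproduced with Ruby's own Complex class. This is only a sketch: it assumes the 44 is meant in degrees, as in the Wolfram Alpha link, and `deg_to_rad` is a helper made up for this example.

```ruby
# Complex.polar expects the angle in radians, so convert from degrees first.
# deg_to_rad is a helper invented for this sketch.
def deg_to_rad(degrees)
  degrees * Math::PI / 180
end

z = Complex.polar(45, deg_to_rad(44))

# Cartesian view of the same number.
re, im = z.rectangular
puts format("Cartesian: %.4f + %.4fi", re, im)

# Polar view, converted back to degrees for display.
r, theta = z.polar
puts format("Polar: r = %.1f, angle = %.1f degrees", r, theta * 180 / Math::PI)

# The representation chosen at construction does not change the class.
puts z.class
```

Both views describe the same object; the parts are stored as Floats here, but the number itself is still a Complex.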

–
Phillip G.

Clifford_H · April 7, 2011, 3:44pm

On Thu, Apr 7, 2011 at 3:21 PM, Phillip G. wrote:

31.25962667065487789953828347402088034639160866937838052053… i

From http://www.wolframalpha.com/input/?i=45*e^(44°i).

Which you are much more likely to encounter than the Cartesian form,
considering complex numbers are most useful when dealing with wave
forms.

My math is a bit rusty in that area, but I don’t think your argument
holds: cartesian and polar are just two ways to represent a complex
number. But this does not change numeric properties of the class of
complex numbers. No matter what representation you use, (0+1i) is
neither a real number (what float conceptually models) nor a rational
number (what float technically implements). Instead, real and
rational numbers are both subsets of complex.

If you see inheritance as “is a” relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Kind regards

robert

Clifford_H · April 7, 2011, 4:32pm

On Thu, Apr 7, 2011 at 4:02 PM, Phillip G. wrote:

On Thu, Apr 7, 2011 at 3:43 PM, Robert K. wrote:

If you see inheritance as “is a” relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the “is a” relationship hold up? I think it’s more of a
“kind of” relationship, where subsequent classes are defined in ever
more detail (so, you’d inherit Floats from Integers, and Complex from
Float).

Well, even with technical inheritance (“kind of”) subclasses often add state
(i.e. member variables) but, if they touch superclass state at all, only
restrict its valid values. They cannot do otherwise because then
superclass methods may break. Silly example: superclass holds an
index which must be >= 0. All superclass methods use that index for
some kind of lookup. Assuming a sub class would suddenly set that
value to -13 the superclass contract would be violated. Now, if you
let Complex inherit from Real (trying to avoid “irrational” :-)) you
would add another field for imaginary part. So far so good, but
method to_f would sometimes throw an exception in Complex which it
would never do in Real. So suddenly Complex breaks Real’s contract.
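Ruby's built-in Complex (which derives from Numeric, not from Float) shows the behaviour described here: #to_f works only when the imaginary part is exactly zero. The helper `try_to_f` is made up for this sketch.

```ruby
# Returns the Float value, or a description of the error if conversion fails.
# try_to_f is a helper invented for this sketch.
def try_to_f(number)
  number.to_f
rescue RangeError => e
  "RangeError: #{e.message}"
end

puts try_to_f(2.5)            # a Float converts trivially
puts try_to_f(Complex(3, 0))  # exact zero imaginary part: conversion works
puts try_to_f(Complex(3, 2))  # non-zero imaginary part: RangeError
```

If Complex were a subclass of Float, every caller of #to_f on a Float would have to be prepared for that RangeError, which is the contract break under discussion.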

Of course, all those considerations are far less important in a nicely
duck typed language like Ruby compared to a statically typed language.
Assuming you would do the same in Java you would have to declare the
exception (if you use checked exceptions) on Real class but state at
the same time that this class would never throw it. Even worse, all
code using Real would have to deal with this exception by either
catching or propagating it. Not nice.

That’s why I prefer to look at inheritance as “is a” relationship:
after all OO is about better abstraction capabilities and to be able
to hide implementation details behind a clearly defined clean
interface. If you let yourself get dragged too much into technical
issues chances are that the design comes out awful. Only languages
which allow to inherit without publishing all features of the
inherited class (private inheritance e.g. in Eiffel) do not
necessarily suffer from these issues. But then, inheritance is just
an implementation detail in such cases.

Of course, the clean world of maths doesn’t map 1:1 to computational
systems, so I see the value in both approaches.

That’s true. I remember debates about the very question how to model
inheritance hierarchies for numeric types. Unfortunately I can’t
produce a reference right now. Maybe someone else can.

But, frankly, given the differences and additional properties of
complex numbers, I’d derive it from Numeric as well, simply to limit
the side effects the other numeric classes introduce (Floats and their
CPU-internal representation give me nightmares :P).

Also, with Ruby’s concept of coercion inheritance between numeric
types is probably less of an issue.
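As a rough illustration of the coercion point, the sketch below shows how a class outside the numeric hierarchy can still take part in mixed arithmetic by implementing #coerce. The class name `IntLike` is hypothetical.

```ruby
# Hypothetical integer-like class; it does not inherit from Integer.
class IntLike
  attr_reader :value

  def initialize(value)
    @value = Integer(value)
  end

  def +(other)
    other_value = other.is_a?(IntLike) ? other.value : Integer(other)
    IntLike.new(@value + other_value)
  end

  # Integer#+ calls this when it is handed an IntLike it cannot add itself.
  # It returns [converted_left_operand, self]; Ruby then evaluates
  # converted_left_operand + self.
  def coerce(other)
    [IntLike.new(other), self]
  end
end

left  = IntLike.new(3) + 2   # IntLike#+ handles this directly
right = 2 + IntLike.new(3)   # Integer#+ falls back to IntLike#coerce

puts left.value
puts right.value
```

Neither class inherits from the other; the coerce protocol alone decides how the mixed expression is evaluated.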

Kind regards

robert

Clifford_H · April 7, 2011, 4:50pm

On Thu, Apr 7, 2011 at 4:28 PM, Robert K. wrote:

method to_f would sometimes throw an exception in Complex which it
would never do in Real. So suddenly Complex breaks Real’s contract.

But that’s a failure of implementation, isn’t it?

If I were to implement my own class Complex, I’d have to deal with the
edge-cases that my sub-class has and can produce.

Thus, I either undefine #to_f, or redefine it so that it throws an
Exception. The value of inheritance is, after all, generalization, so
that I don’t have to reimplement the wheel all the time, instead
making the wheel bigger or smaller, as the implementation requires.
Conversely, that also means that a more specialized sub-class
derived from a more generic super-class has to implement an interface
that works, and works consistently.

To stay with Complex as an example:
#to_f would require an additional argument to work properly: Either
convert the real, or the imaginary part into a Float, and so would
anything derived from Complex, whatever that may be, if it has the
same properties.

That’s why I prefer to look at inheritance as “is a” relationship:
after all OO is about better abstraction capabilities and to be able
to hide implementation details behind a clearly defined clean
interface. If you let yourself get dragged too much into technical
issues chances are that the design comes out awful. Only languages
which allow to inherit without publishing all features of the
inherited class (private inheritance e.g. in Eiffel) do not
necessarily suffer from these issues. But then, inheritance is just
an implementation detail in such cases.

But isn’t it always?

Regarding technical issues: Design is a bit of an art; knowing when to
stop abstracting is important.

–
Phillip G.

Clifford_H · April 7, 2011, 4:03pm

On Thu, Apr 7, 2011 at 3:43 PM, Robert K. wrote:

My math is a bit rusty in that area, but I don’t think your argument
holds: cartesian and polar are just two ways to represent a complex
number. But this does not change numeric properties of the class of
complex numbers. No matter what representation you use, (0+1i) is
neither a real number (what float conceptually models) nor a rational
number (what float technically implements). Instead, real and
rational numbers are both subsets of complex.

Given that there are infinitely more irrational than rational numbers,
it’s much more common to represent a complex number with irrational
numbers than rational ones (i.e. floats instead of integers). Thus,
most (for want of a better word) real mathematical operations done
with complex numbers are done with irrational numbers. Cartesian and
polar forms make certain mathematical operations easier, but that’s
more or less it (the truth is more complex, but I CBA to look into
transcendental numbers, Euler’s number, &c.).

And yes, both rational and irrational numbers are subsets of complex,
obviously (with rational numbers being a subset of irrational numbers,
to simplify extremely).

If you see inheritance as “is a” relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the “is a” relationship hold up? I think it’s more of a
“kind of” relationship, where subsequent classes are defined in ever
more detail (so, you’d inherit Floats from Integers, and Complex from
Float).

Of course, the clean world of maths doesn’t map 1:1 to computational
systems, so I see the value in both approaches.

But, frankly, given the differences and additional properties of
complex numbers, I’d derive it from Numeric as well, simply to limit
the side effects the other numeric classes introduce (Floats and their
CPU-internal representation give me nightmares :P).

–
Phillip G.

Clifford_H · April 7, 2011, 5:32pm

On Thu, Apr 7, 2011 at 5:11 PM, Robert K. wrote:

But we were talking about the case of Complex inheriting Float. Your
arguments do not make sense with a Float class because there is no
imaginary part yet you would need them in order to be able to provide
a meaningful to_f in the subclass Complex.

That’s why Complex adds to #to_f, to enable the #to_f functionality
(or raise an error, when it just doesn’t make sense to have a
function). It’s method overloading.

Thank you for the interesting discussion!

My pleasure.

We can probably go back and forth over the benefits of the approaches,
but it’s largely academic now, I think.

PS: I just read that there was another earthquake in Japan and there
is a tsunami warning. I hope the best for everybody in that area and
I hope these catastrophes end rather sooner than later.

Aye.

–
Phillip G.

Clifford_H · April 7, 2011, 5:12pm

On Thu, Apr 7, 2011 at 4:49 PM, Phillip G. wrote:

let Complex inherit from Real (trying to avoid “irrational” :-)) you
would add another field for imaginary part. So far so good, but
method to_f would sometimes throw an exception in Complex which it
would never do in Real. So suddenly Complex breaks Real’s contract.

But that’s a failure of implementation, isn’t it?

That too, but the root cause lies in the area of the incompatibility
of the design with the properties of numerical classes.

To stay with Complex as an example:
#to_f would require an additional argument to work properly: Either
convert the real, or the imaginary part into a Float, and so would
anything derived from Complex, whatever that may be, if it has the
same properties.

But we were talking about the case of Complex inheriting Float. Your
arguments do not make sense with a Float class because there is no
imaginary part yet you would need them in order to be able to provide
a meaningful to_f in the subclass Complex.

But isn’t it always?

If you treat it as such it is. However, then you are not using a powerful
feature for abstraction and modeling.

Regarding technical issues: Design is a bit of an art; knowing when to
stop abstracting is important.

In my experience far too many people in our profession have the other
problem: they dive into details too fast and do not think on an
abstract level. That’s the reason why so much code I get to see has
issues. And since design flaws are generally much more costly to
repair than mere technical issues I’d rather say people should learn
to start abstracting.

Thank you for the interesting discussion!

Cheers

robert

PS: I just read that there was another earthquake in Japan and there
is a tsunami warning. I hope the best for everybody in that area and
I hope these catastrophes end rather sooner than later.

Clifford_H · April 7, 2011, 6:06pm

Phillip G. wrote in post #991471:

If you see inheritance as “is a” relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the “is a” relationship hold up? I think it’s more of a
“kind of” relationship, where subsequent classes are defined in ever
more detail (so, you’d inherit Floats from Integers, and Complex from
Float).

Isn’t this the old “ellipse is_a circle, or vice versa” debate?

If you make Circle the top class, then Ellipse reimplements pretty much
everything (draw, area, etc); there’s no useful code sharing. If you
make Ellipse the top class, then Circle is just a special constrained
case of Ellipse.

Translating to the current discussion, substitute Float for Circle and
Complex for Ellipse.

Ruby’s answer is: neither is a subclass of the other. Both inherit from
Numeric. That is, Circle and Ellipse are both a Shape. Or in other
words, “who cares”?
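The claim about Ruby's hierarchy is easy to check in any current interpreter. A small sketch using only core classes:

```ruby
# Each core numeric class hangs directly off Numeric.
[Integer, Float, Rational, Complex].each do |klass|
  puts "#{klass}.superclass => #{klass.superclass}"
end

# Neither Float nor Complex appears in the other's ancestor chain.
puts "Float inherits from Complex? #{Float.ancestors.include?(Complex)}"
puts "Complex inherits from Float? #{Complex.ancestors.include?(Float)}"
```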

Eventually you come to realise that a lot of what is taught in object
oriented classes and textbooks is tosh

Clifford_H · April 8, 2011, 2:25am

On Apr 7, 7:01am, Clifford H. wrote:

Clifford_H · April 8, 2011, 12:43am

On 2011-04-07, at 09:06, Brian C. wrote:

Eventually you come to realise that a lot of what is taught in object
oriented classes and textbooks is tosh

And a lot of what is done by practitioners using object-oriented languages
is tosh as well, I know, I’ve seen it. (You haven’t lived until you’ve had
to review a C++ class with 9-way multiple inheritance, without using rude
words!)

The Circle and Ellipse example is a good one. In fact, a Circle is no more
than an Ellipse with a constraint (eccentricity = 0, or, equivalently, the
two foci (‘focuses’) of the Ellipse are at the same point). So in almost
all cases, I wouldn’t have two separate classes, but one, Ellipse.

In the case of Complex and Float, the operative design principle is the
Liskov Substitution Principle, which can be roughly stated in OO form as
‘you can derive class Sub from class Super if and only if every instance
of Sub can be regarded as an instance of Super’.

Thus it’s perfectly reasonable to derive JetPlane from Airplane, because
every JetPlane should be able to respond to all Airplane operations.
However, you can’t derive Airplane from Wheel, or Wheel from Airplane,
even though there is some connection between wheels and planes. Like all
design principles, there are exceptional cases where the LSP doesn’t apply,
but it seems to be the best heuristic for permissible subclassing.

In the Complex/Float case, the LSP tells us that we could consider Float
a subclass of Complex (because every Float is, as has been pointed out, a
Complex with an imaginary part of zero), but that Complex can’t reasonably
be considered a subclass of Float. A good designer would go further and say
‘yes, the LSP allows me to derive Float from Complex, but that’s a waste of
storage, because it means I must store imaginary parts that are always zero.’
Thus a better design (which Ruby follows) derives Complex from Numeric.
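A small sketch of the substitution argument, using only core methods. The helper name `describe` is made up here. Numeric defines #rectangular, so a Float can answer the Complex-style question, while Complex deliberately lacks Float-only operations such as #floor.

```ruby
# Works for any Numeric: Numeric#rectangular treats a real number as
# having an imaginary part of zero. describe is a helper for this sketch.
def describe(number)
  re, im = number.rectangular
  "#{re} + #{im}i"
end

puts describe(2.5)            # a Float used where a Complex is expected
puts describe(Complex(1, 2))

# The reverse substitution does not hold: Complex has no #floor.
puts 2.5.respond_to?(:floor)
puts Complex(1, 2).respond_to?(:floor)
```

This matches the LSP reading above: a Float can stand in for a Complex, but not the other way round, and Ruby gets that effect through shared Numeric methods rather than through a Float-from-Complex inheritance chain.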

That’s the OO theory, and it’s not tosh

– vincent manis

Clifford_H · April 8, 2011, 8:57am

On Thu, Apr 7, 2011 at 6:06 PM, Brian C. wrote:

Ruby’s answer is: neither is a subclass of the other. Both inherit from
Numeric. That is, Circle and Ellipse are both a Shape. Or in other
words, “who cares”?

ACK, ACK and ACK.

Eventually you come to realise that a lot of what is taught in object
oriented classes and textbooks is tosh

I’d rather say they are incomplete. You need those simple examples to
explain what inheritance is for but often books stop right there.
Looking at inheritance on a very abstract level is worthwhile to start
musing about inheritance and how it can best be utilized. But you
then need to progress to discussing technical aspects etc. like we do
here. But that is more difficult and complex and maybe some authors
are lazy, not aware of the complexity or do not delve into this for
other reasons. I found Bertrand Meyer’s “Object-Oriented Software
Construction” very comprehensive as he covers a lot of aspects of
inheritance. I would readily recommend it to anyone who wants to dive
a bit deeper into the matter.

Kind regards

robert

Clifford_H · April 8, 2011, 9:31am

On 04/07/11 19:19, Robert K. wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford H. wrote:

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be subclassed).

There’s still a lot missing for a number replacement. Please see
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

Yes, you wrote that about the time we discussed it last time.

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

What’s wrong with the case you use in that blog post? But as it turns out,
I’m implementing a fact-based modeling DSL, where it’s sensible to have
classes like “AgeInYears” being a subclass of an integer like class.
The formalism for this comes directly from sorted first-order logic,
which makes a good deal more sense than the broken O-O paradigm discussed
elsewhere in this thread.

I suspect that you “doubt it is a good idea” only because Ruby’s object
model for numbers is inconsistent, and you’re defensive about that. Not
because Ruby 2.0 shouldn’t move in the direction of fixing it, where
possible. (BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Note that I’m not actually subclassing any core integer class. I’m just
defining a new base class “Int” which contains an integer, and so far
as is possible, acts like one, including being found in a Hash using a
Fixnum/Bignum key.

If Fixnum and Bignum can act like Integer subclasses, why can’t my class?
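For readers without access to the gist, a delegating base class of the kind described might look roughly like the sketch below. This is an assumption-laden illustration, not the code from the gist: the method_missing forwarding and the `Integer` type checks are choices made for this example.

```ruby
# Sketch of an integer-like base class that can be subclassed.
# It holds an Integer and forwards anything it does not define itself.
class Int
  def initialize(i)
    @i = Integer(i)
  end

  def to_i
    @i
  end
  alias to_int to_i

  # Hash like the wrapped value so Int keys and Integer keys collide.
  def hash
    @i.hash
  end

  def eql?(other)
    (other.is_a?(Int) || other.is_a?(Integer)) && @i.eql?(other.to_i)
  end
  alias == eql?

  # Forward arithmetic and predicates to the wrapped Integer,
  # unwrapping any Int arguments first.
  def method_missing(name, *args, &block)
    return super unless @i.respond_to?(name)

    unwrapped = args.map { |arg| arg.is_a?(Int) ? arg.to_i : arg }
    @i.public_send(name, *unwrapped, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @i.respond_to?(name, include_private) || super
  end
end

# A domain type in the spirit of the modeling DSL mentioned above.
class AgeInYears < Int; end

age = AgeInYears.new(41)
puts age + 1
puts age.even?

ages = { AgeInYears.new(41) => "found" }
puts ages[AgeInYears.new(41)].inspect
```

Lookups between two Int instances only involve methods defined here. The open question in this thread is the mixed case, where a plain integer key is probed with an Int or the reverse, because that depends on whether the interpreter actually calls the integer's #hash and #eql?.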

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It’s documented to use only #hash and #eql?,
but that’s not always true (sometimes these have hard-wired optimisations).

When you violate contracts you cannot expect code to work properly.

I have not violated that (unstated!) contract. Read again; I redefine
Fixnum#eql? as self.orig_eql?(i.to_i) - the to_i makes it symmetrical.
(Debate the wisdom if you wish, it’s just for demonstration purposes.)

However the Ruby interpreters do not honor that. In short, all three
mentioned Ruby interpreters violate the Hash contract, which states that
hash and eql? are used for Hash lookups. Not just sometimes, but all the
time, including for integers.

MRI uses a Fixnum as its own hash value, even if you’ve monkey-patched a
hash method into Fixnum. This optimisation should not be always-on. Instead,
Ruby should detect when Fixnum has been patched, and bypass the optimisation.
That would require a single test and branch, with insignificant impact on
performance. MRI does however use a monkey-patched Fixnum#eql? method.

Rubinius does the opposite. It calls a patched Fixnum#hash, but not
Fixnum#eql?

JRuby calls neither.
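One way to observe what a given interpreter does, without patching any core class, is to record the calls made on a user-defined key. The sketch below is a hypothetical probe (the names `Probe` and `CALLS` are made up); it covers only user-defined keys, so it says nothing about the hard-wired integer paths described above.

```ruby
# Log of [method, label] pairs, in call order.
CALLS = []

# Hypothetical key class that records every #hash and #eql? call.
class Probe
  attr_reader :v, :label

  def initialize(v, label)
    @v = v
    @label = label
  end

  def hash
    CALLS << [:hash, label]
    v.hash
  end

  def eql?(other)
    CALLS << [:eql?, label]
    other.is_a?(Probe) && v == other.v
  end
end

table = { Probe.new(1, :stored) => :found }
CALLS.clear # ignore calls made while building the table

result = table[Probe.new(1, :probe)]
puts "lookup result: #{result.inspect}"

# The receiver of :eql? shows which side the implementation asks.
CALLS.each { |name, label| puts "#{name} called on #{label} key" }
```

Running the same file under different interpreters and comparing the logs shows whether #eql? is sent to the stored key or to the probing key on each of them.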

The Ruby interpreters should behave the way the Hash documentation says
they do.

The Ruby documentation should explicitly state that eql? must be defined
symmetrically, or should require that the Hash implementation uses it only
in a known direction or both.

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).

But that is the contract as far as I can see.

That’s not documented anywhere I can see. Certainly not in TRPL, see sections
3.4.2 on page 68, and section 3.8.5.3 page 77. It makes sense, but it’s not
stated.

Having different
results for both violates the equivalence relation which means all
bets are off.

No. I can fix the asymmetry. I can’t make the interpreters honor that fix.

You’ll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract.

No. Because the Ruby interpreters don’t honor the Hash contract.

Please try to read more carefully.

Clifford H…

Clifford_H · April 9, 2011, 2:00am

On 8 Apr 2011, at 09:30, Clifford H. wrote:

I suspect that you “doubt it is a good idea” only because Ruby’s object
model for numbers is inconsistent, and you’re defensive about that. Not
because Ruby 2.0 shouldn’t move in the direction of fixing it, where
possible. (BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Just in case, that happened also to me. Problem was all confirmation
mails were in the spam folder.

Clifford_H · April 9, 2011, 3:26am

On 04/09/11 10:00, Xavier N. wrote:

On 8 Apr 2011, at 09:30, Clifford H. wrote:

(BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Just in case, that happened also to me. Problem was all confirmation mails were
in the spam folder.

Thanks, that was the problem (/me hides face).
A few more attempts and I’m subscribed.

Clifford_H · April 8, 2011, 12:13pm

On Fri, Apr 8, 2011 at 9:30 AM, Clifford H. wrote:

On 04/07/11 19:19, Robert K. wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford H. wrote:

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

What’s wrong with the case you use in that blog post?

You mean, make HexNum a subclass of Integer? Yes, actually that’s
what I had attempted at the time but failed for technical reasons
(explained in the blog). As it turns out it’s generally not necessary
to inherit Integer in Ruby to create a class which behaves like an
integer (most of the time).

But as it turns out,
I’m implementing a fact-based modeling DSL, where it’s sensible to have
classes like “AgeInYears” being a subclass of an integer like class.
The formalism for this comes directly from sorted first-order logic,
which makes a good deal more sense than the broken O-O paradigm discussed
elsewhere in this thread.

I suspect that you “doubt it is a good idea” only because Ruby’s object
model for numbers is inconsistent, and you’re defensive about that.

Where exactly do you see the inconsistency? I can see that a few
things in that area do not match common expectations. But I don’t
think it’s really inconsistent.

Your problem is not so much with numeric classes IMHO but rather with
implementations of class Hash in different versions of Ruby. Namely,
they have issues treating instances of different classes as
equivalent.

Note that I’m not actually subclassing any core integer class. I’m just
defining a new base class “Int” which contains an integer, and so far
as is possible, acts like one, including being found in a Hash using a
Fixnum/Bignum key.

If Fixnum and Bignum can act like Integer subclasses, why can’t my class?

Fixnum and Bignum do not share common values so you never have
instances of different classes representing the same numeric integer
value:

irb(main):003:0> (1<<100).class
=> Bignum
irb(main):004:0> (1<<100)>>99
=> 2
irb(main):005:0> ((1<<100)>>99).class
=> Fixnum

So that situation is a bit different.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It’s documented to use only #hash and #eql?,
but that’s not always true (sometimes these have hard-wired
optimisations).

When you violate contracts you cannot expect code to work properly.

I have not violated that (unstated!) contract. Read again; I redefine
Fixnum#eql? as self.orig_eql?(i.to_i) - the to_i makes it symmetrical.
(Debate the wisdom if you wish, it’s just for demonstration purposes.)

Yes, you’re right. I probably mixed in a discussion about equals() in
Java needing to test for the same class (and not instanceof) to
achieve real equivalence. At least we had a nice discussion about OO
and inheritance because of that.

However the Ruby interpreters do not honor that. In short, all three
mentioned Ruby interpreters violate the Hash contract, which states that
hash and eql? are used for Hash lookups. Not just sometimes, but all the
time, including for integers.

Apparently there are optimizations done under the hood (similarly to
duping an unfrozen String as key) which is probably OK from a
pragmatic point of view (what you attempt seems rather seldom done).

The Ruby interpreters should behave the way the Hash documentation says they
do.

Well, they do - most of the time.

The Ruby documentation should explicitly state that eql? must be defined
symmetrically, or should require that the Hash implementation uses it only
in a known direction or both.

Right, there is certainly room for improvement.

stated.
Right again. Maybe the requirement can be inferred from other
properties but it would certainly make sense to stress it.

Having different
results for both violates the equivalence relation which means all
bets are off.

No. I can fix the asymmetry. I can’t make the interpreters honor that fix.

But since you already embarked in monkey patching core classes you can
easily extend that a bit to Hash#[] and Hash#fetch. That should work
on all platforms. Another possible remedy would be to wrap Hash
instances in something else which adds logic to [] and fetch() to
convert types if necessary.
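The wrapping remedy could be sketched as follows. Both class names are hypothetical, and the sketch normalizes only one known integer-like class instead of anything that responds to to_int, since Float also responds to to_int and would be silently truncated.

```ruby
# Hypothetical integer-like key type.
class BoxedInt
  def initialize(i)
    @i = Integer(i)
  end

  def to_int
    @i
  end
end

# Wraps a plain Hash and converts BoxedInt keys to Integer on every access,
# so the underlying Hash only ever sees core keys.
class NormalizingHash
  def initialize
    @h = {}
  end

  def []=(key, value)
    @h[normalize(key)] = value
  end

  def [](key)
    @h[normalize(key)]
  end

  def fetch(key, *default, &block)
    @h.fetch(normalize(key), *default, &block)
  end

  private

  def normalize(key)
    key.is_a?(BoxedInt) ? key.to_int : key
  end
end

h = NormalizingHash.new
h[BoxedInt.new(1)] = :one
h[2] = :two

puts h[1].inspect                                # stored boxed, read with a plain Integer
puts h[BoxedInt.new(2)].inspect                  # stored plain, read with a boxed key
puts h.fetch(BoxedInt.new(3), :missing).inspect
```

Because the conversion happens before the underlying Hash is touched, this does not depend on how an interpreter treats #hash and #eql? for integers. The cost is that keys come back as plain Integers, and only the methods wrapped here are covered.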

You’ll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract.

No. Because the Ruby interpreters don’t honor the Hash contract.

As said, they do it most of the time. You introduced a corner case
here by fiddling with a core class which is known to lead into deep
water. Treating instances from different classes equivalent does work
for other classes:

12:07:22 Temp$ allruby ha.rb
CYGWIN_NT-5.1 padrklemme2 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin

ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
[[#<B:0x7ff9faa4 @v=1>, 3, 1],
[#<A:0x7ff9fa90 @v=2>, 5, 2],
[1, 3, nil],
[2, 5, nil]]

ruby 1.9.2p180 (2011-02-18 revision 30909) [i386-cygwin]
[[#<B:0x1003a290 @v=1>, 129497438, 1],
[#<A:0x1003a27c @v=2>, 602294680, 2],
[1, 129497438, nil],
[2, 602294680, nil]]

jruby 1.6.0 (ruby 1.8.7 patchlevel 330) (2011-03-15 f3b6154) (Java
HotSpot™ Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0xfa8 @v=1>, 1, 1], [#<A:0xfb0 @v=2>, 2, 2], [1, 1, nil], [2, 2,
nil]]

jruby 1.6.0 (ruby 1.9.2 patchlevel 136) (2011-03-15 f3b6154) (Java
HotSpot™ Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0x000000 @v=1>, 1, 1],
[#<A:0x000000 @v=2>, 2, 2],
[1, 1, nil],
[2, 2, nil]]

12:07:36 Temp$ cat ha.rb

require 'pp'

A, B = 2.times.map do
  Class.new do
    def initialize(x)
      @v = x.to_i
    end

    def to_i
      @v
    end

    def hash
      @v.hash
    end

    def eql? o
      case o
      when A, B
        @v == o.to_i
      end
    end

    alias == eql?
  end
end

h = {A.new(1) => 1, B.new(2) => 2}

keys = [B.new(1), A.new(2), 1, 2]

pp keys.map {|k| [k, k.hash, h[k]]}

12:07:40 Temp$

Please try to read more carefully.

Will do.

Cheers

robert

Clifford_H · April 9, 2011, 3:27am

On 04/08/11 20:12, Robert K. wrote:

On Fri, Apr 8, 2011 at 9:30 AM, Clifford H. wrote:

On 04/07/11 19:19, Robert K. wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford H. wrote:
I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?
What’s wrong with the case you use in that blog post?
You mean, make HexNum a subclass of Integer? Yes, actually that’s
what I had attempted at the time but failed for technical reasons

No, I don’t mean making HexNum a subclass of Integer, but making it an
“integer like class” which can be subclassed.

(explained in the blog). As it turns out it’s generally not necessary
to inherit Integer in Ruby to create a class which behaves like an
integer (most of the time).

Right. I’d like to see that work more of the time :). Or at least,
that each Ruby interpreter should fail in the same way.

I suspect that you “doubt it is a good idea” only because Ruby’s object
model for numbers is inconsistent, and you’re defensive about that.
Where exactly do you see the inconsistency? I can see that a few
things in that area do not match common expectations. But I don’t
think it’s really inconsistent.

By inconsistent, I mean that Ruby doesn’t make it possible to make
subclasses of Integer that play nicely with other Integers. Fixnum
and Bignum are mutually compatible and automatically and invisibly
convert back and forth, but it’s not possible for a user’s class to
do the same. That’s inconsistent. A few more calls to coerce and some
more circumspect interpreter optimisations and it would all be pretty
ok.
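
For reference, a small sketch of where `coerce` is already honoured: arithmetic with a built-in Integer on the left. `IntLike` is a hypothetical class used only for illustration. Hash lookup does not go through `coerce` at all, which is the gap being described.

```ruby
# Hypothetical integer-like class, not a subclass of Integer.
class IntLike
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def +(other)
    other_value = other.is_a?(IntLike) ? other.value : other
    IntLike.new(@value + other_value)
  end

  # Called by the built-in Integer#+ when its right operand is not a
  # built-in numeric. Returns [converted_other, self]; Ruby then retries
  # the operation as converted_other + self.
  def coerce(other)
    [IntLike.new(other), self]
  end
end

sum_left  = IntLike.new(2) + 1   # dispatches straight to IntLike#+
sum_right = 1 + IntLike.new(2)   # goes through IntLike#coerce first

puts sum_left.value
puts sum_right.value
```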

Note that I expect there will still be a need for Java-style boxed
and unboxed integer values. C# makes the boxing even more transparent
than Java, but Ruby doesn’t even try.

Your problem is not so much with numeric classes IMHO but rather with
implementations of class Hash in different versions of Ruby. Namely,
they have issues treating instances of different classes as
equivalent.

Yes. It’s documented to use #hash and #eql?, so that’s what it should do.
If it also has invisible optimisations, fine. So long as they’re invisible.

Fixnum and Bignum do not share common values so you never have
instances of different classes representing the same numeric integer
value:

Yes. But I never need to know where the cut-over is, and it can be different
with different Ruby build targets. It’s almost completely transparent.

Apparently there are optimizations done under the hood (similarly to
duping an unfrozen String as key)

Except that the case of String is documented, and works the same in all
interpreters.
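
The documented String behaviour referred to here can be seen with a short sketch: an unfrozen String used as a key is copied and the copy is frozen, so later mutation of the original does not disturb the hash.

```ruby
key = String.new("abc")    # explicitly unfrozen String
h = { key => 1 }

stored = h.keys.first
puts stored.frozen?        # the stored key is a frozen copy
puts stored.equal?(key)    # and not the same object as the original

key << "d"                 # mutating the original leaves the hash intact
puts h["abc"].inspect
puts h.key?(key)
```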

which is probably OK from a
pragmatic point of view (what you attempt seems rather seldom done).

Mainly because it doesn’t work.

Clifford H…
