Forum: Ruby-core [Feature #1697] Object#<=>

Posted by Marc-Andre Lafortune (Guest)
on 2009-06-28 03:49
(Received via mailing list)
Feature #1697: Object#<=>
http://redmine.ruby-lang.org/issues/show/1697

Author: Marc-Andre Lafortune
Status: Open, Priority: Normal
Assigned to: Yukihiro Matsumoto, Category: core, Target version: 1.9.2

The definition of the <=> operator states:
"... And if the two operands are not comparable, it returns nil"  (The 
Ruby Programming Language, O'Reilly)

Attempting to compare objects, when one or both do not define the <=> 
operator, causes a NoMethodError to be raised. For example, "false <=> 
0" raises in this way. This behavior is unexpected and runs counter to 
the principle defined above.

Further, "0 <=> false" returns nil. This is fundamentally inconsistent. 
The two comparisons are the other's converse, yet the raising of an 
exception in the first case implies that the programmer was in error; 
that the mere act of making this comparison was erroneous.

The solution is for Object to define a <=> operator. This will solve the 
case described above, along with the general case of comparing an object 
to another when one or both do not define <=>. Similarly to Time#<=>, it 
would return the converse of arg <=> self (i.e. nil => nil, num => 
-num). It needs to detect recursion, in which case it should return nil 
or 0 depending on the result of self == arg.

This change would make it always safe to call <=> without having to 
check first if it respond_to? :<=> (or rescuing NoMethodError).

The existence of Object#<=> would make it much easier for all 
programmers to define a good <=> operator for their classes. They can 
simply call super if they don't know how to handle some argument type. 
For example:

class MyClass
  include Comparable
  def <=> (arg)
    return super unless arg.is_a? MyClass
    # go on with comparison
  end
end

With this simple line, the developper has enabled other classes to be 
comparable with MyClass. No need to monkeypatch MyClass to ensure that 
comparing its objects with objects of class ComparableToMyClass will 
work. Without a 'super', implementing this becomes quite difficult and 
requires the use of recursion guards (which are not defined in the 
standard library).

Note that neither String#<=> nor Time#<=> currently use recursion 
guards, which is not robust and can lead to problems. For instance:

class MyClass
  include Comparable
  def <=> (arg)
    return -1 if arg.is_a? MyClass
    cmp = arg <=> self
    cmp ? -cmp : nil
  end
end

MyClass.new <=> Time.now
# ==> raises a SystemStackError

class Time
  alias_method :to_str, :to_s
end
"now" <=> Time.now
# ==> endless loop that can't be interrupted with ctrl-C.


In summary, defining Object#<=> would:
1) bring consistency between a <=> b and b <=> a
2) provide a sensible default (nil) for objects that can't be compared
3) make it easier for generic methods to call <=> (no rescue or 
:respond_to? needed)
4) make it much easier for developpers to write extensible <=> methods 
for their classes.


Side notes:

The proposition stands only for Object. BasicObject would still be 
available for developers preferring a class with a strict minimum of 
methods.

The only code that could break would have to be both checking 
respond_to? :<=> (or rescuing a NoMethodError) _and_ behaving 
differently than if the <=> method had returned nil. Such code would be 
quite nonsensical, given the definition of <=>

Other comparison operators like <, <=, >=, > would also gain in 
consistency if they were defined in terms of <=>. This way, 0 < nil and 
nil > 0 would raise the same errors instead of errors of different 
types. This is secondary to the main question: is it better to define 
Object#<=> or not?
My vote is on 'yes'.

(Thanks to the readers of my first draft)
Posted by Rick DeNatale (Guest)
on 2009-06-28 15:23
(Received via mailing list)
Issue #1697 has been updated by Rick DeNatale.


Well, a lot of users of <=> don't expect a nil result!

class Object
  def <=> other
    nil
  end
end

1 <=> nil     # => nil
nil <=> 1     # => nil
1 <=> false   # => nil
false <=> nil # => nil
nil <=> false # => nil
false <=> nil # => nil
[1, nil, false].sort # =>
# ~> -:15:in `sort': comparison of Fixnum with nil failed 
(ArgumentError)
# ~>   from -:15
----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by Eero Saynatkari (roo_p)
on 2009-06-28 20:13
(Received via mailing list)
Excerpts from Luiz Angelo Daros de Luca's message of Sun Jun 28 16:22:45 
+0300 2009:
> Issue #1697 has been updated by Rick DeNatale.
> 
> 
> Well, a lot of users of <=> don't expect a nil result!

For what it is worth, when Marc-Andre brought the topic up
on #rubinius, I argued for it to always either return one
of -1, 0, 1 or then raise an error; and to never return nil
or any other value. The user may simply rescue (or not) and
differentiate between errors as needed.

Additionally, there is no need to check for nil separately
before working on the value.

One of the counterarguments was that errors are used for
"control flow" here, but I do not see it that way. I would,
in my code, probably never expect for #<=> to fail when I
use it (and never guard it with rescue): to me it indicates
a very likely design- or logic problem that should be fixed
anyway. Exceptions, pun intended, of course may exist.


Eero
Posted by Marc-Andre Lafortune (Guest)
on 2009-06-29 14:36
(Received via mailing list)
On Sun, Jun 28, 2009 at 2:12 PM, Eero Saynatkari<ruby-ml@kittensoft.org> 
wrote:
> For what it is worth, when Marc-Andre brought the topic up
> on #rubinius, I argued for it to always either return one
> of -1, 0, 1 or then raise an error; and to never return nil
> or any other value. The user may simply rescue (or not) and
> differentiate between errors as needed.

Arguments can be made as to what would be the best semantic for <=>.
Eero would prefer some kind of NotComparable exception raised. My
personal opinion is that <=> was created with the perfect semantic.

This really is a different and separate discussion though; one could
also argue that  "nil =~ Date.new" should raise an error! Changing the
semantics for <=> at this point would require a really compelling
argument, since all existing Ruby code calling <=> or defining <=> has
been written with the assumption that <=> should return nil when
arguments can't be compared.

My proposition doesn't change the semantics of <=>, rather it makes it
is more respectful of it.


Marc-Andr
Posted by Eero Saynatkari (roo_p)
on 2009-06-29 17:57
(Received via mailing list)
Excerpts from Marc-Andre's message of Mon Jun 29 15:35:52 +0300 2009:
> This really is a different and separate discussion though; one could
> also argue that  "nil =~ Date.new" should raise an error! Changing the
> semantics for <=> at this point would require a really compelling
> argument, since all existing Ruby code calling <=> or defining <=> has
> been written with the assumption that <=> should return nil when
> arguments can't be compared.

I disagree. Because there are currently no unified semantics
for #<=>, you are equally likely to find someone who relies
on a TypeError (or NoMethodError) being raised on an invalid
conversion. I think either change will require some people
to change their code, making the argument moot in terms of
which solution is "better."

(And indirectly, I think e.g. Array checks for a nil return
value and raises if it gets one.)

Is there a compelling case where not being able to compare
two objects is a valid expectation, and could not be better
addressed by actually implementing the comparison or some
other change in logic? On the flipside of that, there seems
to be potential for masking problems by evaluating to nil,
mostly in cases where #<=> is being used implicitly.

> My proposition doesn't change the semantics of <=>, rather it makes it
> is more respectful of it.

See above.


Eero
Posted by Nobuyoshi Nakada (nobu)
on 2009-09-05 00:00
(Received via mailing list)
Hi,

At Sun, 28 Jun 2009 10:48:45 +0900,
Marc-Andre Lafortune wrote in [ruby-core:24063]:
> Other comparison operators like <, <=, >=, > would also gain
> in consistency if they were defined in terms of <=>. This
> way, 0 < nil and nil > 0 would raise the same errors instead
> of errors of different types. This is secondary to the main
> question: is it better to define Object#<=> or not?
> My vote is on 'yes'.

A patch for it.


Index: object.c
===================================================================
--- object.c  (revision 24752)
+++ object.c  (working copy)
@@ -32,5 +32,5 @@ VALUE rb_cTrueClass;
 VALUE rb_cFalseClass;

-static ID id_eq, id_eql, id_match, id_inspect, id_init_copy;
+static ID id_eq, id_eql, id_match, id_inspect, id_init_copy, id_cmp;

 /*
@@ -1116,4 +1116,14 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 }

+/* :nodoc: */
+static VALUE
+rb_obj_cmp(VALUE obj, VALUE arg)
+{
+    if (!rb_respond_to(arg, id_cmp)) {
+  rb_notimplement();
+    }
+    return Qnil;
+}
+

 /***********************************************************************
@@ -2518,4 +2528,5 @@ Init_Object(void)
     rb_define_method(rb_mKernel, "=~", rb_obj_match, 1);
     rb_define_method(rb_mKernel, "!~", rb_obj_not_match, 1);
+    rb_define_method(rb_mKernel, "<=>", rb_obj_cmp, 1);
     rb_define_method(rb_mKernel, "eql?", rb_obj_equal, 1);
     rb_define_method(rb_mKernel, "hash", rb_obj_hash, 0);
@@ -2659,4 +2670,5 @@ Init_Object(void)
     id_inspect = rb_intern("inspect");
     id_init_copy = rb_intern("initialize_copy");
+    id_cmp = rb_intern("<=>");

     for (i=0; conv_method_names[i].method; i++) {
Posted by Marc-Andre Lafortune (Guest)
on 2009-10-13 04:41
(Received via mailing list)
Issue #1697 has been updated by Marc-Andre Lafortune.


I would modify slightly Nobu's patch to return 0 if the two compared 
objects are one and the same. I also think it is probably a better 
choice to return nil instead of raising an error.

With this patch in, running 'make test' doesn't generate any error. 
'make test-all' generates two, on for rake because Rake::FileList has a 
weak way to find which methods are proper to Array, and another error 
because of Delegator, which doesn't inherit from BasicObject (see issue 
1333) and doesn't list :<=> in its list of methods to undefine. So both 
of these are trivially fixed.

Although I really would like an implementation of a good double dispatch 
scheme, I'll leave that discussion for another time.


diff --git a/object.c b/object.c
index e81007b..fdf1ee6 100644
--- a/object.c
+++ b/object.c
@@ -1115,13 +1115,23 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
     return RTEST(result) ? Qfalse : Qtrue;
 }

-/* :nodoc: */
+/*
+ *  call-seq:
+ *     obj <=> other       => -1, 0, 1 or nil
+ *
+ *  Comparison---Returns -1 if <i>obj</i> is smaller than <i>other</i>,
+ *  0 if <i>obj</i> is the same as <i>other</i>, and +1 if <i>obj</i> 
is
+ *  greater than <i>other</i>. Returns <code>nil</code> if <i>obj</i>
+ *  can not be compared with <i>other</i>.
+ *  Typically, this method is overridden in descendent
+ *  classes to provide class-specific meaning.
+ */
+
 static VALUE
 rb_obj_cmp(VALUE obj, VALUE arg)
 {
-    if (!rb_respond_to(arg, id_cmp)) {
-       rb_notimplement();
-    }
+    if (obj == arg)
+       return INT2FIX(0);
     return Qnil;
 }

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by Yukihiro Matsumoto (Guest)
on 2009-10-24 18:57
(Received via mailing list)
Issue #1697 has been updated by Yukihiro Matsumoto.

Status changed from Open to Closed
% Done changed from 0 to 100

This issue was solved with changeset r25453.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by Marc-Andre Lafortune (Guest)
on 2009-10-26 13:14
(Received via mailing list)
Issue #1697 has been updated by Marc-Andre Lafortune.

Status changed from Closed to Open

Hi Matz,

In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the 
following?

diff --git a/object.c b/object.c
index 9f7537f..8f20621 100644
--- a/object.c
+++ b/object.c
@@ -1123,7 +1123,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 static VALUE
 rb_obj_cmp(VALUE obj1, VALUE obj2)
 {
-    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
+    if (rb_equal(obj1, obj2))
        return INT2FIX(0);
     return Qnil;
 }

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by Yukihiro Matsumoto (Guest)
on 2009-10-26 14:28
(Received via mailing list)
Hi,

In message "Re: [ruby-core:26318] [Feature #1697](Open) Object#<=>"
    on Mon, 26 Oct 2009 21:14:08 +0900, Marc-Andre Lafortune 
<redmine@ruby-lang.org> writes:

|In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the following?

It's a grain of optimization to reduce function call.

              matz.
Posted by Eero Saynatkari (roo_p)
on 2009-10-26 14:49
(Received via mailing list)
Excerpts from Yukihiro Matsumoto's message of Mon Oct 26 15:24:12 +0200 
2009:
> Hi,
> 
> In message "Re: [ruby-core:26318] [Feature #1697](Open) Object#<=>"
>     on Mon, 26 Oct 2009 21:14:08 +0900, Marc-Andre Lafortune <redmine@ruby-lang.org> writes:
> 
> |In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the following?
> 
> It's a grain of optimization to reduce function call.

rb_obj_equal() is implemented just as ==, though, so it is
redundant unless I am missing something? I would assume that
the intended function was rb_eql() instead?

Marc-Andre's rb_equal() suggestion combines the two, although
it uses id_eq, not id_eql, so I can see where avoiding that
would be slightly better in performance.


Wishing there were fewer equality functions
or at least more variance in their names,
                                            Eero
Posted by Marc-Andre Lafortune (Guest)
on 2009-10-26 17:23
(Received via mailing list)
Issue #1697 has been updated by Marc-Andre Lafortune.


Given your comment, and Eero rightly pointing out that it's better to 
use eql? than == for this case,

diff --git a/object.c b/object.c
index 9f7537f..8f20621 100644
--- a/object.c
+++ b/object.c
@@ -1123,7 +1123,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 static VALUE
 rb_obj_cmp(VALUE obj1, VALUE obj2)
 {
-    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
+    if (obj1 == obj2 || rb_eql(obj1, obj2))
       return INT2FIX(0);
    return Qnil;
 }


Note: I'm not saying that obj1 == obj2 is redundant. I'm saying 
rb_obj_equal is.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by ujihisa . (Guest)
on 2009-12-25 18:22
(Received via mailing list)
Issue #1697 has been updated by ujihisa ..

Status changed from Open to Assigned

As Marc-Andre says, the current following code is redundant.

    static VALUE
    rb_obj_cmp(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
            return INT2FIX(0);
        return Qnil;
    }

where

    VALUE
    rb_obj_equal(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2) return Qtrue;
        return Qfalse;
    }

in short, the original code means

    static VALUE
    rb_obj_cmp(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2 || ((obj1 == obj2) ? Qtrue : Qfalse))
            return INT2FIX(0);
        return Qnil;
    }

As Marc-Andre says, there are two alternatives which are better than 
current code. If you prefer to allow Ruby to change the behavior of 
`Object#<=>` by overwriting `Object#==`, apply the patch he showed. 
(Current RubySpec is based on it)

Otherwise, how about applying this patch:

    diff --git a/object.c b/object.c
    index 503a7c5..3dd3290 100644
    --- a/object.c
    +++ b/object.c
    @@ -1131,7 +1131,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
     static VALUE
     rb_obj_cmp(VALUE obj1, VALUE obj2)
     {
    -    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
    +    if (obj1 == obj2)
              return INT2FIX(0);
         return Qnil;
     }

I'll fix the corresponding RubySpec after you decided which behavior is 
preferable.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/1697
Posted by Yukihiro Matsumoto (Guest)
on 2009-12-25 18:45
(Received via mailing list)
Hi,

In message "Re: [ruby-core:27300] [Feature #1697](Assigned) Object#<=>"
    on Thu, 24 Dec 2009 07:15:13 +0900, "ujihisa ." 
<redmine@ruby-lang.org> writes:
|
|Issue #1697 has been updated by ujihisa ..
|
|Status changed from Open to Assigned
|
|As Marc-Andre says, the current following code is redundant.
|
|    static VALUE
|    rb_obj_cmp(VALUE obj1, VALUE obj2)
|    {
|        if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
|            return INT2FIX(0);
|        return Qnil;
|    }

It should be:

|        if (obj1 == obj2 || rb_equal(obj1, obj2))

              matz.
Please log in before posting. Registration is free and takes only a minute.
Existing account (Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
No account? Register here.