Forum: Ruby-core [ruby-trunk - Bug #7342][Open] String#<=> checks for a #to_str method on other but never uses it?

Posted by Joshua Ballanco (jballanc)
on 2012-11-13 03:24
(Received via mailing list)
Issue #7342 has been reported by jballanc (Joshua Ballanco).

----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by Joshua Ballanco (jballanc)
on 2012-11-14 03:53
(Received via mailing list)
Issue #7342 has been updated by jballanc (Joshua Ballanco).


I would expect something like the following patch makes more sense?

diff --git a/string.c b/string.c
index c63f59a..c9eed27 100644
--- a/string.c
+++ b/string.c
@@ -2385,8 +2385,12 @@ rb_str_cmp_m(VALUE str1, VALUE str2)
     long result;

     if (!RB_TYPE_P(str2, T_STRING)) {
-  if (!rb_respond_to(str2, rb_intern("to_str"))) {
-      return Qnil;
+  if (rb_respond_to(str2, rb_intern("to_str"))) {
+            VALUE tmp = rb_funcall(str2, rb_intern("to_str"), 0);
+            if (!RB_TYPE_P(tmp, T_STRING)) {
+              return Qnil;
+            }
+            result = rb_str_cmp(str1, tmp);
   }
   else if (!rb_respond_to(str2, rb_intern("<=>"))) {
       return Qnil;

----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-32889

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by bitsweat (Jeremy Kemper) (Guest)
on 2012-11-14 04:40
(Received via mailing list)
Issue #7342 has been updated by bitsweat (Jeremy Kemper).


When an object responds to #to_str, it's saying "I am a string." When an 
object responds to #to_s, it's saying "I have a string representation."

So checking for #to_str here is enough to know whether str2 is a string 
and can be compared.

For more background on implicit vs explicit coercion in Ruby: 
http://briancarper.net/blog/98/
----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-32891

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by Joshua Ballanco (jballanc)
on 2012-11-14 10:35
(Received via mailing list)
Issue #7342 has been updated by jballanc (Joshua Ballanco).


As the page you linked points out, #to_str is an *implicit* cast. i.e. 
It should be used internally to retrieve the string representation of an 
object. I think this is in keeping with all other uses of #to_str in 
Ruby source.

Another thing to note is that currently in Ruby if you have an object 
that provides #to_str but *not* #<=>, then it **cannot** be compared to 
a string object.

class Foo
  def to_str
    "my string"
  end
end

"test" < Foo.new #=> ArgumentError: comparison of String with Foo failed
----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-32897

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by bitsweat (Jeremy Kemper) (Guest)
on 2012-11-14 23:57
(Received via mailing list)
Issue #7342 has been updated by bitsweat (Jeremy Kemper).


"It should be used internally to retrieve the string representation of 
an object." That's explicit coercion. Implicit coercion with #to_str 
means the object acts a string and the method needn't be called.

This method is used for more than its return value. It's in a strange 
limbo world between Ruby and the C API :)

The presence of #to_str indicates that the object obeys an entire String 
contract such that the C API can work with the object without making 
Ruby method calls. You note correctly that providing #to_str but not 
#<=> prohibits comparison. That's because by omitting #<=> you've 
already broken the "I am a string" contract.

Check out how time.c for another example of checking #to_str and, more 
generally, see rb_check_convert_type for many other examples of implicit 
coercion in practice: to_path, to_int, to_ary, etc.
----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-32905

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by Nobuyoshi Nakada (nobu)
on 2012-11-20 06:02
(Received via mailing list)
Issue #7342 has been updated by nobu (Nobuyoshi Nakada).


jballanc (Joshua Ballanco) wrote:
> I would expect something like the following patch makes more sense?

You can use rb_check_funcall().
----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-33158

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by Joshua Ballanco (jballanc)
on 2012-11-21 11:54
(Received via mailing list)
Issue #7342 has been updated by jballanc (Joshua Ballanco).

File string_cmp.diff added

> The presence of #to_str indicates that the object obeys an entire String 
contract such that the C API can work with the object without making Ruby method 
calls.

Hmm... I was always under the impression that the distinction between 
#to_s and #to_str is that #to_s provides a (potentially lossy) string 
representation of any object, but #to_str will return a "string 
equivalent" of the object. As for the C API, the `rb_str_to_str` method 
does call #to_str if v#to_str exists and v is not already a string. I 
guess it would be good to get some clarification on this issue.

> You can use rb_check_funcall().

Thank you for the pointer, nobu! Actually, in looking at the 
implementation of String#<=> again I found some other oddities. For 
example, if Other#to_str is defined and Other#<=> returns a float, then 
`"a string" <=> Other.new` will return a float. I feel like this breaks 
the contract of #<=> as it should only ever return 1, 0, or -1. Anyhow, 
I've attached an updated patch that also includes some test fixes.

(Note: all tests in `make test-all` that passed before this patch pass 
after, however rubyspec will need to be updated. I will send a 
pull-request directly to the rubyspec project if this gets accepted.)
----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-33386

Author: jballanc (Joshua Ballanco)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Posted by mame (Yusuke Endoh) (Guest)
on 2012-11-24 09:43
(Received via mailing list)
Issue #7342 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned
Assignee set to nobu (Nobuyoshi Nakada)
Target version set to 2.0.0


----------------------------------------
Bug #7342: String#<=> checks for a #to_str method on other but never 
uses it?
https://bugs.ruby-lang.org/issues/7342#change-33800

Author: jballanc (Joshua Ballanco)
Status: Assigned
Priority: Normal
Assignee: nobu (Nobuyoshi Nakada)
Category:
Target version: 2.0.0
ruby -v: 2.0.0


=begin
This isn't exactly a bug, as much as a request for clarification. I was 
looking at the semantics of the (({<=>})) operator and noticed something 
curious. For most classes, when evaluating (({thing <=> other})), if 
(({other})) is not of a compatible type, then (({nil})) is returned.

The exceptions (as far as I can find) are String and Time. For the Time 
class, if (({other})) is not a kind of (({Time})), then the reverse 
comparison (({other <=> thing})) is tried and the inverse of this result 
is returned (if not nil). For String, the reverse comparison is only 
tried IF (({other.respond_to?(:to_str)})), HOWEVER the referenced 
(({other.to_str})) method is never called. For example:

    class NotAString
      def <=>(other)
        1
      end
      def to_str
        raise "I'm not a string!"
      end
    end

    "test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class 
implemented (({to_str})), that the return value of this would be used 
for comparison.
=end
Please log in before posting. Registration is free and takes only a minute.
Existing account (Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
No account? Register here.