What is String#ord?

Ruby 1.9 docs for String#ord say:

Return the Integer ordinal of a one-character string.

What does that mean? Consider, for example:

"×".ord # => 215
"×".bytes.to_a # => [195, 151]

– fxn

On Sat, Apr 17, 2010 at 5:35 PM, Xavier N. [email protected] wrote:


Trial and error suggests it is the character's code in the string's
encoding:

euro = "\u20AC"

euro.ord.to_s(16) # => "20ac"
euro.encode("iso-8859-15").ord.to_s(16) # => "a4"
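A slightly longer sketch of the same experiment, in case it helps. The choice of Windows-1252 as a third encoding and the variable names are mine, not from the docs; the point is only that the same character yields a different ordinal in each encoding, and that Integer#chr with an encoding argument takes you back.

```ruby
euro = "\u20AC"  # EURO SIGN; the \u escape makes this a UTF-8 string

# Same character in three encodings (the list is an arbitrary illustration).
ords = {}
["UTF-8", "ISO-8859-15", "Windows-1252"].each do |name|
  encoded = euro.encode(name)
  ords[name] = encoded.ord
  puts "#{name}: ord=0x#{encoded.ord.to_s(16)} bytes=#{encoded.bytes.to_a.inspect}"
end

# Integer#chr with an encoding argument goes the other way:
# from a code in that encoding back to a one-character string.
round_trip = ords["ISO-8859-15"].chr("ISO-8859-15")
puts round_trip.encode("UTF-8") == euro
```

In the single-byte encodings the ordinal is just the byte value, whereas in UTF-8 it is the Unicode code point, which is why it does not match any of the individual bytes.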

The source code suggests the same:

VALUE
rb_str_ord(VALUE s)
{
    unsigned int c;

    c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
    return UINT2NUM(c);
}

On 17 April 2010 18:41, Xavier N. [email protected] wrote:

VALUE
rb_str_ord(VALUE s)
{
    unsigned int c;

    c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
    return UINT2NUM(c);
}

p "×".ord # => 215
p "×".bytes.to_a # => [195, 151]
p "×".encoding # => #<Encoding:UTF-8>
p "×".codepoints.to_a # => [215]

In UTF-8 (and Unicode encodings in general), a single byte is not always
a character, and in some encodings it never is.
A codepoint is what represents a character ;)

So, you can think of ord as codepoints.first, and that number of course
depends on the String's Encoding.
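A minimal sketch of that equivalence, for what it's worth. The variable names are only illustrative, and the character is written as an escape so the snippet does not depend on the source file's encoding. It also shows two edge cases the one-line doc does not mention: ord only looks at the first character of a longer string, and it raises on an empty one.

```ruby
times = "\u00D7"  # MULTIPLICATION SIGN as a UTF-8 string

first_codepoint = times.codepoints.first  # first code point of the string
ordinal         = times.ord               # same number, via ord
byte_values     = times.bytes.to_a        # the two UTF-8 bytes behind it

puts ordinal == first_codepoint
puts byte_values.inspect

# On a longer string, ord still only reports the first character.
longer_ord = (times + "abc").ord

# An empty string has no first character, so ord raises ArgumentError.
empty_error = begin
  "".ord
  nil
rescue ArgumentError => e
  e.class
end
```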

Regards,
B.D.

Yes of course, a posteriori that’s the only thing that makes sense. I
was in a different context and the doc was not clear enough for me.

Perhaps I'll send a patch to define #ord in terms of the code point
in the string's character encoding, instead of that bare “ordinal”.