Ruby 1.9 docs for String#ord say:
Return the <code>Integer</code> ordinal of a one-character string.
What does that mean? Check for example
"×".ord # => 215
"×".bytes.to_a # => [195, 151]
– fxn
On Sat, Apr 17, 2010 at 5:35 PM, Xavier N. wrote:
Ruby 1.9 docs for String#ord say:
  Return the Integer ordinal of a one-character string.
What does that mean? Check for example
  "×".ord # => 215
  "×".bytes.to_a # => [195, 151]
Trial and error suggests it is the code of the character in the
encoding of the string:
euro = "\u20AC"
euro.ord.to_s(16) # => "20ac"
euro.encode("iso-8859-15").ord.to_s(16) # => "a4"
That is what the source code suggests also:
VALUE
rb_str_ord(VALUE s)
{
    unsigned int c;

    c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
    return UINT2NUM(c);
}
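The trial-and-error observation above can be reproduced with a short script. This is only a sketch: the variable names are made up for illustration, and it assumes a Ruby build that ships the ISO-8859-15 transcoder (standard builds do).

```ruby
# A string literal containing a \u escape is UTF-8 encoded.
euro   = "\u20AC"                     # EURO SIGN
latin9 = euro.encode("ISO-8859-15")   # same character, transcoded

utf8_ord   = euro.ord     # code of the character in UTF-8 (the Unicode codepoint)
latin9_ord = latin9.ord   # code of the same character in ISO-8859-15

puts utf8_ord.to_s(16)     # 20ac
puts latin9_ord.to_s(16)   # a4

# The underlying bytes differ as well: three bytes in UTF-8, one in ISO-8859-15.
utf8_bytes   = euro.bytes.to_a
latin9_bytes = latin9.bytes.to_a
puts utf8_bytes.inspect    # [226, 130, 172]
puts latin9_bytes.inspect  # [164]
```

So the same character yields a different #ord depending on the encoding the string carries.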
On 17 April 2010 18:41, Xavier N. wrote:
VALUE
rb_str_ord(VALUE s)
{
    unsigned int c;

    c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
    return UINT2NUM(c);
}
p "×".ord # => 215
p "×".bytes.to_a # => [195, 151]
p "×".encoding # => #<Encoding:UTF-8>
p "×".codepoints.to_a # => [215]
In UTF-8 (and in Unicode encodings in general), one byte is not always
(or, in some encodings, ever) a character: a single character may take
several bytes.
A codepoint represents a character.
So you can think of ord as codepoints[0], and that number of course
depends on the String’s encoding.
Regards,
B.D.
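A small sketch of the "ord as codepoints[0]" reading, for anyone who wants to check it. The variable names are illustrative only, and the empty-string behaviour at the end is something to verify on your own Ruby version rather than a documented guarantee of this thread.

```ruby
s = "\u00D7abc"   # "×abc": MULTIPLICATION SIGN followed by three ASCII letters

first_ord       = s.ord               # codepoint of the first character
first_codepoint = s.codepoints.first  # same number, taken from the codepoint list
first_byte      = s.bytes.first       # only the first byte of the UTF-8 sequence

puts first_ord        # 215
puts first_codepoint  # 215
puts first_byte       # 195

# Character count and byte count differ because "×" takes two bytes in UTF-8.
puts s.length         # 4
puts s.bytesize       # 5

# The analogy breaks down for an empty string:
# codepoints.first gives nil, while ord raises ArgumentError.
empty_first = "".codepoints.first
empty_ord = begin
  "".ord
rescue ArgumentError
  :argument_error
end
puts empty_first.inspect  # nil
puts empty_ord.inspect    # :argument_error
```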
Yes of course, a posteriori that’s the only thing that makes sense. I
was in a different context and the doc was not clear enough for me.
Perhaps I’ll send a patch to define #ord in terms of the code/codepoint
in the string’s character encoding, instead of that bare “ordinal”.