As near as I can tell, Ruby 1.9 handles Unicode astral plane characters
correctly (example: 0x10464.chr(Encoding::UTF_8)), but doesn't provide
any means to include them directly in string literals, not even using
surrogates (popular with JSON).
Python, for example, supports this with an uppercase U followed by 8 hex
digits:
u'\U00010346'
Python also recognizes surrogate characters when decoding utf-8:
u"\ud800\udf46".encode('utf-8').decode('utf-8')
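For reference, the arithmetic behind a surrogate pair can be sketched in Ruby. This is only an illustration: combine_surrogates is a hypothetical helper, not something Ruby ships with, and it assumes the two halves arrive as plain integers. It shows that the pair \ud800 \udf46 from the Python example stands for the same code point as \U00010346.

```ruby
# Hypothetical helper (not part of Ruby's core library): combine a UTF-16
# surrogate pair into the single code point it stands for.
def combine_surrogates(high, low)
  unless (0xD800..0xDBFF).include?(high) && (0xDC00..0xDFFF).include?(low)
    raise ArgumentError, "not a valid surrogate pair"
  end
  # The high surrogate carries the top 10 bits, the low surrogate the bottom 10.
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

code_point = combine_surrogates(0xD800, 0xDF46)
puts format("U+%04X", code_point)                 # U+10346
puts code_point.chr(Encoding::UTF_8).bytesize     # 4
```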
- Sam Ruby
on 28.12.2007 21:46
on 28.12.2007 22:46
Sam,
The syntax you want uses curly braces: \u{10464}. Works for short
codepoints, too: \u{A3}. And even for space-separated sequences of
codepoints: \u{a3 a5 20ac} => pounds, yen, euro
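A quick sketch to try this out, assuming Ruby 1.9 or later (the variable names are only for illustration). Note that the escape is only interpreted inside double-quoted strings, not single-quoted ones:

```ruby
astral = "\u{10464}"        # one code point outside the BMP
pound  = "\u{A3}"           # short code points work as well
money  = "\u{a3 a5 20ac}"   # space-separated list => three characters

puts astral.encoding                          # UTF-8
puts astral.length                            # 1
puts astral.bytesize                          # 4
puts astral == 0x10464.chr(Encoding::UTF_8)   # true
puts money.codepoints.map { |c| c.to_s(16) }.join(" ")   # a3 a5 20ac
```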
David
on 28.12.2007 23:28
David Flanagan wrote:
> Sam,
>
> The syntax you want uses curly braces: \u{10464}. Works for short
> codepoints, too: \u{A3}. And even for space-separated sequences of
> codepoints: \u{a3 a5 20ac} => pounds, yen, euro
Thanks!
- Sam Ruby