As near as I can tell, Ruby 1.9 handles Unicode Astral Plane characters
correctly (example: 0x10464.chr(Encoding::UTF_8)), but doesn't provide
any means to directly include such in string constants, not even using
surrogates (popular with JSON).
Python, for example, supports this with an uppercase U followed by 8 hex
characters.
u'\U00010346'
Python also recognizes surrogate characters when decoding utf-8:
u"\ud800\udf46".encode('utf-8').decode('utf-8')
- Sam Ruby
on 2007-12-28 21:46
on 2007-12-28 22:46
Sam,
The syntax you want uses curly braces: \u{10464}. Works for short
codepoints, too: \u{A3}. And even for space-separate sequences of
codepoints: \u{a3 a5 20ac} => pounds, yen, euro
David
on 2007-12-28 23:28
David Flanagan wrote: > Sam, > > The syntax you want uses curly braces: \u{10464}. Works for short > codepoints, too: \u{A3}. And even for space-separate sequences of > codepoints: \u{a3 a5 20ac} => pounds, yen, euro Thanks! - Sam Ruby
Please log in before posting. Registration is free and takes only a minute.
Existing account
(Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
Log in with Google account | Log in with Yahoo account
No account? Register here.