Saving UTF-8 characters to YAML problem

Hi there,

I’m trying to save multilingual data to YAML with the following simple
program:

require 'yaml'

# show "James B. 007: Nightfire in Chinese"
# text in YAML form
puts '詹姆斯邦德007：暗夜之火'.to_yaml

and what I get is:
--- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
\xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"

How can I make this come out as human-readable text?

Any help much appreciated.
Thanks in advance!

Paweł Radecki wrote:
| Hi there,
|
| I’m trying to save multinational data to YAML with following simple
| program:
|
| require 'yaml'
|
| # show "James B. 007: Nightfire in Chinese"
| # text in YAML form
| puts '詹姆斯邦德007：暗夜之火'.to_yaml
|
| and what I get is:
| --- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
| \xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"
|
| How can I make this to be human readable text?

With an editor that understands UTF-8, I guess (considering that you use
characters I know from Polish, you probably already have one). The OS has to
support UTF-8, too (I think that Windows does, and so should Mac OS X;
I’m not sure about other *NIX flavors).

If you mean within Ruby, Iconv and Kconv are the way to go, AFAIK, to
convert strings between character sets.


Phillip G.
Twitter: twitter.com/cynicalryan

Rule of Open-Source Programming #8:

Open-Source is not a panacea.

On Tue, Apr 8, 2008 at 12:25 AM, Paweł Radecki wrote:

and what I get is:
--- "\xE8\xA9\xB9\xE5\xA7\x86\xE6\x96\xAF\xE9\x82\xA6\xE5\xBE
\xB7007\xEF\xBC\x9A\xE6\x9A\x97\xE5\xA4\x9C\xE4\xB9\x8B\xE7\x81\xAB"

How can I make this to be human readable text?

Install the ya2yaml gem and use "詹姆斯邦德007：暗夜之火".ya2yaml instead of
.to_yaml.
The problem with the YAML library that comes with Ruby is that it doesn’t
have good Unicode support, so any “strange” (non-ASCII) text is saved as
binary, i.e. as escaped bytes like the output quoted above. That behaviour
has existed since around 1.8.5.
You don’t have to change the way your YAML is read, but you do have to
use ya2yaml for serializing.

^ manveru
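
To see that the escaped output above is not corrupted, only unreadable, the small sketch below prints the UTF-8 bytes of the original string in the same \xNN notation. It assumes the script is saved as UTF-8; escape_bytes is a made-up helper name for this illustration, not part of yaml or ya2yaml.

```ruby
# encoding: utf-8
# Assumption: this file is saved as UTF-8.
text = '詹姆斯邦德007：暗夜之火'

# Hypothetical helper: render every non-ASCII byte as \xNN,
# mimicking the notation seen in the escaped YAML output.
def escape_bytes(str)
  str.bytes.map { |b| b < 0x80 ? b.chr : format('\\x%02X', b) }.join
end

puts escape_bytes('詹')  # the first character only
puts escape_bytes(text)  # the whole title
```

The three escapes printed for the first character match the start of the escaped YAML, which suggests the data is intact and only the presentation differs.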

Paweł Radecki wrote:

[...]
Linux: While running above program I get:
"./yaml_test.rb:4:in `require': no such file to load -- ya2yaml
(LoadError) from ./yaml_test.rb:4"

Any clue how to work around this?
Any help much appreciated!


Paweł Radecki
w: http://radeckimarch.blogspot.com/

Use

require 'rubygems'

before

require 'ya2yaml'

-Justin
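
A minimal sketch of that load order, for anyone who wants the script to keep running on a machine where the gem is missing. The constant YA2YAML_LOADED and the helper dump_yaml are names invented for this example; the fallback to to_yaml is an assumption about what you would want, not something ya2yaml itself provides.

```ruby
# encoding: utf-8
require 'rubygems' # needed on Ruby 1.8 to find gems; already loaded on 1.9+
require 'yaml'

# Try to load ya2yaml and remember whether that worked.
begin
  require 'ya2yaml'
  YA2YAML_LOADED = true
rescue LoadError
  YA2YAML_LOADED = false
end

# Hypothetical helper: use ya2yaml when available, plain to_yaml otherwise.
def dump_yaml(obj)
  YA2YAML_LOADED ? obj.ya2yaml : obj.to_yaml
end

text = '詹姆斯邦德007：暗夜之火'
yaml = dump_yaml(text)
puts yaml
```

Either way the result is read back with the ordinary YAML.load, in line with the earlier remark that only serializing needs to change.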

It worked. Thanks, guys! You saved me hours!

install the ya2yaml gem and use "詹姆斯邦德007：暗夜之火".ya2yaml instead of .to_yaml

I really hoped this would work, but it didn’t work 100%.

Here is what I did:
I installed the ya2yaml gem using the “gem install ya2yaml” command (Windows
box) and “sudo gem install ya2yaml” (Linux box). Got: ya2yaml-0.26.

Then I modified my simple program to be:
#!/usr/bin/env ruby

require 'yaml'
require 'ya2yaml'
require 'jcode'
$KCODE = 'u'

# show "James B. 007: Nightfire" text
# in Chinese in YAML form
puts '詹姆斯邦德007：暗夜之火'.ya2yaml

and ran it both on my Windows and Linux boxes.

Windows: When I redirect the output to a file (running "yaml_test.rb >
test.yaml") it works perfectly, but when I try to display the output on
screen (running "yaml_test.rb") with code page 65001 active (set
through "chcp 65001" on the command line) I see
(squares)007(squares) and an error message: "in 'write' Bad file
descriptor (Errno::EBADF)".

Linux: While running the above program I get:
"./yaml_test.rb:4:in `require': no such file to load -- ya2yaml
(LoadError) from ./yaml_test.rb:4"

Any clue how to work around this?
Any help much appreciated!
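
Since redirecting to a file worked on Windows while the console did not, one possible workaround is to have the script write the file itself and skip the console. The sketch below assumes Ruby 1.9 or later (for the 'w:UTF-8' mode string) and uses a made-up temporary path. It calls to_yaml so that it runs without extra gems; on Ruby 1.8 you would call .ya2yaml as in the program above.

```ruby
# encoding: utf-8
require 'yaml'
require 'tmpdir'

text = '詹姆斯邦德007：暗夜之火'

# Hypothetical output location; any writable path would do.
path = File.join(Dir.tmpdir, 'yaml_test_output.yaml')

# Open the file with an explicit UTF-8 external encoding (Ruby 1.9+).
File.open(path, 'w:UTF-8') { |f| f.write(text.to_yaml) }

# Read the file back as UTF-8 and parse it again.
loaded = YAML.load(File.read(path, encoding: 'UTF-8'))
puts loaded == text
```

This only avoids the console; it does not explain the Errno::EBADF error seen under code page 65001.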

Here’s a patch for “rake extract_fixtures” that also uses ya2yaml:
http://fukamachi.org/wp/2007/05/18/rails-dump-database-to-fixtures-preserving-utf8/
Saving YAML as UTF-8 should really be default behavior. UnicodeDammit!