to_yaml in UTF-8 encoding

Hello, can you please help me with this tricky problem:
Is there a way to force UTF-8 encoding in to_yaml on Ruby 1.9?

require 'yaml'

p "alex".to_yaml # "--- alex\n"
p "Алекс".to_yaml # "--- "\xD0\x90\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x81"\n"

P.S.
Maybe I'm missing something or don't understand, but I can't see why we
should still have to bother about encodings these days.

Why is the encoding situation in Ruby 1.9 so strange? Is it really
impossible to set UTF-8 as the default (other than via a command-line
option)? All those tricky defaults and conversions ...

According to TIOBE, the Java user base, installation base and community
are about 30 times larger than Ruby's. It's a universal language that you
can apply to almost anything, and the only encoding that exists
there is UTF. In 99.9% of cases you use UTF.

So why does Ruby add such complexity? What's the point? Why would you
ever want to go with ASCII-8BIT?

The yaml docs for ruby 1.9.2 do not list a to_yaml() method. And I
get this:

puts RUBY_VERSION
p "alex".to_yaml

--output:--
1.9.2
prog.rb:2:in `<main>': undefined method `to_yaml' for "alex":String
(NoMethodError)

To encode all strings in your source file in UTF-8, put this line at the
top of your program:

# encoding: utf-8

As for whether yaml will round trip them correctly, I don’t know.
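A minimal sketch of what the magic comment does, assuming the file itself is saved as UTF-8 (the variable name is only illustrative). String literals pick up the declared source encoding, which can be checked with String#encoding. Note that the comment only takes effect on the first line of the file (or the second, after a shebang line).

```ruby
# encoding: utf-8
# The magic comment above declares the source encoding for this file.

name = "Алекс"

puts __ENCODING__    # source encoding of this file
puts name.encoding   # encoding tagged on the string literal
puts name.length     # length in characters
puts name.bytesize   # length in bytes
```

The character count and the byte count differ because each Cyrillic letter takes two bytes in UTF-8, which is why the escaped form in the question shows ten \x.. pairs for five letters.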

A round trip through yaml works for UTF-8 encoded strings in ruby 1.8.7.

In the program below, my original string contains a euro symbol, and
after a roundtrip through yaml, I get the same string back.

# encoding: UTF-8

require 'yaml'

puts RUBY_VERSION
str = "a Euro symbol: €"

File.open('data.db', 'w') do |f|
  YAML::dump(str, f)
end

File.open('data.db', 'r') do |f|
  puts YAML::load(f)
end


contents of data.db:
--- "a \xE2\x82\xAC symbol"

--output:--
1.8.7
a Euro symbol: €
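For comparison, here is a sketch of the same round trip on a Ruby where the yaml library is backed by Psych (1.9.3 and later, as far as I know), not the 1.8.7 setup shown above. The temporary directory and the data.yml file name are assumptions made so the example cleans up after itself.

```ruby
require 'yaml'
require 'tmpdir'

str = "a Euro symbol: €"
raw = nil
loaded = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'data.yml')   # hypothetical file name

  # Write the YAML document, tagging the file as UTF-8.
  File.open(path, 'w:UTF-8') { |f| YAML.dump(str, f) }

  # Read the raw text back, then parse it.
  raw = File.read(path, encoding: 'UTF-8')
  loaded = YAML.load(raw)
end

puts raw              # the file contents as a human would see them
puts loaded == str    # whether the round trip preserved the string
```

The point of printing raw is to look at the serialized form itself, since a successful load alone says nothing about whether the file is readable by hand.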

Thanks for your help,

prog.rb:2:in `<main>': undefined method `to_yaml' for "alex":String

Strange, but maybe the "require 'yaml'" statement is missing?

To encode all strings in your source file in UTF-8, put this line at the
top of your program:

# encoding: utf-8

Yes, I've heard of this. You can also use this shortcut to do that:

export RUBYOPT="-Ku -rrubygems"
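Besides the environment variable, the default encodings can also be set from inside the program. This is only a sketch of that alternative, not something from the posts above. It affects how I/O is decoded and transcoded, not the encoding of string literals, which still comes from the magic comment.

```ruby
# Set the defaults used when reading external data (files, sockets)
# and when transcoding it for use inside the program.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

puts Encoding.default_external
puts Encoding.default_internal
```

These assignments should run early, before any files are opened, because streams that are already open keep the encodings they were created with.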

A round trip through yaml works for UTF-8 encoded strings in ruby 1.8.7.

In the program below, my original string contains a euro symbol, and
after a roundtrip through yaml, I get the same string back.

Sorry, I don't understand. You showed that the cryptic YAML output
can be loaded back into a normal object.
But in my situation that's not enough: I need not only to load it back
later, it should also be readable in its marshalled form.

I need to modify those *.yaml configs by hand, so cryptic stuff
like "\xE2\x82\xAC" in *.yaml files is unacceptable.

Strange, Iconv says that the 'ASCII-8BIT' encoding is unknown:

require 'iconv'
Iconv.conv('ASCII-8BIT', 'UTF-8',
"\xD0\x90\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x81")

test.rb:29:in `conv': invalid encoding ("ASCII-8BIT", "UTF-8") (Iconv::InvalidEncoding)
from test.rb:29:in
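The error is most likely because ASCII-8BIT is Ruby's own label for untyped binary data, not a character set that iconv knows about, so there is nothing to convert. If the bytes are already valid UTF-8, they can be re-tagged rather than transcoded. A sketch, assuming Ruby 2.0 or later for String#b:

```ruby
# Raw bytes tagged as binary (ASCII-8BIT), as they might arrive from a
# source that carries no encoding information.
bytes = "\xD0\x90\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x81".b

# Re-tag a copy as UTF-8. No bytes change, only the encoding label.
text = bytes.dup.force_encoding(Encoding::UTF_8)

puts bytes.encoding
puts text.encoding
puts text.valid_encoding?   # whether the bytes really form valid UTF-8
puts text
```

force_encoding does no validation by itself, so checking valid_encoding? afterwards is a cheap way to catch data that was not UTF-8 to begin with.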

Solved, here's the solution:

$ brew install libyaml

require 'psych'
require 'yaml'

As far as I understand, the default YAML engine 'syck' is broken and
will eventually be replaced by psych in future Ruby versions.
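To see the effect of the solution above, here is a sketch of what the output looks like once Psych is the engine. The hash contents are made up for illustration, and the example assumes a Ruby where require 'yaml' already loads Psych (1.9.3 and later, as far as I know), so the explicit require 'psych' is left out.

```ruby
require 'yaml'

# A hypothetical config with non-ASCII values, all UTF-8 strings.
config = { "name" => "Алекс", "price" => "5 €" }

yaml = config.to_yaml
puts yaml                       # text that can be edited by hand

restored = YAML.load(yaml)
puts restored == config         # whether the round trip preserved the data

# A string tagged as binary is a different case: it is not written as text.
binary_yaml = "Алекс".b.to_yaml
puts binary_yaml
```

So the encoding tag on the string matters: if a value still comes out unreadable under Psych, it is worth checking whether it is tagged ASCII-8BIT instead of UTF-8.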

7stud -- wrote in post #991873:

contents of data.db:
--- "a \xE2\x82\xAC symbol"

I changed the string, so the contents of data.db are actually:

--- "a Euro symbol: \xE2\x82\xAC"

On Apr 9, 8:57am, Alexey P. wrote:

Posted via http://www.ruby-forum.com/.

Try ya2yaml
