Encoding woes with command prompt output

I'm working on a script that will print some non-ASCII
characters (æøå), and for the life of me, I just can't seem to make it
print correctly in the Windows Command Prompt! I think I have done what
should be enough to make it work, declaring the encoding in my Ruby file,
but it just isn't working. However, if I print the same
characters in irb, it works just fine. I can make it work by embedding
hex escapes in my strings, but I'd rather not have to do that, as that
code isn't very readable. I'm at a loss here, and would be grateful if
anyone can help me out with my encoding woes!

OS: Windows XP 32 bit SP3

C:>ruby -v
ruby 1.9.2p180 (2011-02-18) [i386-mingw32]

Content in file test.rb:

# Encoding: CP850

puts "æøå"
puts "\x91\x9B\x86"
puts Encoding.default_external

C:>ruby test.rb
├ª├©├Ñ
æøå
CP850

C:>irb
irb(main):001:0> puts "æøå"
æøå
=> nil
irb(main):002:0> puts "\x91\x9B\x86"
æøå
=> nil
irb(main):003:0> puts Encoding.default_external
CP850

Regards,
Chris

Hi,

2011/12/15 Chris L.:

OS: Windows XP 32 bit SP3

It seems that the actual encoding of test.rb is not CP850 but UTF-8.

The string “æøå” is encoded as “\xC3\xA6\xC3\xB8\xC3\xA5” in UTF-8.

C:>ruby test.rb
├ª├©├Ñ
æøå
CP850

The string “├ª├©├Ñ” is “\xC3\xA6\xC3\xB8\xC3\xA5” in CP850.
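The byte-level explanation above can be checked with a short sketch. It assumes the script is saved as UTF-8; the helper names hex_bytes and misread_as are made up for this illustration and are not part of Ruby.

```ruby
# encoding: UTF-8

# Hypothetical helper: show the bytes of a string as hex.
def hex_bytes(str)
  str.bytes.map { |b| "%02X" % b }.join(" ")
end

# Hypothetical helper: reinterpret the raw bytes of a string as if they
# were written in `codepage`, then convert back to UTF-8 for display.
def misread_as(str, codepage)
  str.dup.force_encoding(codepage).encode("UTF-8")
end

text = "æøå"                        # stored as UTF-8 in this file

puts hex_bytes(text)                 # C3 A6 C3 B8 C3 A5
puts misread_as(text, "CP850")       # ├ª├©├Ñ
puts hex_bytes(text.encode("CP850")) # 91 9B 86
```

The last line shows the bytes a CP850 console expects for "æøå", which are the same three bytes as the hex escapes in the original test.rb.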

Regards,
Chris

Regards,
Park H.

Heesob P. wrote in post #1036844:

It seems that the actual encoding of test.rb is not CP850 but UTF-8.

The string “æøå” is encoded as “\xC3\xA6\xC3\xB8\xC3\xA5” in UTF-8.

Well, I’ve tried marking the file with # Encoding: UTF-8 as well, but it
still doesn't get my “æøå” string printed properly to the
screen. So my problem remains: how do I need to configure this so that a
string like “æøå” in my Ruby file gets printed without having to use
the ‘ugly’ hex escapes?

Thanks,
Chris

Did you check your console is actually using TrueType fonts?

Have you tried setting the codepage to Unicode? (chcp 65001)
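
If changing the codepage is not an option, another possibility is to keep the source file in UTF-8 and transcode each string to the console codepage just before printing. This is only a sketch under those assumptions: for_console is a hypothetical helper name, and CP850 is assumed to be the active codepage (as reported by chcp).

```ruby
# encoding: UTF-8

# Hypothetical helper: transcode text to the console codepage.
# Characters that do not exist in the codepage are replaced with "?".
def for_console(text, codepage = "CP850")
  text.encode(codepage, invalid: :replace, undef: :replace, replace: "?")
end

converted = for_console("æøå")

# Show the bytes that would be sent to the console.
puts converted.bytes.map { |b| "%02X" % b }.join(" ")   # 91 9B 86

# In the real script the call would simply be:
#   puts for_console("æøå")
```

The string literals stay readable in the source, and the conversion happens in one place, so the codepage name could later be swapped for whatever the console actually uses.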


Luis L.

Thanks for the suggestion. I changed the codepage to Unicode as suggested
(chcp 65001). I also changed my font in the Command Prompt to Lucida Console.

This seems to... almost work. However, my Ruby hangs! :(

Content in file test.rb:

# Encoding: CP65001

puts "æøå"

C:>ruby test.rb
æøååå

And there it apparently hangs indefinitely! I have to press Ctrl+C to
abort it:
æøåååtest.rb:2:in `write': Interrupt
from test.rb:2:in `puts'
from test.rb:2:in `puts'
from test.rb:2:in

Curiously, I am also not able to access irb when using CP 65001:

C:>irb

C:>
C:>chcp 850
Aktiv tegntabell: 850

C:>irb
irb(main):001:0> puts "æøå"
æøå
=> nil
irb(main):002:0> exit

It’s not that big a deal for me, I can still manage, but I can't help
feeling annoyed at not getting the encoding to work right.

Regards,
Chris