Sean O’halpin wrote in post #1061359:
On Fri, May 18, 2012 at 7:31 PM, Mariano Jos G. wrote:
# encoding: utf-8
chcp 852
#change cmd encoding to unicode
Codepage 852 isn’t the Unicode codepage. It’s MS-DOS Latin-2, which
isn’t even the same as ISO 8859-2 - see
Code page 852 - Wikipedia.
According to
Code Page Identifiers - Win32 apps | Microsoft Learn
the codepage for UTF-8 is 65001.
You should be able to set the codepage with
chcp 65001
then just output your UTF-8 strings without having to convert them, as
long as you have the
# encoding: utf-8
magic comment on the first line of your script (or the second line, if the first is a shebang).
I’m afraid I can’t test this at the moment as I don’t have access to a
Windows machine.
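To see why the active codepage matters, here is a minimal sketch comparing the bytes Ruby produces for the same character in UTF-8 and in the two DOS codepages that come up in this thread. The variable names are only for illustration, and it assumes your Ruby build ships the IBM852 and IBM437 transcoders, which stock builds normally do.

```ruby
# "\u00E1" is "á"; the \u escape always yields a UTF-8 string,
# whatever encoding the source file itself is saved in.
text = "\u00E1"

utf8_bytes  = text.bytes.to_a                   # two bytes in UTF-8
cp852_bytes = text.encode("IBM852").bytes.to_a  # one byte in DOS Latin-2 (codepage 852)
cp437_bytes = text.encode("IBM437").bytes.to_a  # one byte in DOS US (codepage 437)

[["UTF-8", utf8_bytes], ["IBM852", cp852_bytes], ["IBM437", cp437_bytes]].each do |name, bytes|
  puts "#{name}: " + bytes.map { |b| "0x%02X" % b }.join(" ")
end
```

A console set to codepage 852 or 437 expects the single-byte form, so raw UTF-8 output will not display as intended there. Under chcp 65001 the console should accept the two-byte form as it is.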
First, uninstall 1.9.2 and install a recent 1.9.3 (p125 or later) from
http://rubyinstaller.org/. If you’re using the 1.9 family on Windows,
remove every version other than 1.9.3p125 or higher.
Here’s what I get on Win7 32bit in a cmd.exe shell…
C:\Users\Jon\Documents\RubyDev\sandbox>chcp
Active code page: 437
*** encoding_1.rb file contents ***
# encoding: UTF-8
utf8 = "Some accented text áóíúé with regular text."
puts utf8
C:\Users\Jon\Documents\RubyDev\sandbox>pik ruby encoding_1.rb
jruby 1.6.7.2 (ruby-1.9.2-p312) (2012-05-01 26e08ba) (Java HotSpot™
Client VM 1.7.0_04) [Windows 7-x86-java]
Some accented text áóíúé with regular text.
ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]
Some accented text áóíúé with regular text.
ruby 1.9.3p125 (2012-02-16) [i386-mingw32]
Some accented text áóíúé with regular text.
ruby 1.9.3p223 (2012-05-19 revision 35717) [i386-mingw32]
Some accented text áóíúé with regular text.
tcs-ruby 1.9.3p196 (2012-04-21, TCS patched 2012-04-21) [i386-mingw32]
Some accented text áóíúé with regular text.
ruby 2.0.0dev (2012-05-21 trunk 35732) [i386-mingw32]
Some accented text áóíúé with regular text.
…and without the # encoding: UTF-8 line at the top of the file:
C:\Users\Jon\Documents\RubyDev\sandbox>pik ruby encoding_1.rb
jruby 1.6.7.2 (ruby-1.9.2-p312) (2012-05-01 26e08ba) (Java HotSpot™
Client VM 1.7.0_04) [Windows 7-x86-java]
SyntaxError: encoding_1.rb:1: invalid multibyte char (US-ASCII)
ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]
Some accented text áóíúé with regular text.
ruby 1.9.3p125 (2012-02-16) [i386-mingw32]
encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)
ruby 1.9.3p223 (2012-05-19 revision 35717) [i386-mingw32]
encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)
tcs-ruby 1.9.3p196 (2012-04-21, TCS patched 2012-04-21) [i386-mingw32]
encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)
ruby 2.0.0dev (2012-05-21 trunk 35732) [i386-mingw32]
encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)
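The "invalid multibyte char (US-ASCII)" errors above come from the 1.9 parser treating a source file with no magic comment as US-ASCII, where any byte above 0x7F is illegal. A rough sketch of the same check done by hand on the two bytes that "á" occupies in a UTF-8 file (the variable names are made up for the example):

```ruby
# The raw bytes of "á" as stored in a UTF-8 source file.
# pack("C*") returns a binary (ASCII-8BIT) string.
raw = [0xC3, 0xA1].pack("C*")

# Tag copies of the same bytes with two different encodings.
as_ascii = raw.dup.force_encoding("US-ASCII")
as_utf8  = raw.dup.force_encoding("UTF-8")

puts as_ascii.valid_encoding?  # false: bytes above 0x7F are not valid US-ASCII
puts as_utf8.valid_encoding?   # true: C3 A1 is a well-formed UTF-8 sequence
```

The bytes are identical in both cases. Only the encoding label changes, which is all the magic comment does for the literals in a source file.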
If you want to toy with poor old cmd.exe, try using the type
command (like cat) to list out encoding_1.rb
after switching between different codepages.
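If you don't have a Windows box handy, you can approximate what type shows under a non-UTF-8 codepage from Ruby itself. This is only a sketch: shown_under is a made-up helper, and it assumes the console simply reads each byte as one character in the active codepage.

```ruby
# Reinterpret the UTF-8 bytes of `text` as if they were written in
# `codepage`, then convert the result back to UTF-8 for display.
# (Hypothetical helper, not part of any library.)
def shown_under(text, codepage)
  text.dup.force_encoding(codepage).encode("UTF-8")
end

accented = "\u00E1\u00F3\u00ED\u00FA\u00E9"  # "áóíúé", built with \u escapes

["IBM437", "IBM852"].each do |codepage|
  puts "#{codepage}: #{shown_under(accented, codepage)}"
end
```

Each accented character takes two bytes in UTF-8, so under a single-byte codepage it comes out as two unrelated characters, typically a box-drawing character followed by something else.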
Jon