Ruby 1.9.3: strings with Portuguese names are always different

Hello!

I’m learning the Ruby language but I have a problem: strings that
contain Portuguese names never compare as equal.

Example:


# encoding: utf-8

puts "Name?"
STDOUT.flush
name = gets.chomp.to_s
puts "Hello João!" if name == "João"


But “name” is always different from “João”. The program only works if I
change it to a version without the character ã.

So I compared the number of bytes and the byte values of the two
strings and found this:

- The character “ã” obtained by “gets” has the byte value 198 (one
  byte);
- The character “ã” from the string “João” (last line) has the byte
  values 195 and 163 (two bytes);
- The two strings look the same when printed with “puts”, but they are
  different when compared with “if”.
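
The byte values listed above can be reproduced without a Windows
console. This is only a sketch: it assumes the console input can be
simulated by transcoding the UTF-8 literal to CP850, and the variable
names are made up for illustration.

```ruby
# encoding: utf-8

literal = "João"                   # UTF-8, because of the magic comment
# Assumption: this stands in for what gets returns under code page 850.
console = literal.encode("CP850")

p literal.encoding
p console.encoding
p literal.bytes.to_a   # [74, 111, 195, 163, 111]
p console.bytes.to_a   # [74, 111, 198, 111]
p literal == console   # false
```

The two strings hold different bytes and carry different encoding tags,
so == reports them as different.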

How can I solve this? Thanks.

I think the issue comes from your console. What OS are you using? If
it’s Windows, check the active code page with the “CHCP” command.

I ran the “CHCP” command under Windows 7 and I got “Active code page:
850”.

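Yes, so you have to deal with the external and internal encodings of
your script: the console sends its input in CP850, while the
“encoding: utf-8” line makes your string literals UTF-8. It is worth
reading more about this.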
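
To illustrate what external and internal encoding mean here, the sketch
below uses a pipe in place of the console (an assumption, so that it
runs anywhere). The helper name read_cp850_line_as_utf8 is hypothetical.
IO#set_encoding tells Ruby which encoding the incoming bytes have
(external) and which encoding the script wants to work with (internal).

```ruby
# encoding: utf-8

# Hypothetical helper: read one line whose bytes are CP850 and let Ruby
# transcode it to UTF-8 while reading.
def read_cp850_line_as_utf8(raw_bytes)
  reader, writer = IO.pipe
  writer.binmode                  # write the bytes untouched
  writer.write(raw_bytes)
  writer.close
  # External encoding: what the bytes are. Internal: what the script wants.
  reader.set_encoding("CP850", "UTF-8")
  line = reader.gets.chomp
  reader.close
  line
end

# The bytes a code page 850 console would send for "João" plus a newline.
cp850_input = [74, 111, 198, 111, 10].pack("C*")
name = read_cp850_line_as_utf8(cp850_input)

p name.encoding
puts "Hello João!" if name == "João"
```

For real console input the equivalent call would be made on STDIN, but
how that behaves on a particular Windows setup is something to check
rather than assume.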

Thanks. For now I solved the problem with this line of code:

name = gets.chomp.to_s.encode("utf-8")

With this line of code, name is no longer encoded in CP850 and the
program runs correctly.
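
A small sketch of why this works, assuming that Ruby tags the string
returned by gets as CP850 under code page 850. The console input is
simulated here with the raw bytes, and from_console is a made-up name.

```ruby
# encoding: utf-8

# Simulated console input: the bytes of "João" in CP850, tagged as CP850
# (assumption: this matches what gets returns under code page 850).
from_console = [74, 111, 198, 111].pack("C*").force_encoding("CP850")

p from_console == "João"                   # false
p from_console.encode("utf-8") == "João"   # true
p from_console.encode("utf-8").bytes.to_a  # [74, 111, 195, 163, 111]
```

Note that encode converts the bytes and relies on the string already
carrying the right encoding tag, whereas force_encoding only changes the
tag. The to_s call in the line above is harmless but not needed, since
chomp already returns a String.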

That’s great.