From UTF-8 to windows-1252

Hello.

I have some data in a file in the windows-1252 charset (with "special"
characters, for example accented letters). I use the encode method to
store it in a SQLite3 DB:

mydata.encode("utf-8")

Using SQLiteSpy I can see the data with the right characters.

But when I get the data from the DB with my program I want to process
them in Windows-1252 again. So, if I use encode with windows-1252 I get
an error

mydata.encode("windows-1252")

compare_synonyms.rb:21:in `encode': "\xC3" from ASCII-8BIT to UTF-8 in conversion from ASCII-8BIT to Windows-1252 (Encoding::UndefinedConversionError)
        from compare_synonyms.rb:21:in `block (2 levels) in identify_synonyms'

Now, if I use codepoints the data are not displayed with the
right characters:

mydata.codepoints.to_a.pack("C*")

acompañar

What is happening? What can I do?

Thanks in advance.

Hello,

But when I get the data from the DB with my program I want to process
them in Windows-1252 again. So, if I use encode with windows-1252 I get
an error

mydata.encode("windows-1252")

Although the data from the DB is encoded in UTF-8, Ruby doesn't
know that encoding, so you must tell Ruby the encoding before
converting to Windows-1252.

# tell Ruby the encoding
mydata.force_encoding("UTF-8")

# encode to windows-1252
mydata.encode("windows-1252")
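
A minimal sketch of what is probably going on, assuming the string comes back from the DB driver tagged as binary (ASCII-8BIT), which is what the error message suggests. The variable names and the sample word are only for illustration, and `String#b` needs Ruby 2.0 or later:

```ruby
# Simulate the assumed situation: the UTF-8 bytes of "acompañar",
# but tagged as binary (ASCII-8BIT) instead of UTF-8.
raw = "acompa\xC3\xB1ar".b

# Transcoding straight from binary fails, because Ruby has no idea
# which characters the bytes above 0x7F are supposed to be.
error_message = nil
begin
  raw.encode("windows-1252")
rescue Encoding::UndefinedConversionError => e
  error_message = e.message
end

# Reading the untouched UTF-8 bytes as if they were Windows-1252
# gives the garbled text from the question.
garbled = raw.dup.force_encoding("windows-1252").encode("UTF-8")
puts garbled   # acompaÃ±ar

# force_encoding only relabels the string (no bytes change);
# encode then really converts the bytes to Windows-1252.
fixed = raw.dup.force_encoding("UTF-8").encode("windows-1252")
p fixed.bytes  # the two UTF-8 bytes C3 B1 became the single byte F1 (241)
```

Note that force_encoding modifies the receiver in place, which is why the sketch works on copies made with dup.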

Regards,

Hey, thanks a lot!! Now I can see the right characters =D

Regards.

Hi,

In message "Re: From UTF-8 to windows-1252"
on Fri, 7 Jan 2011 03:53:26 +0900, "Y. NOBUOKA" writes:

|Although the data from the DB is encoded in UTF-8, Ruby doesn't
|know that encoding, so you must tell Ruby the encoding before
|converting to Windows-1252.
|
| # tell Ruby the encoding
| mydata.force_encoding("UTF-8")
| # encode to windows-1252
| mydata.encode("windows-1252")

For the record, you don’t have to use force_encoding:

mydata.encode("windows-1252", "UTF-8")
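
To illustrate the difference, here is a small sketch, again assuming the data arrives as a binary-tagged string (the sample value is made up). The second argument names the source encoding, so the receiver is left untouched:

```ruby
# Assumed input: UTF-8 bytes tagged as binary (ASCII-8BIT).
raw = "acompa\xC3\xB1ar".b

# Two-argument form: destination encoding first, source encoding second.
converted = raw.encode("windows-1252", "UTF-8")

# Equivalent two-step form, done on a copy because
# force_encoding changes its receiver in place.
two_step = raw.dup.force_encoding("UTF-8").encode("windows-1252")

p converted == two_step   # true
p raw.encoding            # still the binary encoding; raw was not modified
```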

          matz.

Hi, matz

mydata.encode("windows-1252", "UTF-8")

I missed the +src_encoding+ arg and the +option+ arg.
Now I see that the String#encode method is very useful.
http://www.ruby-doc.org/core/classes/String.html#M001113
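
For reference, a short sketch of what the +option+ arg can do. The sample strings are made up; the options shown (:undef, :invalid, :replace) are the standard String#encode ones:

```ruby
# U+2713 (check mark) has no equivalent in Windows-1252.
text = "acompa\u00F1ar \u2713"

# Without options, an unmappable character raises an error.
strict_failed = begin
  text.encode("windows-1252")
  false
rescue Encoding::UndefinedConversionError
  true
end

# undef: :replace substitutes characters the target encoding lacks.
lenient = text.encode("windows-1252", "UTF-8", undef: :replace, replace: "?")

# invalid: :replace substitutes bytes that are malformed in the source
# encoding (0xFF can never appear in valid UTF-8).
cleaned = "abc\xFF".b.encode("windows-1252", "UTF-8", invalid: :replace, replace: "?")

p strict_failed   # true
p lenient.bytes   # ends with 63, the byte for "?"
p cleaned         # "abc?"
```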

thanks!