Forum: Ruby US-ASCII to UTF-8

Posted by Matt Beedle (mattb)
on 2010-03-09 13:50
I'm having trouble with us-ascii strings. I have written a script
which receives emails straight from postfix.  The emails are generally
encoded in "iso-8859-1", and I can't get all of the characters to
display properly, and I can't save to my mongodb database, either. Here
is an example problem string:

In irb by default it displays like this:

Gesch�ftsf�hrer

If I set $KCODE = 'iso-8859-1' then it gets a bit better:

Gesch\344ftsf\374hrer

But how do I now make that into the correct string:

Geschäftsführer

I have tried Iconv:

Iconv.iconv('iso-8859-1', 'utf-8', string)

Iconv::IllegalSequence: "\344ftsf\374hrer"

Please help!
Posted by Brian Candler (candlerb)
on 2010-03-09 14:07
Matt Beedle wrote:
> Iconv.iconv('iso-8859-1', 'utf-8', string)
> 
> Iconv::IllegalSequence: "\344ftsf\374hrer"

You just got the args the wrong way round. 'to' comes before 'from'.
(ri Iconv.iconv)

>> RUBY_VERSION
=> "1.8.7"
>> require 'iconv'
=> true
>> string = "Gesch\344ftsf\374hrer"
=> "Gesch\344ftsf\374hrer"
>> puts Iconv.iconv('utf-8', 'iso-8859-1', string).first
Geschäftsführer
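
For readers on Ruby 1.9 or later, here is a minimal sketch of the same conversion without Iconv (Iconv was deprecated in 1.9.3 and removed from the standard library in 2.0). It assumes the incoming bytes really are ISO-8859-1. The helper name latin1_to_utf8 is made up for illustration and is not part of any library. The sketch also shows why the original call raised IllegalSequence: read as UTF-8, the byte \344 would have to start a multi-byte sequence, but the byte after it is a plain "f".

```ruby
# Sketch for Ruby 1.9+, where every String carries an encoding.
# latin1_to_utf8 is a hypothetical helper name used only for this example.
def latin1_to_utf8(bytes)
  # dup leaves the caller's string untouched. force_encoding only relabels
  # the bytes as ISO-8859-1; encode then transcodes them to UTF-8.
  bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

raw = "Gesch\344ftsf\374hrer" # the same bytes as in the irb session above

# Labelled as UTF-8, these bytes are not a valid sequence, which is what
# Iconv complained about when 'utf-8' was given as the source encoding.
puts raw.dup.force_encoding('UTF-8').valid_encoding? # false

converted = latin1_to_utf8(raw)
puts converted.encoding # UTF-8
puts converted
```

The order here is the reverse of Iconv.iconv: the source encoding is stated first via force_encoding, and the target is passed to encode.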
Posted by Matt Beedle (mattb)
on 2010-03-09 14:19
Brian Candler wrote:
> Matt Beedle wrote:
>> Iconv.iconv('iso-8859-1', 'utf-8', string)
>> 
>> Iconv::IllegalSequence: "\344ftsf\374hrer"
> 
> You just got the args the wrong way round. 'to' comes before 'from'.
> (ri Iconv.iconv)
> 
>>> RUBY_VERSION
> => "1.8.7"
>>> require 'iconv'
> => true
>>> string = "Gesch\344ftsf\374hrer"
> => "Gesch\344ftsf\374hrer"
>>> puts Iconv.iconv('utf-8', 'iso-8859-1', string).first
> Geschäftsführer

lol, ok, now I feel stupid.  I've been messing around with this for 
hours!  Thanks very much.