Mechanize and encoding


#1

I’m trying to scrape a page that both the HTTP header and the HTML document
claim is UTF-8, but when I use Mechanize/Hpricot to scrape some accented
strings and save them to a local file, all special characters are replaced
by question marks. I suspect the page is actually in “ISO-8859-1”, but I’m
not sure.

I have tried running with "ruby -Ku" and also setting $KCODE = 'u', without
success.

How can I force Mechanize to read the doc as “ISO-8859-1”?

I understand that Iconv can convert encoding, but just can’t see how I
can use it with Mechanize…

Thanks,
Marius


#2

I have had exactly the same problem and the same question.

It seems I solved it with $KCODE = 'UTF8'.
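
On the original question of forcing an ISO-8859-1 reading: here is a minimal sketch, assuming Ruby 1.9 or later, where strings carry their own encoding and String#encode can stand in for Iconv. It is not Mechanize-specific. The helper name latin1_to_utf8 is made up for illustration, and the sample bytes only simulate what a mislabelled response body might contain.

```ruby
# Hypothetical helper (not part of Mechanize or Hpricot): treat the raw
# bytes as ISO-8859-1 and transcode them to UTF-8.
def latin1_to_utf8(raw)
  # dup leaves the caller's string untouched; force_encoding only relabels
  # the bytes, and encode performs the actual byte conversion.
  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Simulated response body: "café" as ISO-8859-1 bytes (0xE9 is "é").
# Assumption: with Mechanize the raw bytes would come from page.body.
raw_body = "caf\xE9".b

fixed = latin1_to_utf8(raw_body)
puts fixed
```

If the page really is ISO-8859-1 despite what its headers say, converting the body this way before parsing or saving it should keep the accented characters intact. If the page is genuinely UTF-8, this would garble it instead, so it is worth checking the raw bytes first.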