UTF-8: HTMLEntities decode keeps switching to ASCII-8BIT

Hi there,
this isn't really a Ruby on Rails question, but since it is about web
development I think I'm in the right forum.

I'm trying to decode a string that I read from an online XML file and
compare it to another, locally stored one.

The locally stored string is:
p localstring #=> "~St0rm€lite~"
p localstring.encoding #=> #<Encoding:UTF-8>

The XML file is saved as UTF-8 and was read via Hpricot.
p remotestring #=> "~St0rm€lite~"
p remotestring.encoding #=> #<Encoding:UTF-8>

Now I want to decode the remote string. To do this I'm using
htmlentities 4.2.0, installed via gem, but it breaks the string:
require 'htmlentities'
htmldecoder = HTMLEntities.new
p htmldecoder.decode(remotestring) #=> "~St0rm\xE2\x82\xAClite~"
p htmldecoder.decode(remotestring).encoding #=> #<Encoding:ASCII-8BIT>

I've been trying to fix this for the past 3-4 days but haven't found a
solution. I tried converting the result to UTF-8 again, but that only
makes the string worse.
Converting the local string to ASCII-8BIT fails as well.
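
To make clear what I mean by "converting", here is a minimal sketch that
does not use htmlentities at all. It shows the two usual ways of getting
from ASCII-8BIT back to UTF-8 in Ruby 1.9. The variable names are made up
for illustration, and the byte string is built by hand to mimic the decode
output above. I'm assuming the bytes are already valid UTF-8 and only the
encoding label is wrong.

```ruby
# encoding: utf-8

# Hand-built stand-in for the decode result: UTF-8 bytes labelled as binary.
binary = "~St0rm\xE2\x82\xAClite~".dup.force_encoding(Encoding::ASCII_8BIT)
# Stand-in for the locally stored string (U+20AC is the euro sign).
local = "~St0rm\u20AClite~"

# The bytes are identical, but the strings still do not compare as equal,
# because their encodings differ and both contain non-ASCII bytes.
same_bytes = binary.bytes.to_a == local.bytes.to_a
equal_as_strings = (binary == local)

# Option 1: transcode. Ruby has no mapping for high bytes from ASCII-8BIT
# to UTF-8, so this raises instead of returning a usable string.
transcode_error = nil
begin
  binary.encode(Encoding::UTF_8)
rescue Encoding::UndefinedConversionError => e
  transcode_error = e
end

# Option 2: relabel. The bytes stay untouched, only the encoding tag changes.
relabeled = binary.dup.force_encoding(Encoding::UTF_8)

p same_bytes
p equal_as_strings
p transcode_error.class
p relabeled.valid_encoding?
p relabeled == local
```

If the relabelled string is valid and equal to the local one, then the bytes
coming out of decode are fine and only the encoding tag is lost somewhere.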

I checked my system configuration:
My shell is running as UTF-8, and I also installed 'locales':
p Locale.current #=> [#<Locale::Tag::Posix: en_US.UTF-8>]

Are there any other settings I need to check and configure? If I run
the script on another computer I get the right results, but here it keeps
switching to ASCII-8BIT.
My Ruby version is 1.9.
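
In case it helps to compare the two machines, this is a small sketch of the
encoding-related settings I know of that Ruby itself reports. I'm only
guessing that one of them differs between the computers; the `settings`
hash is just a name I picked for the example.

```ruby
# encoding: utf-8

# Collect the encoding-related values that Ruby 1.9 exposes.
settings = {
  "ruby version"     => RUBY_VERSION,
  "source encoding"  => __ENCODING__,               # encoding of this file
  "default_external" => Encoding.default_external,  # used for IO by default
  "default_internal" => Encoding.default_internal,  # nil unless explicitly set
  "LANG"             => ENV["LANG"],
  "LC_ALL"           => ENV["LC_ALL"]
}

settings.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Running this on both computers and comparing the output line by line should
show whether the difference is in the environment or somewhere else.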

I would be happy if someone could help me.
