I have this code in Ruby 1.8: Iconv.iconv('gbk', 'utf-8', string). Now that Ruby 1.9.2 has force_encoding('utf-8'), can I just use force_encoding('utf-8') instead?
on 2011-02-01 03:55
on 2011-02-01 04:41
On 2011/02/01 11:55, Zhenning Guan wrote:
> I got code in ruby 1.8.
> Iconv.iconv('gbk', 'utf-8', string)
>
> now, ruby 1.9.2 has force_encoding('utf-8'), so can I just
> force_encoding('utf-8') ?

No. force_encoding just changes the encoding label, but leaves the bytes in the string as they are. That would result in garbage (unless everything is ASCII anyway). The main use of force_encoding is to set the encoding label on raw byte strings (e.g. coming from outside) when you already know what the encoding is.

The equivalent of your Iconv call, in Ruby 1.9, is:

    string.encode('gbk', 'utf-8')

But I'm a bit wary about the order of the arguments. Both

    Iconv.iconv('gbk', 'utf-8', string)
    string.encode('gbk', 'utf-8')

encode from UTF-8 to GBK, but the result of force_encoding('utf-8') is labeled UTF-8. So if you want the result to be UTF-8, you have to turn the order of the parameters around.

I was never happy with the TO-FROM order in iconv, and I'm also not happy with the TO-FROM order in String#encode. But String#encode can also be used with just the TO parameter, e.g.

    string.encode('gbk')

if the string already has the correct encoding label at this point. So when we (mainly Matz and I) designed String#encode, unfortunately TO-FROM was the only order that made sense.

Please also note that there might be slight differences between Iconv and String#encode for some characters, but these should be very small in number.

Regards, Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:firstname.lastname@example.org
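
To make the difference between relabeling and transcoding concrete, here is a minimal sketch for Ruby 1.9 or later. The helper names (utf8_to_gbk, relabel_as_gbk, gbk_bytes_to_utf8) and the sample string are assumptions made up for illustration; only String#encode and String#force_encoding come from Ruby itself.

```ruby
# Hypothetical helpers for illustration; not part of Ruby or of the original post.

# Transcode: the bytes change. Same direction as the 1.8 call
# Iconv.iconv('gbk', 'utf-8', str), i.e. TO first, then FROM.
def utf8_to_gbk(str)
  str.encode('GBK', 'UTF-8')
end

# Relabel only: the bytes stay the same, just the encoding label changes.
def relabel_as_gbk(str)
  str.dup.force_encoding('GBK')
end

# Raw GBK bytes from outside: set the label first, then transcode to UTF-8.
def gbk_bytes_to_utf8(raw)
  raw.dup.force_encoding('GBK').encode('UTF-8')
end

utf8 = "\u4E2D\u6587"            # two Chinese characters, stored as UTF-8
gbk  = utf8_to_gbk(utf8)

p utf8.bytesize                   # 6 (three bytes per character in UTF-8)
p gbk.bytesize                    # 4 (two bytes per character in GBK)
p relabel_as_gbk(utf8).bytes == utf8.bytes   # true: no bytes were converted

raw = gbk.dup.force_encoding('BINARY')       # simulate unlabeled input bytes
p gbk_bytes_to_utf8(raw) == utf8             # true: round trip succeeds
```

The relabeled string keeps the six UTF-8 bytes but claims to be GBK, which is the garbage case described above; only encode actually rewrites the bytes.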