UTF-8 string conversion difficulties in JRuby

The following code correctly determines that a string is not valid UTF-8
in Ruby 1.8.6, but in JRuby it does not work.

require 'iconv'
require 'stringio'

non_utf8 = "\xa4"
utf8 = Iconv.conv('UTF-8//IGNORE', 'UTF-8', non_utf8)
puts utf8 == non_utf8 # prints true in JRuby, which is wrong

What is causing this? What can I use as a workaround?

–Serguei

This is a bug. If you have not filed it already, please do so.

In MRI, this produces “false” yes? I’m not familiar with the
“UTF-1//IGNORE” bit there, what does that mean?

On Mon, Oct 5, 2009 at 7:44 PM, Serguei F. wrote:



To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Er, UTF-8//IGNORE

On Sat, Oct 10, 2009 at 4:14 PM, Charles Oliver N. wrote:




On Sun, Oct 11, 2009 at 12:14 AM, Charles Oliver N. wrote:

Er, UTF-8//IGNORE

If you append the string //IGNORE to the target encoding name, characters
that cannot be represented in the target charset are silently discarded.

In our particular case, with the IGNORE option the conversion should
produce an empty string, and without it the conversion should raise
Iconv::IllegalSequence.

JRuby in both cases just happily returns “\244” :)
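
On the original question about a workaround: the validity check itself can be done without Iconv. What follows is only a sketch and was not proposed in this thread. `valid_utf8?` is a hypothetical helper name, and the approach assumes that `String#unpack('U*')` raises `ArgumentError` on malformed UTF-8. That holds in MRI; whether a given JRuby release behaves the same way should be verified before relying on it.

```ruby
# Hypothetical helper (not from the thread): reports whether the bytes
# of +str+ decode as UTF-8. It relies on unpack('U*') raising
# ArgumentError for malformed input, so it is a rough check rather
# than a full validator.
def valid_utf8?(str)
  str.unpack('U*')
  true
rescue ArgumentError
  false
end

non_utf8 = "\xa4"
puts valid_utf8?(non_utf8) # false: 0xa4 is a lone continuation byte
puts valid_utf8?("abc")    # true
```

This replaces the `utf8 == non_utf8` comparison from the original snippet. On Ruby 1.9 and later, `String#valid_encoding?` is the more direct check.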

Thanks,
–Vladimir



I’ve created an issue for this:
http://jira.codehaus.org/browse/JRUBY-4091

On Mon, Oct 12, 2009 at 6:29 PM, Charles Oliver N.

Thinking about this problem, I wondered why Iconv doesn’t raise any
exception for that string, and I found that the implementation is also
ignoring exceptions.
I’m working on a patch to fix both problems.

On Sun, Oct 11, 2009 at 1:28 AM, Vladimir S. wrote:

JRuby in both cases just happily returns “\244” :)

Well, I guess we need a bug filed (if one hasn’t been filed already).
Can you file a bug please, Serguei?

–Charlie

