The following code correctly determines that a string is not UTF-8 in
Ruby 1.8.6, but in JRuby it doesn’t work.
require 'iconv'
require 'stringio'
non_utf8 = "\xa4"
utf8 = Iconv.conv('UTF-8//IGNORE', 'UTF-8', non_utf8)
puts utf8 == non_utf8 # returns true, which is wrong
What is causing this? What can I use as a workaround?
–Serguei
This is a bug. If you have not filed it already, please do so.
In MRI, this produces “false”, yes? I’m not familiar with the
“UTF-1//IGNORE” bit there; what does that mean?
On Mon, Oct 5, 2009 at 7:44 PM, Serguei F. [email protected]
wrote:
To unsubscribe from this list, please visit:
http://xircles.codehaus.org/manage_email
Er, UTF-8//IGNORE
On Sat, Oct 10, 2009 at 4:14 PM, Charles Oliver N.
[email protected] wrote:
On Sun, Oct 11, 2009 at 12:14 AM, Charles Oliver N.
[email protected] wrote:
Er, UTF-8//IGNORE
If you append the string //IGNORE to the target encoding, characters that
cannot be represented in the target charset are silently discarded.
In our particular case, with the //IGNORE option the conversion should
produce an empty string, and without it the conversion should raise
Iconv::IllegalSequence.
JRuby in both cases just happily returns “\244”.
Thanks,
–Vladimir
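
The two behaviours described above (discard the bad byte versus reject the input) can be illustrated without Iconv, which newer Ruby versions no longer bundle. This is only a sketch of the same semantics using String#scrub and String#encode, which are assumptions here and require Ruby 2.1 or later; it is not the Iconv code path discussed in this thread.

```ruby
non_utf8 = "\xa4"   # a lone continuation byte, so not well-formed UTF-8

# Rough analogue of 'UTF-8//IGNORE': drop the bytes that cannot be decoded.
ignored = non_utf8.scrub('')
puts ignored.empty?   # prints true: the invalid byte was discarded

# Rough analogue of plain 'UTF-8': refuse the malformed input.
begin
  non_utf8.encode('UTF-16')
  strict_result = :converted
rescue Encoding::InvalidByteSequenceError
  strict_result = :rejected
end
puts strict_result    # prints rejected
```

In both cases the result differs from the input, which is the check the original snippet relies on.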
I’ve created an issue for this:
http://jira.codehaus.org/browse/JRUBY-4091
On Mon, Oct 12, 2009 at 6:29 PM, Charles Oliver N.
Thinking about this problem, I wondered why iconv doesn’t raise an
exception for that string, and I’ve seen that it’s also ignoring
exceptions.
I’m working on a patch to fix both problems.
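
Until such a patch is available, one possible workaround for the original question is to check UTF-8 validity without going through Iconv at all. The helper name valid_utf8? below is hypothetical and not from this thread. The sketch assumes String#valid_encoding? on Ruby 1.9 and later, and falls back to unpack('U*'), which raises ArgumentError on malformed UTF-8 in MRI 1.8. Whether a given JRuby release behaves the same way on the fallback path is an assumption that should be verified.

```ruby
# Hypothetical helper (not from the thread): reports whether the bytes of
# str form well-formed UTF-8, without using Iconv.
def valid_utf8?(str)
  if str.respond_to?(:force_encoding)
    # Ruby 1.9+: tag a copy as UTF-8 and let the encoding layer validate it.
    str.dup.force_encoding('UTF-8').valid_encoding?
  else
    # Ruby 1.8: unpack('U*') raises ArgumentError on malformed UTF-8.
    begin
      str.unpack('U*')
      true
    rescue ArgumentError
      false
    end
  end
end

non_utf8 = "\xa4"
puts valid_utf8?(non_utf8)   # prints false
puts valid_utf8?("abc")      # prints true
```

The input is duplicated before force_encoding so the caller’s string keeps its original encoding tag.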
On Sun, Oct 11, 2009 at 1:28 AM, Vladimir S. [email protected]
wrote:
JRuby in both cases just happily returns “\244” 
Well, I guess we need a bug filed (if one hasn’t been filed already).
Can you file a bug please, Serguei?