Iconv weirdness on Windows XP


#1

Is anyone else having this problem?
This is a One-Click install of Ruby, with the iconv package installed.
charset.dll and iconv.dll are in c:\windows\system32
iconv.so is in the appropriate place deep within the Ruby folder
hierarchy.

Here’s a sample IRB session (sorry about the wrapping):
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

irb(main):001:0> require 'iconv'
=> true
irb(main):002:0> Iconv.iconv('utf-8', 'X-UNKNOWN', 'Hello, world')
Errno::ENOENT: No such file or directory - iconv("utf-8", "X-UNKNOWN")
        from (irb):2:in `iconv'
        from (irb):2
irb(main):003:0> Iconv.iconv('utf-8', 'X-UNKNOWN', 'Hello, world')
(irb):3: [BUG] rb_sys_fail(iconv("utf-8", "X-UNKNOWN")) - errno == 0
ruby 1.8.2 (2004-12-25) [i386-mswin32]

This application has requested the Runtime to terminate it in an
unusual way. Please contact the application’s support team for more
information.

C:\Bin>

I first ran into this when running part of the TMail test suite, but I
can now easily duplicate it in IRB.

Thanks,
–Wilson.


#2

Wilson B. asked:

irb(main):001:0> require 'iconv'
=> true
irb(main):002:0> Iconv.iconv('utf-8', 'X-UNKNOWN', 'Hello, world')
Errno::ENOENT: No such file or directory - iconv("utf-8", "X-UNKNOWN")
        from (irb):2:in `iconv'
        from (irb):2

The first problem here is ENOENT. That’s C’s “not found”, presumably
referring to the character set “X-UNKNOWN” (not any “file or
directory”).
I’m told [1] this can occur because the config.charset aliases are not
available. But, actually, I can’t find x-unknown in the aliases file,
and I don’t know enough about iconv to know if it should be handling it
as an intrinsic type.

For x-unknown, maybe try substituting “char” or the empty string “”;
this means the locale-dependent default encoding (so it isn’t the same
as x-unknown).
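
One way to apply that substitution without touching every caller is a small lookup in front of the Iconv call. This is only a sketch: the names effective_charset and PLACEHOLDER_CHARSETS are made up for illustration, and mapping x-unknown to "char" is a guess at the text's real encoding, not a fact about it.

```ruby
# Hypothetical helper, not part of Iconv or TMail.
# Labels listed here carry no usable information about the text.
PLACEHOLDER_CHARSETS = ['x-unknown'].freeze

# Returns a charset name to hand to Iconv: 'char' (the locale-dependent
# default described above) for a missing or placeholder label, otherwise
# the label itself with surrounding whitespace removed.
def effective_charset(label)
  name = label.to_s.strip
  return 'char' if name.empty? || PLACEHOLDER_CHARSETS.include?(name.downcase)
  name
end
```

With this in place, effective_charset('X-UNKNOWN') gives 'char', while a real label such as 'iso-8859-1' is passed through unchanged.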

irb(main):003:0> Iconv.iconv('utf-8', 'X-UNKNOWN', 'Hello, world')
(irb):3: [BUG] rb_sys_fail(iconv("utf-8", "X-UNKNOWN")) - errno == 0
ruby 1.8.2 (2004-12-25) [i386-mswin32]

This application has requested the Runtime to terminate it in an
unusual way. Please contact the application’s support team for more
information.

C:\Bin>

I’m kinda guessing, but it looks like this is a problem with the Iconv
library failing to set errno for the second failure.

I first ran into this when running part of the TMail test suite, but I
can now easily duplicate it in IRB.

Thanks for putting the effort in; it makes responding easy.

Cheers,
Dave

[1] Nobu Nakada,
http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/158960


#3

On 12/13/05, Dave B. removed_email_address@domain.invalid wrote:

x-unknown).

When I switch ‘X-UNKNOWN’ to ‘char’ in the test fixture, everything
works fine.
Specifically, the code that freaks out is:
def convert_to(text, to, from)
  return text unless to && from
  text ? Iconv.iconv(to, from, text).first : ""
rescue Iconv::IllegalSequence, Errno::EINVAL
  # the 'from' parameter specifies a charset other than what the text
  # actually is...not much we can do in this case but just return the
  # unconverted text.
  # Ditto if either parameter represents an unknown charset, like
  # X-UNKNOWN.
  text
end

Given that, it looks like you’re right about the library not setting
the proper error environment, because the method that invokes it is
expecting it to throw an error.
Oddly, though, ENOENT isn’t on the list of exceptions to rescue there.
Adding it to the list doesn’t stop the crash bug, unfortunately.

I tried putting config.charset in both c:/windows/system32, and in
c:/ruby/lib/ruby/1.8/i386-mswin32 (where iconv.so goes), to no avail.


#4

Hi,

At Wed, 14 Dec 2005 00:17:39 +0900,
Dave B. wrote in [ruby-talk:170489]:

I’m kinda guessing, but it looks like this is a problem with the Iconv
library failing to set errno for the second failure.

This failure often occurs if msvcrt DLL versions mismatch.
e.g., ruby and iconv.so use msvcr71.dll whereas iconv.dll uses
msvcrt.dll.
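
A rough way to check for that kind of mismatch from Ruby itself is to scan each binary for the C runtime names it refers to. The sketch below is a crude heuristic under stated assumptions, not a real import-table parser: crt_names is a hypothetical helper, and it relies on imported DLL names being stored as plain ASCII in the file. A proper dependency viewer would be more reliable.

```ruby
# Crude heuristic, NOT a real PE import-table parser: imported DLL names are
# stored in the binary as plain ASCII, so a byte scan usually turns them up.
# crt_names is a hypothetical helper name.
def crt_names(path)
  data = File.open(path, 'rb') { |f| f.read }
  data.scan(/msvc[a-z0-9]*\.dll/i).map { |n| n.downcase }.uniq.sort
end

if __FILE__ == $0
  # Paths are the ones mentioned earlier in this thread; adjust as needed.
  ['c:/windows/system32/iconv.dll',
   'c:/ruby/lib/ruby/1.8/i386-mswin32/iconv.so'].each do |path|
    next unless File.exist?(path)
    puts "#{path}: #{crt_names(path).join(', ')}"
  end
end
```

If the two files report different runtime DLLs, that would be consistent with the errno problem described above, since each C runtime keeps its own errno.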


#5

On 12/13/05, Dave B. removed_email_address@domain.invalid wrote:

I wrote just before:

My best guess is that it’s an incompatibility problem between One-Click’s
ruby core and iconv.so from the ruby-mswin32 package.

OK, so Nobu’s confirmed this. (Thanks, Nobu.)

I’m currently working on the One-Click Installer for 1.8.4, and this
is one of the things I want to make sure I get right. Any help/advice
would be more than welcome (especially since I know next-to-nothing
about iconv).

Thanks,
Curt


#6

Wilson B. wrote:
On 12/13/05, Dave B. removed_email_address@domain.invalid wrote:

When I switch ‘X-UNKNOWN’ to ‘char’ in the test fixture, everything works
fine.

Cool.

Specifically, the code that freaks out is:
def convert_to(text, to, from)

rescue Iconv::IllegalSequence, Errno::EINVAL

Given that, it looks like you’re right about the library not setting
the proper error environment, because the method that invokes it is
expecting it to throw an error.
Oddly, though, ENOENT isn’t on the list of exceptions to rescue there.
Adding it to the list doesn’t stop the crash bug, unfortunately.

It shouldn’t be Iconv::IllegalSequence - that’s a “bad character” in the
“from” string. There’s an exception defined in the library,
Iconv::InvalidEncoding - that makes sense, but apparently isn’t used
here. The solution may be to use that instead of Errno.

I haven’t seen Errno::EINVAL thrown by iconv, although that’s what
ruby-mswin32-1.8.2 does in this situation. And it doesn’t crash.

So you’ve uncovered a bug in my iconv package for the One-Click
Installer. My best guess is that it’s an incompatibility problem between
One-Click’s ruby core and iconv.so from the ruby-mswin32 package.

I tried putting config.charset in both c:/windows/system32, and in
c:/ruby/lib/ruby/1.8/i386-mswin32 (where iconv.so goes), to no avail.

Nobu’s post seemed to indicate the aliases file wasn’t going to work on
Windows. It may need an option compiled in to the binary or something. I
don’t know.

Cheers,
Dave


#7

On 12/14/05, Curt H. removed_email_address@domain.invalid wrote:

about iconv).

It’s working fine for me now with the 1.8.4 object file, and I’m
running OneClick1.8.2-15.
Are there any other Iconv test suites out there I can run, to make
sure it didn’t mangle anything up?

Thanks,
–Wilson.


#8

Wilson B. wrote:

Cool. Thank you for this. I replaced my iconv.so file with the one
from the zip file above, and as you predicted, things now work this
way:
…F…

Time for a change to that ‘rescue’ clause, I’d say.

Nobu informs us that this replacement of Errno -> Exception is one of
two minor changes in iconv between 1.8.2 and 1.8.4. I’ve updated my zip
to include the super-versioned iconv.so.

I’m glad it’s working for you now, and thanks for raising the issue and
helping solve it.

Cheers,
Dave


#9

Curt H. wrote:

I’m currently working on the One-Click Installer for 1.8.4, and this
is one of the things I want to make sure I get right. Any help/advice
would be more than welcome (especially since I know next-to-nothing
about iconv).

Well, the core of the problem seems to be a difference in compiler
versions between your One-Click binaries and the “borrowed” mswin32
binary. The problem shows itself in this case when a C errno is used
instead of a Ruby Exception.

Nobu says in [ruby-core:06889] you should compile it from source as part
of the One-Click build:

IMHO, it would be no longer a good idea to assemble pre-built
binaries now, on Windows. I suspect packagers may have to
build all binaries from sources by themselves.

Wilson B. wrote:

It’s working fine for me now with the 1.8.4 object file, and I’m
running OneClick1.8.2-15.
Are there any other Iconv test suites out there I can run, to make
sure it didn’t mangle anything up?

I don’t know of any, sorry.

Cheers,
Dave


#10

On 12/14/05, Curt H. removed_email_address@domain.invalid wrote:

Thanks Dave, I already had you on my list of people to contact if/when
I run into trouble.

I’m planning to go through all the extensions included with the
One-Click Installer (many of which I pick up in binary form) to see how
they were compiled.

Cool. This issue has bugged me, so I’m writing a test suite for Iconv,
in order to better understand it.


#11

Thanks Dave, I already had you on my list of people to contact if/when
I run into trouble.

I’m planning to go through all the extensions included with the
One-Click Installer (many of which I pick up in binary form) to see how
they were compiled.

Curt


#12

On 12/14/05, Wilson B. removed_email_address@domain.invalid wrote:

Hrm. The Win32 version of iconv seems to support far fewer encodings.
There are only 296 encodings that are valid, out of a total of 959
supported by iconv according to iconv -l.
That same test script yields 936 encodings on a SuSE Linux box.
Oddly, there are some encodings that work on Win32, but not on the
Linux system. This one is an example:
Iconv.new('ISO_646.IRV:1991', 'ISO_646.IRV:1991').iconv('')
Works in Windows, raises Errno::EINVAL in Linux.
On the other hand:
Iconv.new('500', '500').iconv('')
raises Iconv::InvalidEncoding in Win32, works fine in Linux.
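
The test script itself is not shown in this thread, so the following is only a guess at its general shape. probe_encodings is a hypothetical helper that takes the conversion attempt as a block, which keeps the sketch free of any dependency on the iconv extension. The commented usage assumes that `iconv -l` prints names separated by whitespace or commas, which varies between implementations.

```ruby
# Hypothetical helper: splits +names+ into [accepted, rejected] according to
# whether the block raises for each name. A hard interpreter crash such as
# the [BUG] shown earlier cannot be rescued and would still end the run.
def probe_encodings(names)
  names.partition do |name|
    begin
      yield name
      true
    rescue StandardError
      false
    end
  end
end

# Intended use with the iconv extension loaded (kept as a comment so the
# sketch runs without it). The parsing of `iconv -l` is an assumption.
#
#   names = `iconv -l`.split(/[\s,]+/).map { |n| n.sub(%r{//\z}, '') }
#   names = names.reject { |n| n.empty? }
#   ok, bad = probe_encodings(names) { |n| Iconv.new(n, n).iconv('') }
#   puts "#{ok.size} accepted, #{bad.size} rejected"
```

Rescuing StandardError covers both failure modes reported here, since Iconv::InvalidEncoding and Errno::EINVAL both descend from it.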

Where does the Win32 version get its list of encodings?

Fun stuff.
–Wilson.


#13

I wrote just before:

My best guess is that it’s an incompatibility problem between One-Click’s
ruby core and iconv.so from the ruby-mswin32 package.

OK, so Nobu’s confirmed this. (Thanks, Nobu.)

Meanwhile:

It shouldn’t be Iconv::IllegalSequence - that’s a “bad character” in the
“from” string. There’s an exception defined in the library,
Iconv::InvalidEncoding - that makes sense, but apparently isn’t used here.
The solution may be to use that instead of Errno.

I haven’t seen Errno::EINVAL thrown by iconv, although that’s what
ruby-mswin32-1.8.2 does in this situation. And it doesn’t crash.

ruby-mswin32 1.8.4 preview 1 actually throws InvalidEncoding in this
case, and this version of iconv.so doesn’t demonstrate this crashing bug
with One-Click Installer 1.8.2-15 (which I assume we’re all using).

Should I update my Iconv for One-Click package to use this new version?
I’m concerned about potential incompatibility issues between it and Ruby
1.8.2.

In any case, you can get this version of iconv.so from this zip:
ftp://ftp.ruby-lang.org/pub/ruby/binaries/mswin32/ruby-1.8.4-preview1-i386-mswin32.zip

Wilson B. wrote:

Specifically, the code that freaks out is:
def convert_to(text, to, from)

rescue Iconv::IllegalSequence, Errno::EINVAL

So this test fixture is still going to fail. It will need to be updated
like so:
-rescue Iconv::IllegalSequence, Errno::EINVAL
+rescue Iconv::IllegalSequence, Iconv::InvalidEncoding
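
For anyone who has to support both the old and the new iconv.so, a version of the method that rescues both exceptions might look like the sketch below. It is not the TMail source. The stand-in Iconv class in the LoadError branch is purely an assumption so the example can run where the extension is missing (iconv left the standard library in Ruby 2.0 and lives on as a separate gem); it only mimics the X-UNKNOWN failure discussed here.

```ruby
begin
  require 'iconv' # bundled with Ruby 1.8; a separate gem since Ruby 2.0
rescue LoadError
  # ASSUMPTION: minimal stand-in so the sketch still runs without iconv.
  # It only imitates the failure for X-UNKNOWN and converts nothing.
  class Iconv
    class InvalidEncoding < ArgumentError; end
    class IllegalSequence < ArgumentError; end

    def self.iconv(to, from, *strs)
      if from =~ /\Ax-unknown\z/i
        raise InvalidEncoding, "invalid encoding (#{to.inspect}, #{from.inspect})"
      end
      strs
    end
  end
end

def convert_to(text, to, from)
  return text unless to && from
  text ? Iconv.iconv(to, from, text).first : ""
rescue Iconv::IllegalSequence, Iconv::InvalidEncoding, Errno::EINVAL
  # Bad input bytes, or an unknown charset on either side: return the
  # unconverted text. Errno::EINVAL stays in the list for builds that
  # still raise it instead of Iconv::InvalidEncoding.
  text
end
```

Note that no rescue clause helps with the [BUG] crash itself; that needs a matching iconv.so.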

Cheers,
Dave


#14

On 12/14/05, Dave B. removed_email_address@domain.invalid wrote:

Iconv::InvalidEncoding - that makes sense, but apparently isn’t used here.
concerned about potential incompatibility issues between it and Ruby 1.8.2.
So this test fixture is still going to fail. It will need to be updated like
so:
-rescue Iconv::IllegalSequence, Errno::EINVAL
+rescue Iconv::IllegalSequence, Iconv::InvalidEncoding

Cool. Thank you for this. I replaced my iconv.so file with the one
from the zip file above, and as you predicted, things now work this
way:
C:\ruby\lib\ruby\gems\1.8\gems\actionmailer-1.1.5\test>ruby mail_service_test.rb
Loaded suite mail_service_test
Started
…F…
Finished in 0.453 seconds.

  1. Failure:
    test_decode_message_with_unknown_charset(ActionMailerTest)
    [mail_service_test.rb:718]:
    Exception raised:
    Class: Iconv::InvalidEncoding
    Message: <"invalid encoding ("utf-8", "X-UNKNOWN")">
    ---Backtrace---
    ./../lib/action_mailer/vendor/tmail/quoting.rb:82:in `iconv'
    ./../lib/action_mailer/vendor/tmail/quoting.rb:82:in `convert_to'
    ./../lib/action_mailer/vendor/tmail/quoting.rb:17:in `unquoted_body'
    ./../lib/action_mailer/vendor/tmail/quoting.rb:43:in `body'
    mail_service_test.rb:718:in `test_decode_message_with_unknown_charset'
    mail_service_test.rb:718:in `assert_nothing_raised'
    mail_service_test.rb:718:in `test_decode_message_with_unknown_charset'

46 tests, 137 assertions, 1 failures, 0 errors

Time for a change to that ‘rescue’ clause, I’d say.

Thanks,
–Wilson.