Win32 unicode filename support?

Hi,

I’m reading the win32-utils docs
( http://rubyforge.org/docman/?group_id=85 ) to try to determine
whether Unicode filename support is provided by the win32-file
module.

I didn’t see much mention of unicode, except in the win32-dir
docs, which mentioned unicode was not yet properly supported
on “create_junction”.

That leads me to suspect maybe unicode is supported on the
other routines.

If so, my next question would be, which encoding format should
the filename string be in when passed to, say, File.open() ?
UTF8 ?

Thanks for your help,

Bill

On Feb 12, 6:32 pm, “Bill K.” [email protected] wrote:

That leads me to suspect maybe unicode is supported on the
other routines.

If so, my next question would be, which encoding format should
the filename string be in when passed to, say, File.open() ?
UTF8 ?

Since File.open isn’t redefined in win32-file, it won’t make a
difference one way or the other. :)

However, in general I would say that Unicode support is even worse
than I originally thought (though, no worse than stock Ruby). What I’m
considering is to seriously rework windows-pr such that I explicitly
define the ANSI and Wide versions of every function, then do something
like this in the source:

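def GetFileAttributes(file)
  # With -Ku (or $KCODE set to UTF8), dispatch to the wide version.
  if $KCODE == 'UTF8'
    GetFileAttributesW.call(file)
  else
    # Otherwise fall back to the ANSI version.
    GetFileAttributesA.call(file)
  end
end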

By using the -Ku option (or just setting $KCODE directly) you would
get the wide version. Otherwise, you get the ANSI version. This seemed
to work pretty well in the few tests that I’ve done.

The most difficult part for authors using these functions would be to
remember to convert strings to wide character versions first, before
passing them to the functions. I provided a couple of helper methods
for that, multi_to_wide and wide_to_multi, which handle the most
typical cases.
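
For illustration only, here is a rough sketch of what such a conversion could look like on a Ruby that has String#encode (1.9 or later). It is not the windows-pr implementation of multi_to_wide / wide_to_multi; the method names utf8_to_wide and wide_to_utf8 are made up for this example, and it assumes the input is valid UTF-8.

```ruby
# Hypothetical stand-ins for the helpers mentioned above;
# NOT the windows-pr implementation.

# Convert a UTF-8 string to a NUL-terminated UTF-16LE byte string,
# the form that the wide ("W") Win32 functions expect.
def utf8_to_wide(str)
  wide = str.encode('UTF-16LE') + "\0".encode('UTF-16LE')
  wide.force_encoding('BINARY')
end

# Convert a UTF-16LE byte string (optionally NUL-terminated)
# back to a UTF-8 string.
def wide_to_utf8(bytes)
  utf8 = bytes.dup.force_encoding('UTF-16LE').encode('UTF-8')
  utf8.sub(/\0+\z/, '') # drop the trailing terminator, if any
end
```

The result of utf8_to_wide is tagged as binary so that it is treated as a raw buffer rather than as text once it is handed to an API call.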

Regards,

Dan

Hi,

From: “Daniel B.” [email protected]

On Feb 12, 6:32 pm, “Bill K.” [email protected] wrote:

If so, my next question would be, which encoding format should
the filename string be in when passed to, say, File.open() ?
UTF8 ?

Since File.open isn’t redefined in win32-file, it won’t make a
difference one way or the other. :)
[…]
I provided a couple of helper methods
for that, multi_to_wide and wide_to_multi, which handle the most
typical cases.

Sorry if I’ve misunderstood your post - but does this mean
that there’s currently no way to open/create a file with a
unicode filename with the win32 tools? Or does this mean
that the multi_to_wide methods already exist, and that I
should be using those?

Thanks,

Bill

On Feb 12, 9:49 pm, “Bill K.” [email protected] wrote:

Since File.open isn’t redefined in win32-file, it won’t make a
difference one way or the other. :)
[…]
I provided a couple of helper methods
for that, multi_to_wide and wide_to_multi, which handle the most
typical cases.

Sorry if I’ve misunderstood your post - but does this mean
that there’s currently no way to open/create a file with a
unicode filename with the win32 tools?

At the moment, no.

Or does this mean
that the multi_to_wide methods already exist, and that I
should be using those?

Even with multi_to_wide you would still have to define the wide
character functions yourself because I haven’t done it yet in
windows-pr. I’m doing it right now, but it will be some time before
I’m finished. We’re talking hundreds of functions here.

But you could use a mix of windows-pr functions and custom-defined
functions using Win32API. I’ll provide a sample later this week when I
get some more time.

Regards,

Dan

Hi,

Is there any new progress in support of unicode filenames? Plans?

Bill: Did you find another solution?

From: “Jarek Kubos” [email protected]

Bill: Did you find another solution?

I went with a crude workaround. In our case, we’re embedding ruby into
a C++ app, so I added hooks so that I could call the subset of our
unicode-aware file I/O routines on the C++ side that I actually needed,
from ruby.

If I had needed anything more comprehensive than just a few routines
like “read_binary_file(path)” and “write_binary_file(path, data)”, I
might have considered a different approach, like actually patching the
win32 ruby source to convert utf8 to utf16 at the last moment before
making the equivalent “wide” API calls.
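
To make the "convert at the last moment" idea concrete, here is a minimal Ruby-level sketch (the real patch would live in the C source, which this does not attempt to show). The name call_wide is hypothetical, and the lambda is only a stand-in for a real wide-character API binding, so that the example runs on any platform with Ruby 1.9 or later.

```ruby
# Hypothetical wrapper: the caller works with UTF-8 paths everywhere,
# and the conversion to a NUL-terminated UTF-16LE buffer happens only
# at the boundary, right before the wide function is invoked.
def call_wide(utf8_path, wide_function)
  wide_path = utf8_path.encode('UTF-16LE', 'UTF-8') + "\0".encode('UTF-16LE')
  wide_function.call(wide_path.b) # .b returns a binary-tagged copy
end

# Stand-in for a real "W" API binding (an assumption for this sketch);
# it only reports how many bytes it was handed.
fake_wide_api = ->(buffer) { buffer.bytesize }

puts call_wide("r\u00e9sum\u00e9.txt", fake_wide_api)
```

The point of the design is that nothing above the wrapper needs to know about UTF-16 at all; only the call site does.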

The last mention of win32 unicode API support I saw was in a ruby-core
message [ruby-core:17759] , entitled, “Ruby 1.9.1 Feature and 1.9.0-3
release plan”.

See the “What will not” section, below:

== Rough plan of the 1.9.1 features
=== What will be included in 1.9.1

  • default values in block parameters
  • built-in coverage measurement
  • improved version of transcode
  • transcode C API (experimental; only for internal use)
  • unicode console support on mswin32, mswin64 and mingw32
  • some platforms will no longer be supported.

=== What will not

  • miniunit (we need more discussion. it will possibly be included)
  • Win32 unicode API support (1.9.2?)
  • one-byte trap instruction (1.9.2?)
  • dtrace (1.9.2?)
  • Multi-VM (1.9.5?)

It’s encouraging that it’s at least on the list. But it looks like it
may be a while.

Regards,

Bill