Forum: Ruby
Ruby 1.9 still cannot list all files on Vista or XP?

winter h. (Guest)
on 2009-04-08 09:59
I just tried using Ruby 1.9 and it seemed that it still cannot list all
files in a folder on XP or Vista when the filenames contain Chinese
characters, Japanese characters, or any foreign characters other than
English.

These two methods are used:   entries and glob

files = Dir.new(basedir).entries

Dir.chdir(basedir)
files = Dir.glob("*")

both methods show ????????.txt   when the filename has foreign
characters.  Can Ruby 1.9 readily handle this task rather than resorting
to Win32API?  Thanks.
Bosko I. (Guest)
on 2009-04-08 11:36
(Received via mailing list)
On Apr 8, 7:59 am, SpringFlowers AutumnMoon 
<removed_email_address@domain.invalid>
wrote:
> files = Dir.glob("*");
>
> both methods show ????????.txt   when the filename has foreign
> characters.  Can Ruby 1.9 readily handle this task rather than resorting
> to Win32API?  Thanks.
> --
> Posted via http://www.ruby-forum.com/.

What is the code page of your command prompt? When ??????.txt is shown in
the console on Windows, it is usually a problem with the code page settings.
Ryan D. (Guest)
on 2009-04-08 11:36
(Received via mailing list)
On Apr 7, 2009, at 22:59 , SpringFlowers AutumnMoon wrote:

> both methods show ????????.txt   when the filename has foreign
> characters.  Can Ruby 1.9 readily handle this task rather than
> resorting to Win32API?  Thanks.

show where? on what? what text encodings does it handle? what text
encodings did you set ruby up for?

On OSX:

% cd x
% touch ☃
% ls
☃
% ruby -e 'p Dir["*"]'
["\342\230\203"]
% ruby -KU -e 'p Dir["*"]'
["☃"]
% ~/.multiruby/install/1.9.1-p0/bin/ruby -e 'p Dir["*"]'
["☃"]

You've got 2 sides to this equation, ruby's encodings, and your
environment's encodings.
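
To see the "Ruby side" of that equation, a small sketch like the following may help. The method name `encoding_report` is made up for illustration, and the "filesystem" alias is assumed to be available (it is not in the oldest 1.9 builds).

```ruby
# Collect the encodings Ruby itself is currently working with.
# `encoding_report` is a hypothetical helper name, not a Ruby API.
def encoding_report
  {
    source:     __ENCODING__,                # encoding of this source file
    external:   Encoding.default_external,   # default for data read from outside
    internal:   Encoding.default_internal,   # nil unless set (for example by -U)
    locale:     Encoding.find("locale"),     # derived from the environment
    filesystem: Encoding.find("filesystem")  # what Ruby assumes for file names
  }
end

encoding_report.each { |name, enc| puts "#{name}: #{enc.inspect}" }
```

Comparing this output with the console's own settings should show which side is off.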
Heesob P. (Guest)
on 2009-04-08 11:59
(Received via mailing list)
2009/4/8 Ryan D. <removed_email_address@domain.invalid>:
> On OSX:
> ["☃"]
>
> You've got 2 sides to this equation, ruby's encodings, and your
> environment's encodings.
>
This is a Windows-specific issue.
Refer to the OP's original posting:
http://www.ruby-forum.com/topic/163681

As far as I know, this issue is not fixed in ruby 1.9.1

Regards,

Park Heesbob
Ryan D. (Guest)
on 2009-04-08 12:18
(Received via mailing list)
On Apr 8, 2009, at 00:58 , Heesob P. wrote:

> This is Windows specific issue.
> Refer to the OP's original posting http://www.ruby-forum.com/topic/163681
>
> As far as I know, this issue is not fixed in ruby 1.9.1

then his email doesn't belong here, it should go to ruby-core@
winter h. (Guest)
on 2009-04-08 13:46
Ryan D. wrote:
> On Apr 8, 2009, at 00:58 , Heesob P. wrote:
>
>> This is Windows specific issue.
>> Refer to the OP's original posting http://www.ruby-forum.com/topic/163681
>>
>> As far as I know, this issue is not fixed in ruby 1.9.1
>
> then his email doesn't belong here, it should go to ruby-core@

then can somebody file it in ruby-core... maybe as a bug or improvement?
for my love of Ruby... i'd like to see it work fine on Windows XP or
Vista... it is the year 2009... and we are a long way into unicode and
i18n issues...  if Ruby cannot handle listing of files properly in its
latest version for Windows which is probably the most popular OS...
then... please can it be made to work well?
Bill K. (Guest)
on 2009-04-08 14:38
(Received via mailing list)
From: "Heesob P." <removed_email_address@domain.invalid>
>
> As far as I know, this issue is not fixed in ruby 1.9.1

Hmm.  If I have correctly understood matz in [ruby-core:20110] ,
Unicode path support for windows was supposed to be fixed:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...

"In short, if you're using UTF-8 for your program encoding, you
should not see any problem  (if you do, it's a bug)."


Regards,

Bill
winter h. (Guest)
on 2009-04-08 14:52
Bill K. wrote:
> From: "Heesob P." <removed_email_address@domain.invalid>
>>
>> As far as I know, this issue is not fixed in ruby 1.9.1
>
> Hmm.  If I have correctly understood matz in [ruby-core:20110] ,
> Unicode path support for windows was supposed to be fixed:
>
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...
>
> "In short, if you're using UTF-8 for your program encoding, you
> should not see any problem  (if you do, it's a bug)."
>

is it by
# coding: utf-8
or
# encoding: utf-8

?  are those for specifying that the current program file is in UTF8 ?
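
As far as I know, both spellings work, because Ruby only looks for "coding:" followed by a name in the first comment line. A rough way to check this is sketched below; the helper `source_encoding_for` and the global `$seen_encoding` are invented for the demonstration, and ISO-8859-1 is used only so the result differs from the default.

```ruby
require "tempfile"

# Load a tiny script that starts with the given magic comment and report
# the source encoding Ruby assigned to it.
# `source_encoding_for` and `$seen_encoding` are hypothetical names.
def source_encoding_for(magic_comment)
  Tempfile.create(["magic", ".rb"]) do |f|
    f.write("#{magic_comment}\n$seen_encoding = __ENCODING__\n")
    f.flush
    load f.path
  end
  $seen_encoding
end

p source_encoding_for("# coding: iso-8859-1")
p source_encoding_for("# encoding: iso-8859-1")
```

Note that the comment only describes the encoding of the program file itself; it does not change how file names are read from disk.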
winter h. (Guest)
on 2009-04-08 23:43
SpringFlowers AutumnMoon wrote:
> Bill K. wrote:
>> From: "Heesob P." <removed_email_address@domain.invalid>
>>>
>>> As far as I know, this issue is not fixed in ruby 1.9.1
>>
>> Hmm.  If I have correctly understood matz in [ruby-core:20110] ,
>> Unicode path support for windows was supposed to be fixed:
>>
>> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...
>>
>> "In short, if you're using UTF-8 for your program encoding, you
>> should not see any problem  (if you do, it's a bug)."
>>
>
> is it by
> # coding: utf-8
> or
> # encoding: utf-8
>
> ?  are those for specifying that the current program file is in UTF8 ?

does someone know how to solve this?   to make some file have
international characters, it is really simple:  can go to Google News
and look at news from China or Taiwan or Hong Kong, and then copy and
paste the text into a filename on Windows XP or Vista.   thanks.
Bill K. (Guest)
on 2009-04-09 02:01
(Received via mailing list)
From: "SpringFlowers AutumnMoon" <removed_email_address@domain.invalid>
> and look at news from China or Taiwan or Hong Kong, and then copy and
> paste the text into a filename on Windows XP or Vista.   thanks.

Sorry, I haven't made the time to experiment with ruby1.9 much yet.
(Even though I am interested in this feature.)

Here are a couple threads from ruby-core that show examples of
using the # encoding: UTF-8 tag.

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...


(Warning: It looks like the ruby-core mailing list archive software itself
doesn't handle the encoding, so the messages are displayed with bogus
remnants of the quoted-printable syntax left over, like =3D and =20.)


But anyway... As I understand it, before you paste characters into
your editor, you'll need to make sure your editor is using UTF-8
encoding for the file you're editing.  And put the #encoding: UTF-8
tag at the top of the file.


Hope this helps,

Bill
winter h. (Guest)
on 2009-04-09 02:44
Bill K. wrote:

>
> But anyway... As I understand it, before you paste characters into
> your editor, you'll need to make sure your editor is using UTF-8
> encoding for the file you're editing.  And put the #encoding: UTF-8
> tag at the top of the file.

ah it is not really about using UTF-8 in my program file... it is about
getting UTF-8 file listing on Vista and XP.
Bill K. (Guest)
on 2009-04-09 06:39
(Received via mailing list)
From: "SpringFlowers AutumnMoon" <removed_email_address@domain.invalid>
>
> ah it is not really about using UTF-8 in my program file... it is about
> getting UTF-8 file listing on Vista and XP.

Oh.  When you wrote:

>    to make some file have
> international characters, it is really simple:  can go to Google News
> and look at news from China or Taiwan or Hong Kong, and then copy and
> paste the text into a filename on Windows XP or Vista.

...I misunderstood and thought you meant pasting the characters into
your ruby source file.  (I see now you were talking about a filename.)


Well OK - so I built the latest from the ruby 1.9.1 branch in subversion
and attempted to have ruby read a directory containing a filename
with Chinese characters, and then open and read the contents of the
file...

My script was:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (win32_unicode.rb) ~~~
# encoding: UTF-8

files = Dir["T:/zz/*.txt"]

x = files.first
p x, x.encoding

dat = open(x, "r:UTF-8") {|f| f.read}

p dat, dat.encoding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The result was:

ruby19 win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
        from win32_unicode.rb:8:in `open'
        from win32_unicode.rb:8:in `<main>'

I also tried with the -U flag and -E UTF-8 flag:

ruby19 -E UTF-8 win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
        from win32_unicode.rb:8:in `open'
        from win32_unicode.rb:8:in `<main>'

ruby19 -U win32_unicode.rb
"T:/zz/???????.txt"
#<Encoding:UTF-8>
win32_unicode.rb:8:in `initialize': Invalid argument - T:/zz/???????.txt (Errno::EINVAL)
        from win32_unicode.rb:8:in `open'
        from win32_unicode.rb:8:in `<main>'

ruby19 -v
ruby 1.9.1p0 (2009-03-04) [i386-mswin32_71]


Note, it doesn't bother me that the filename displays as ???????.txt in
the command window; what bothers me is that ruby seems unable to open
a filename it just obtained via Dir[].

So, unless I have bungled my test somehow, it seems likely there is a
problem.

If so, as Ryan pointed out, we should move this to the ruby-core list.


Regards,

Bill
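
The round-trip problem described in this post (a name comes back from Dir[] but cannot be used again) can be checked with a short sketch like this one. The helper name `unreachable_entries` is hypothetical; the T:/zz path is the one from the test above.

```ruby
# Return the paths that a glob reports but that Ruby cannot then find again.
# `unreachable_entries` is a made-up helper name for this sketch.
def unreachable_entries(pattern)
  Dir.glob(pattern).reject do |path|
    # symlink? keeps dangling symlinks from being counted as failures
    File.exist?(path) || File.symlink?(path)
  end
end

# On a system where names round-trip correctly this prints an empty array.
p unreachable_entries("T:/zz/*.txt")
```

Any path it prints is one whose name was altered somewhere between the file system and the Ruby string.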
winter h. (Guest)
on 2009-04-09 06:45
Bill K. wrote:

> Note, it doesn't bother me that the filename displays as ???????.txt in
> the
> command window, but rather the issue that ruby seems unable to open
> a filename it just obtained via Dir[].
>
> So, unless I have bungled my test somehow, it seems likely there is a
> problem.
>
> If so, as Ryan pointed out, we should move this to the ruby-core list.


yeah, it looks like the string Ruby gets back really contains ???????.txt;
it is not just a matter of how it is printed out.

Mr. Park Heesbob had a solution but it involved using Win32.  If Ruby
1.9 can handle it without using Win32 that'd be great.
winter h. (Guest)
on 2009-04-09 09:47
SpringFlowers AutumnMoon wrote:
> Bill K. wrote:
>
>> Note, it doesn't bother me that the filename displays as ???????.txt in
>> the
>> command window, but rather the issue that ruby seems unable to open
>> a filename it just obtained via Dir[].
>>
>> So, unless I have bungled my test somehow, it seems likely there is a
>> problem.
>>
>> If so, as Ryan pointed out, we should move this to the ruby-core list.
>
>
> yeah, it looks like the file name is actually stored as ???????.txt, not
> just when printed out.
>
> Mr. Park Heesbob had a solution but it involved using Win32.  If Ruby
> 1.9 can handle it without using Win32 that'd be great.


actually... the solution that Park posted involved

files = `cmd /u /c dir /b `.split("\r\000\n\000")
which is to execute a system exe...  gee...

the complete solution at:
http://www.ruby-forum.com/topic/163681

(need to use the line above and some Win32API calls)
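
The split on "\r\000\n\000" works because cmd /u writes UTF-16LE, where each CR and LF is followed by a zero byte. A possibly cleaner variation, sketched here with a hypothetical helper `decode_dir_output` and simulated bytes in place of the real command, is to transcode the output to UTF-8 first:

```ruby
# Turn raw UTF-16LE bytes (as printed by `cmd /u`) into UTF-8 file names.
# `decode_dir_output` is a hypothetical helper, not part of Park's solution.
def decode_dir_output(raw)
  raw.dup.force_encoding("UTF-16LE").encode("UTF-8").split(/\r?\n/)
end

# Simulated output; on Windows the bytes would come from `cmd /u /c dir /b`.
sample = "a.txt\r\n\u96EA.txt\r\n".encode("UTF-16LE").b
p decode_dir_output(sample)
```

This only gives readable names; opening those files on the Ruby version discussed here would still need the Win32API calls from the linked topic.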
winter h. (Guest)
on 2009-04-10 01:22
I wonder, if people use Ruby in Japan or France, how are the characters
handled on Win XP or Vista?

For example, to write a script that will look at all files and if the
file name contains a word in Japanese or French, then back it up to
another hard disk.  Even this task is not possible?
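
For reference, such a backup script might look roughly like the sketch below on a Ruby where directory listings return the real names. It assumes `Dir.entries` accepts an `encoding:` keyword (true in later Ruby versions); the helper name `backup_matching` and the paths in the comment are hypothetical.

```ruby
require "fileutils"

# Copy every regular file in src whose name contains `word` into dest.
# Returns the names that were copied. `backup_matching` is a made-up name.
def backup_matching(src, dest, word)
  FileUtils.mkdir_p(dest)
  names = Dir.entries(src, encoding: "UTF-8").select do |name|
    File.file?(File.join(src, name)) && name.include?(word)
  end
  names.each do |name|
    FileUtils.cp(File.join(src, name), File.join(dest, name))
  end
  names
end

# Hypothetical usage:
#   backup_matching("D:/docs", "E:/backup", "caf\u00E9")
```

If the listing hands back question marks instead of the real characters, the `include?` test never matches and the copy would fail anyway, which is exactly the problem in this thread.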
winter h. (Guest)
on 2009-04-10 01:37
Bosko I. wrote:

> What is code page of your command prompt? When ??????.txt is shown in
> the console on Windows it is usually problem of code page settings.

i actually used each_byte to dump out the bytes in the file name...
they really are the ASCII code of the question mark (0x3F)...   so the
string i got back really contains question marks; it is not related to
the command prompt.
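
A byte dump of that kind can be done with something like the following sketch (`hex_bytes` is a made-up helper name). A genuine question mark is the single byte 3f, while a UTF-8 encoded CJK character takes three bytes.

```ruby
# Show each byte of a string in hex so a real "?" (0x3f) is unmistakable.
# `hex_bytes` is a hypothetical helper name.
def hex_bytes(name)
  name.each_byte.map { |b| format("%02x", b) }.join(" ")
end

puts hex_bytes("???.txt")      # 3f 3f 3f 2e 74 78 74
puts hex_bytes("\u96EA.txt")   # e9 9b aa 2e 74 78 74
```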
winter h. (Guest)
on 2009-04-14 10:07
Mr. Park Heesbob, I wonder if you are actually changing Ruby 1.9 so that
it will handle it?  I see you filed a bug on Ruby Core related to this.
thanks.
Heesob P. (Guest)
on 2009-04-14 10:33
(Received via mailing list)
Hi,

2009/4/14 SpringFlowers AutumnMoon <removed_email_address@domain.invalid>:
> Mr. Park Heesbob, I wonder if you are actually changing Ruby 1.9 so that
> it will handle it?  I see you file a bug on Ruby Core related to this.
> thanks.
>
I would change it if I could, but it is beyond my ability.

According to
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/... ,
it will be handled in Ruby 1.9.2.

Regards,

Park H.
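
For readers on a later Ruby, the listing-then-reading test from earlier in this thread can be written roughly as below. This is only a sketch: it assumes a Ruby version where `Dir.entries` takes an `encoding:` keyword, and `read_listed_files` is a hypothetical helper name. Whether a particular Windows build returns the real names still has to be checked on that build.

```ruby
# List a directory asking for UTF-8 names, then read each regular file.
# Returns a hash of file name => file contents.
# `read_listed_files` is a made-up helper name for this sketch.
def read_listed_files(dir)
  Dir.entries(dir, encoding: "UTF-8").each_with_object({}) do |name, contents|
    path = File.join(dir, name)
    next unless File.file?(path)          # skips ".", ".." and subdirectories
    contents[name] = File.read(path, encoding: "UTF-8")
  end
end
```

Calling it on the T:/zz directory from the earlier test would raise Errno::EINVAL (or a similar error) if the names still come back as question marks.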