Moving all files in a folder to another hard drive

I have some code below to move all files in a folder to another hard
drive (which has 2TB of space). It runs well except that whenever a
filename has some international characters, then the line

if File.file?(basedir + file)

will fail. The file is printed as “Chart for ???.xls”

Does someone know how to solve this problem with Ruby being so powerful?
The program is running on Windows. (Vista or XP should both be ok).

code:

require 'ftools'

basedir = "c:/data/"
target = "w:/data/"

Dir.chdir(basedir)
files = Dir.glob("*")

i = 1
files.each { |file|
  p i, file
  if File.file?(basedir + file)
    puts "Moving..."
    File.move(basedir + file, target + file)
    puts "Now sleeping..."
    sleep(60)
  end
  i += 1
}
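
A sketch of the same loop for later Ruby versions, where ftools no longer ships with the standard library and fileutils takes its place. The move_files helper and the temporary directories are assumptions for illustration (they stand in for c:/data/ and w:/data/), and the sketch does not by itself address the “???” names described in the question.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: move every regular file (not directories) from
# basedir into target and return the names that were moved.
def move_files(basedir, target)
  Dir.children(basedir).select do |name|
    source = File.join(basedir, name)
    next false unless File.file?(source)

    FileUtils.mv(source, File.join(target, name))
    true
  end
end

# Demonstration in throwaway directories instead of real drives.
Dir.mktmpdir do |src|
  Dir.mktmpdir do |dst|
    File.write(File.join(src, "Chart for 日本.xls"), "demo")
    Dir.mkdir(File.join(src, "subdir"))
    moved = move_files(src, dst)
    puts "moved #{moved.size} file(s)"
  end
end
```

Dir.children skips the "." and ".." entries, so only real directory entries reach the File.file? check.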

Presumably the file names are being corrupted at some point. It could be
in Ruby, or it could be in Windows. My approach would be to put some
print statements in to find out where exactly the problem lies.

Dave B. wrote:

Presumably the file names are being corrupted at some point. It could be
in Ruby, or it could be in Windows. My approach would be to put some
print statements in to find out where exactly the problem lies.

Actually, if I use

files.each { |file|
  p i, file
  file.each_byte { |c| print c, ' ' }
  […]

then the filename prints out as a lot of 63s, which is the ASCII code of “?”,
so it looks like the filenames already come back bad…

Using $KCODE = "u" or ruby -Ku move.rb doesn’t seem to help. They
seem to only indicate that the file containing the code uses UTF-8
encoding.
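
One way to read those 63s: if the file name is converted to a single-byte code page before Ruby receives it, every character that code page cannot represent is replaced by “?”. The sketch below only simulates that replacement with String#encode (Ruby 1.9 or later); the file name and the choice of Windows-1252 are assumptions, and nothing here calls the Windows API.

```ruby
# Hypothetical file name containing characters outside Windows-1252.
name = "Chart for 日本.xls"

# Simulate a lossy conversion to a single-byte code page:
# characters with no mapping are replaced by "?".
lossy = name.encode("Windows-1252", undef: :replace, replace: "?")

puts lossy
# Print the byte values the same way the each_byte loop does.
puts lossy.bytes.join(" ")
```

Once the replacement has happened, the original characters are gone, so no setting applied afterwards on the Ruby side can bring them back.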

Mark T wrote:

Is the reason you are moving the files something to do with
instability of the present volume?
The “???” is a clue.
It may be that the fat table has been corrupted at the entry being read.
Skip this entry.

Oh, thanks for the reminder. Actually, whenever any filename has
international characters in it, the filename comes back with ??? as
well… It happens in any folder, and it happens on my other computer too,
which has RAID mirroring. I think the international characters somehow
didn’t get through into the Ruby string.

[…]
Get what you can.
If the remainder is valuable, pay someone to recover it.

Good luck. (Not really joking, seriously, well, not real serious if ya
know… blabla…)

-------- Original Message --------

Date: Mon, 25 Aug 2008 08:32:48 +0900
From: SpringFlowers AutumnMoon [email protected]
To: [email protected]
Subject: Re: Moving all files in a folder to another hard drive

[…]

Posted via http://www.ruby-forum.com/.

Have you tried using Iconv to convert between encodings?
In editors, text full of question marks suddenly becomes readable if
the right encoding is chosen…

http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/classes/Iconv.html

Best regards,

Axel

So I thought Ruby 1.9 differs from Ruby 1.8 in that a Ruby 1.9 String doesn’t
have to be ASCII… but if I run the following under Ruby 1.9, I still
get the ASCII code 63, denoting “?”, in the filenames.

basedir = "c:/data/"

i = 0
Dir.new(basedir).entries.each { |file|
  p i, file
  file.each_byte { |c| print c, ' ' }

  if (i > 10)
    break
  end
  i += 1
}

no matter which of the two methods is used:

files = Dir.new(basedir).entries

Dir.chdir(basedir)
files = Dir.glob("*");

then if I do

files.each { |file|
  p i, file
  file.each_byte { |c| print c, ' ' }

then whenever the filename has international characters, the ASCII
code 63 is printed out a lot, meaning it is “?”. I wonder whether this is
true for the Japanese version of Ruby too. Does it actually get back UTF-8
or JIS codes?

-------- Original Message --------

Date: Mon, 25 Aug 2008 20:19:27 +0900
From: SpringFlowers AutumnMoon [email protected]
To: [email protected]
Subject: Re: Moving all files in a folder to another hard drive

[…]

As far as I know, East Asian encodings use more than one byte to store a
character (due to the huge number of characters, they wouldn’t all fit
into 256 places).
This might explain why you get ? when you call each_byte there.

I am not on Windows right now, so I am not sure whether your international
files all get copied to ?(repeated x times) or whether they are copied to
names which are not displayed correctly in the file browser.
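
For comparison, a sketch of what each_byte would show if a multi-byte name had arrived intact: several bytes per character, none of which has to be 63. The sample character and the two encodings are assumptions chosen for illustration (Ruby 1.9 or later).

```ruby
# A single hypothetical character, encoded two ways.
char = "日"

utf8_bytes = char.encode("UTF-8").bytes
sjis_bytes = char.encode("Shift_JIS").bytes

# Each list holds more than one byte for the one character.
puts utf8_bytes.join(" ")
puts sjis_bytes.join(" ")
```

A run of identical 63s, one per character, therefore points to replacement with “?” rather than to a multi-byte encoding being read byte by byte.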

What you could do is CGI.escape (and maybe later CGI.unescape) them:

require "cgi"

files.each { |file|
  p CGI.escape(file)
  # and then move the files
}

That’s not elegant, but it will produce names with only % and ASCII
letters.

How does rio behave under Windows (http://rio.rubyforge.org/)?

Best regards,

Axel

Presumably you’re using the NTFS filesystem? There’s some information on
it here:

NTFS - Wikipedia

Since the filesystem has a closed specification (thank you Microsoft),
it may be that the Ruby developers have been unable to work out exactly
how it works with filenames that contain Unicode characters.

Axel E. wrote:

[…]

Thanks for helping. The CGI call actually prints out %3F, which is the
URL-encoded form of “?”. That is probably because the byte printed by
each_byte is already 63, so encoding it with the CGI method gives %3F.
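
A short sketch of why escaping comes too late here, assuming Ruby 1.9 or later with the cgi library available; both file names are made up. An intact name survives the round trip, while a name already reduced to “?” only produces %3F.

```ruby
require "cgi"

# A name that arrived intact survives an escape/unescape round trip.
intact = "Chart for 日本.xls"
escaped = CGI.escape(intact)
puts escaped
puts CGI.unescape(escaped) == intact

# A name that was already reduced to "?" only yields %3F; the original
# characters cannot be recovered from it.
damaged = "Chart for ??.xls"
puts CGI.escape(damaged)
```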

I tried Rio. It actually bypasses all files with international characters
in their names. So for example, if my directory has 10 files, and 3 of them
have filenames containing international characters, then
rio(basedir).files['.'] returns just 7 files.

On Tue, 2008-08-26 at 01:07 +0900, Dave B. wrote:

Presumably you’re using the NTFS filesystem? There’s some information on
it here:

NTFS - Wikipedia

Since the filesystem has a closed specification (thank you Microsoft),
it may be that the Ruby developers have been unable to work out exactly
how it works with filenames that contain Unicode characters.

Yes … my advice is to “shell out” to Windows or call native C
libraries, rather than trying to “reverse engineer” NTFS.


M. Edward (Ed) Borasky
ruby-perspectives.blogspot.com

“A mathematician is a machine for turning coffee into theorems.” –
Alfréd Rényi via Paul Erdős

2008/8/24 SpringFlowers AutumnMoon [email protected]:

}
As a quick workaround, try this:

require 'Win32API'
FILE_ATTRIBUTE_DIRECTORY = 0x10
MoveFileW = Win32API.new('kernel32', 'MoveFileW', 'PP', 'I')
GetFileAttributesW = Win32API.new('kernel32', 'GetFileAttributesW', 'P', 'L')

basedir = "c:/data/"
target = "w:/data/"
# Widen the ASCII paths to UTF-16LE by appending a NUL byte to every character.
basedirw = basedir.gsub(/(.)/, "\\1\000")
targetw = target.gsub(/(.)/, "\\1\000")

Dir.chdir(basedir)
# cmd /u makes dir write its output as UTF-16LE; entries end with a wide "\r\n".
files = `cmd /u /c dir /b`.split("\r\000\n\000")

i = 1
files.each { |file|
  p i, file
  # Query the attributes through the wide path and test the directory bit.
  attrs = GetFileAttributesW.call(basedirw + file + "\000")
  if (attrs & FILE_ATTRIBUTE_DIRECTORY) == 0
    puts "Moving ..."
    MoveFileW.call(basedirw + file + "\000", targetw + file + "\000")
    puts "Now sleeping..."
  end
  i += 1
}

Regards,

Park H.

Heesob P. wrote:

[…]

Wow, it really works! You rock, Park! Looks like one key line here is
the cmd /u /c dir /b, which gets the filenames as Unicode
characters.
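
As a sketch of what that line hands back, assuming Ruby 1.9 or later with String#encode: the bytes below are simulated rather than read from cmd, and the file names are made up. The second part checks that the gsub trick from the workaround matches a UTF-16LE encoding for ASCII input.

```ruby
# Simulated raw output of `cmd /u /c dir /b` for two hypothetical files.
raw = "Chart for 日本.xls\r\nnotes.txt\r\n".encode("UTF-16LE").b

# Tag the bytes as UTF-16LE, transcode to UTF-8, then split into names.
names = raw.force_encoding("UTF-16LE").encode("UTF-8").split("\r\n")
p names

# Appending a NUL byte to every ASCII character gives the same bytes
# as encoding the string as UTF-16LE.
widened = "c:/data/".gsub(/(.)/, "\\1\000")
p widened.bytes == "c:/data/".encode("UTF-16LE").bytes
```

Decoding before splitting avoids matching the byte pattern for "\r\n" across a character boundary, which a split on the raw bytes could in principle do.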

After that, I tried

p "good" if File.file?(file)

in Ruby 1.8.6 and 1.9, and they both gave an error saying that the filename
contains a null character.
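
The error is expected for UTF-16LE bytes, since every ASCII character in them is followed by a NUL byte. A minimal sketch, assuming Ruby 1.9 or later and a made-up file name; whether File.file? then accepts the decoded UTF-8 name on Windows depends on the Ruby version and is not verified here.

```ruby
# A hypothetical name as raw UTF-16LE bytes, the form the workaround produces.
wide = "Chart.xls".encode("UTF-16LE").b

rejected = false
begin
  File.file?(wide)
rescue ArgumentError => e
  # Ruby refuses path strings that contain NUL bytes.
  rejected = true
  puts "rejected: #{e.message}"
end

# Transcoding to UTF-8 removes the embedded NUL bytes.
narrow = wide.dup.force_encoding("UTF-16LE").encode("UTF-8")
puts narrow.include?("\0")
```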

Hm, I wonder how people who use Ruby on Japanese Windows XP/Vista, or on
European versions of Windows, deal with getting filenames
that have non-English characters?

Do we want to have a small speed competition and see which country can
provide the first solution using fairly standard Ruby (without
resorting to win32api)?

Thanks a lot, Park!