Forum: Ruby files on a folder too many

Mario Ruiz (tcblues)
on 2014-04-23 15:16
Hi,
I need to perform an action on every single file in a folder that
contains almost a million files, but when I do:

Dir.foreach(dir) do |f|
  # what I want to do in here
end

it takes forever to start.

Any idea how to do this?
Durk B. (dirk_b)
on 2014-04-23 18:29
dir -m1 > <files-to-process-with-ruby-in-a-textfile-line-by-line>

Too obvious?
Mario Ruiz (tcblues)
on 2014-04-25 11:55
Same problem, it takes too long. Isn't there any way to go through all
the files without getting all of them at once?

Durk B. wrote in post #1143875:
> dir -m1 > <files-to-process-with-ruby-in-a-textfile-line-by-line>
>
> Too obvious?
Joel Pearson (virtuoso)
on 2014-04-27 22:49
If you have to do something to every file then you're stuck with
iterating through all of them. If you could filter them in some way that
would be helpful, but if you have to perform the action on all of them
you'll have to accept that it will take some time.

You might get minor performance differences by finding ways to get a
list of all the files, but eventually you'll have to start a loop with
almost a million iterations; and it sounds like there's no way to avoid
that.
Damián M. González (igorjorobus)
on 2014-04-28 01:02
Did you try printing something like "xxxx files revised" after each file
is processed, to see that Ruby is not stuck but is working? That way you
can see whether Ruby has started or not.
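
A minimal sketch of that progress check, in case it helps. The helper name process_with_progress and its every: and out: parameters are made up for illustration; only Dir.foreach and File.join come from Ruby itself. It prints a line every few thousand files rather than after each one, so the output does not become the bottleneck.

```ruby
# Hypothetical helper (not from the thread): iterate a directory with
# Dir.foreach and report progress every `every` files.
# `out` is any object that responds to puts (defaults to $stderr).
def process_with_progress(dir, every: 10_000, out: $stderr)
  count = 0
  Dir.foreach(dir) do |name|
    # Dir.foreach also yields the "." and ".." entries; skip them.
    next if name == "." || name == ".."

    # Hand the full path to the caller's block, if one was given.
    yield File.join(dir, name) if block_given?

    count += 1
    out.puts "#{count} files processed" if (count % every).zero?
  end
  count
end
```

If the first progress line shows up quickly, the loop has started and the time is going into the per-file work rather than into listing the directory.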
Mario Ruiz (tcblues)
on 2014-04-28 12:51
It seems that both Dir.foreach and Dir.entries(dir) collect all the files
in that folder before continuing execution, so it takes too much time.
My question was whether in Ruby I can access the folder's entries through
a position in memory and dynamically move that position to the next
item, so that it is not necessary to collect all the elements at the
start.
Vlad M_ (vladm)
on 2014-04-28 13:49
dir = Dir.new('.')

# Take the first 20 entries and build an array of file names

dir.take(20).map { |item| item.to_s }

or


# Reads the next entry from dir and returns it as a string. Returns nil
# at the end of the stream.
http://www.ruby-doc.org/core-2.1.1/Dir.html#method-i-read

dir.read


You might also look into
http://www.ruby-doc.org/core-2.1.1/Enumerator/Lazy.html
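
Putting the two suggestions above together, here is a rough sketch of reading one entry at a time with Dir#read, plus a lazy variant that stops after the first n entries. The helper names each_entry_streaming and first_entries are invented for this example. Whether this actually starts faster on a folder with a million files depends on the operating system and filesystem, so treat it as something to try rather than a guaranteed fix.

```ruby
# Hypothetical helper: yield directory entries one by one using Dir#read,
# which returns the next entry as a string, or nil at the end of the stream.
def each_entry_streaming(path)
  dir = Dir.new(path)
  begin
    while (name = dir.read)
      # Skip the "." and ".." entries.
      next if name == "." || name == ".."
      yield name
    end
  ensure
    # Always release the directory handle.
    dir.close
  end
end

# Hypothetical helper: Dir includes Enumerable, so it can be made lazy.
# Iteration stops as soon as n entries have been collected.
def first_entries(path, n)
  dir = Dir.new(path)
  dir.lazy.reject { |name| name == "." || name == ".." }.first(n)
ensure
  dir.close if dir
end
```

Usage would look like each_entry_streaming(".") { |name| puts name } or first_entries(".", 20).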