I am trying to find a set of keys within specific files under a specific
directory. I read the keys from a file and iterate through them, opening
and searching every file under the specified directory. However, only
the last key seems to be found in the files…
srcFiles = Dir.glob(File.join("**", "*.txt"))
keys = File.readlines("sp.txt")
keys.each { |key|
  srcFiles.each { |src|
    linenumber = 0
    File.readlines(src).each { |line|
      linenumber += 1
      if line.include? key then
        puts "found #{key}"
      end
    }
  }
}
2010/5/3 Qnmt M.:
    linenumber = 0
    File.readlines(src).each { |line|
      linenumber += 1
      if line.include? key then
        puts "found #{key}"
      end
    }
  }
}
This is likely caused by the fact that you do not postprocess what
you get from File.readlines:
$ echo 111 >| x
$ echo 222 >> x
$ ruby19 -e 'p File.readlines("x")'
["111\n", "222\n"]
$
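Note the trailing line delimiter. Each key therefore still ends in "\n",
so line.include? key only matches where the key sits at the very end of
a line. The last key is presumably found because the last line of sp.txt
has no trailing newline.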
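A minimal sketch of that postprocessing, assuming one key per line in the key file. The helper name read_keys and the temporary file standing in for sp.txt are made up for illustration. It strips the line delimiter with chomp and also skips blank lines:

```ruby
require "tempfile"

# Read one key per line, dropping the trailing newline and skipping blank lines.
# (read_keys is a hypothetical helper name, not part of the original script.)
def read_keys(path)
  File.readlines(path).map(&:chomp).reject(&:empty?)
end

# Throwaway key file standing in for sp.txt.
Tempfile.create("keys") do |f|
  f.write("111\n222\n")
  f.flush

  line = "value 111 appears mid-line\n"

  # Raw keys still carry "\n", so a key in the middle of a line is missed.
  p File.readlines(f.path).any? { |key| line.include?(key) }

  # Chomped keys match wherever they occur in the line.
  p read_keys(f.path).any? { |key| line.include?(key) }
end
```

In the original script this would amount to replacing File.readlines("sp.txt") with File.readlines("sp.txt").map(&:chomp).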
Also, your approach is very inefficient: you open and read every file
once per key. You would do better to swap the outer and inner loops, so
that each file is opened only once and every line is checked against all
keys.
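The swapped loop order could look roughly like the sketch below. The method name find_keys and the demo file are assumptions for illustration, and the keys are taken to be already stripped of their newlines. Each file is read once, line by line, and every line is tested against all keys:

```ruby
require "tmpdir"

# Return [path, line_number, key] for every key found on every line.
# (find_keys is a hypothetical name chosen for this sketch.)
def find_keys(keys, paths)
  hits = []
  paths.each do |path|
    # File.foreach streams the file line by line, so each file is opened once.
    File.foreach(path).with_index(1) do |line, lineno|
      keys.each do |key|
        hits << [path, lineno, key] if line.include?(key)
      end
    end
  end
  hits
end

# Demo on a throwaway file; in the original setting the keys would come from
# sp.txt and the paths from Dir.glob(File.join("**", "*.txt")).
Dir.mktmpdir do |dir|
  path = File.join(dir, "a.txt")
  File.write(path, "alpha 111\nbeta\ngamma 222 111\n")
  find_keys(%w[111 222], [path]).each do |file, lineno, key|
    puts "found #{key} in #{File.basename(file)}:#{lineno}"
  end
end
```

Collecting the hits in an array instead of printing inside the loop is only a design choice for this sketch; it keeps the search separate from the output.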
Btw, what you attempt to do can be done by GNU find and fgrep already:
$ find . -type f -name '*.txt' -print0 | xargs -r0 fgrep -f sp.txt
Or, with a shell that knows "**" expansion, e.g. zsh:
$ fgrep -f sp.txt **/*.txt
If you are only interested in file names you can add option -l to fgrep.
Kind regards
robert
Thanks for your reply and advice, Robert.
The problem really was the missing postprocessing of the result of
File.readlines, and your idea of switching the loop order significantly
improved the performance.
As for doing the same thing with GNU commands: I wrote this for a Windows
environment, and I am not sure whether it has such a command-line utility.
cem