Hi all,
I’m trying to write a little script to read files in a directory (x
bytes at a time), do an md5 checksum of the bytes and print them to a
text file.
Everything is working fine at the moment except the reading. I have
managed to get it to read the first x bytes from the file, but I’m not
sure how to get it to keep reading until EOF is reached.
This is what I want to achieve:
- Specify a blocksize to use (not a problem)
- Read a file in chunks (using the blocksize)
- md5 checksum the bytes (I’ve worked this part out)
- write the md5sum to a file (I’ve got this also)
How can I retrieve the chunks until EOF, maybe returning a smaller
chunk at the end if there isn’t enough data left?
I hope this post isn’t too badly written; it’s very late at night and
I’ve been googling this for ages.
Any help much appreciated.
Matt
I’ve just played around and found this seems to work:
    File.open(path, "r") do |fh|
      while (chunk = fh.read(blocksize))
        outFH.puts Digest::MD5.hexdigest(chunk) + "\n"
      end
    end
Is this a good way to do it?
Thanks
Matt
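For reference, the loop above terminates because read with a positive length returns nil once the end of the file is reached, and the last chunk is simply shorter when fewer than blocksize bytes remain. A minimal sketch of that behaviour, using a StringIO in place of a real file; the helper name read_chunks and the sample data are made up for illustration:

```ruby
require "stringio"

# Hypothetical helper: collect every chunk that read(blocksize) returns.
# The loop stops as soon as read returns nil, i.e. at end of input.
def read_chunks(io, blocksize)
  chunks = []
  while (chunk = io.read(blocksize))
    chunks << chunk
  end
  chunks
end

io = StringIO.new("abcdefghij")   # 10 bytes of sample data
p read_chunks(io, 4)              # => ["abcd", "efgh", "ij"]
p io.read(4)                      # => nil (already at EOF)
```

With 10 bytes and a block size of 4, the final chunk holds the 2 leftover bytes, so no special handling is needed for the tail of the file.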
On 19 Aug., 02:21, Matt H. [email protected] wrote:
I’ve just played around and found this seems to work:
    File.open(path, "r") do |fh|
      while (chunk = fh.read(blocksize))
        outFH.puts Digest::MD5.hexdigest(chunk) + "\n"
      end
    end
Is this a good way to do it?
Somehow my posting from this morning made it neither to Google news
nor to the mailing list. Strange…
To sum it up: yes, that’s a good way to do it. A few remarks:
You do not need + "\n" because #puts already appends a newline.
I prefer to open with "rb" instead of "r" in these cases. It makes
scripts more portable and also documents that this is really a
binary stream.
You can preallocate the buffer, which saves a bit of GC:
    File.open(path, "rb") do |fh|
      chunk = ""                        # buffer reused by every read
      while fh.read(blocksize, chunk)   # returns nil at EOF
        outFH.puts Digest::MD5.hexdigest(chunk)
      end
    end
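Putting the three remarks together, a self-contained version might look roughly like the sketch below. The method name write_chunk_digests and the file names in the usage comment are invented for illustration and are not from the original script; String.new is used for the buffer only so that it stays unfrozen if frozen string literals are enabled.

```ruby
require "digest/md5"

# Hypothetical helper: writes one MD5 hex digest per block of the file at
# +path+ to +out+, which can be any object that responds to #puts.
def write_chunk_digests(path, blocksize, out)
  File.open(path, "rb") do |fh|        # "rb": treat the file as binary
    chunk = String.new                 # one buffer, refilled by each read
    while fh.read(blocksize, chunk)    # nil at EOF ends the loop
      out.puts Digest::MD5.hexdigest(chunk)   # puts adds the newline
    end
  end
end

# Example use (file names are placeholders):
#   File.open("checksums.txt", "w") do |out|
#     write_chunk_digests("data.bin", 4096, out)
#   end
```

Passing the output object in, instead of relying on an outer variable, also makes it easy to try the method with $stdout or a StringIO.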
Kind regards
robert