I’m wondering if anyone knows much about the efficiency of Ruby’s IO#read.
Specifically, I’m wondering about libraries I might use to speed up disk
reads.
To see what I mean, here’s some test code that iterates over an
11-megabyte file. All it does is call IO#read on a number of bytes (set
on the command-line) over the entire file, and times it.
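buf_size = ARGV[0].to_i              # bytes per read, from the command line
fd = File.open("some.txt")
start = Time.now
# Call IO#read over the entire file, buf_size bytes at a time;
# read returns nil at end of file, which ends the loop.
while fd.read(buf_size)
end
stop = Time.now
fd.close
puts "#{stop - start} seconds"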
Running this on my system yields:
$ ruby readspeed.rb 4096
$ ruby readspeed.rb 1
Obviously a big difference! This is a simplified version of the test I
was actually running, which tried to account for the extra per-call
overhead of reading 1 byte at a time. Even so, there’s still an
order-of-magnitude difference between the two. Reading one byte at a
time is slow, slow enough to bog down an entire program.
I know this is supposed to be the case with unbuffered input, such as
the POSIX “read” system call, but isn’t IO#read supposed to be
buffered? What’s causing this slowdown? I’m writing a class that will
hopefully speed up smaller reads from binary files by explicitly caching
data in memory, but I’m wondering if there are any pre-built (i.e.,
tested) solutions that Ruby programmers might be using.
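
To make the idea concrete, here is a minimal sketch of the kind of caching class I have in mind. ChunkedReader and its 64 KB default chunk size are hypothetical names and values chosen for illustration, not an existing library, and I haven't measured whether it actually helps. It wraps any object that responds to read(length), pulls data in with one large IO#read at a time, and serves small reads from that in-memory chunk.

```ruby
require "stringio"

# Hypothetical helper (not part of Ruby's standard library): serves small
# reads from an in-memory chunk, refilling it with one large IO#read.
class ChunkedReader
  def initialize(io, chunk_size = 64 * 1024)
    @io = io
    @chunk_size = chunk_size
    @chunk = "".b   # current cached chunk (binary string)
    @pos = 0        # byte offset of the next unread byte in @chunk
  end

  # Returns up to n bytes as a binary String, or nil at end of file.
  def read(n)
    out = "".b
    while out.bytesize < n
      if @pos >= @chunk.bytesize
        # Cache exhausted: fetch the next chunk from the underlying IO.
        @chunk = @io.read(@chunk_size)
        @pos = 0
        if @chunk.nil?
          @chunk = "".b
          break
        end
      end
      piece = @chunk.byteslice(@pos, n - out.bytesize)
      @pos += piece.bytesize
      out << piece
    end
    out.empty? && n > 0 ? nil : out
  end
end

# Usage: a StringIO stands in for a real file here.
reader = ChunkedReader.new(StringIO.new("hello world"), 4)
while (byte = reader.read(1))
  print byte
end
puts
```

With a real file, the same loop would be ChunkedReader.new(File.open("some.txt", "rb")). The sketch only covers sequential reads; seeking, writing, and mixing calls with the underlying IO object are left out.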