People,
In response to people's suggestions about speeding up my script by
replacing output to many small files with output to one large file, I
have implemented a hash table that I can write out with YAML. However, I
find that as the hash table gets larger, the script slows down . . but
when I try to work out what is happening by producing a small test script
that does more or less the same thing, I can't reproduce the problem . .
The test script is:
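#!/usr/bin/ruby

h1 = Hash.new( 0 )
srand( 0 )  # seed the generator; "srand = 0" only set a local variable

# Upper bounds for the four nested loops.
# Only the last assignment takes effect.
seeds = [ '01', '01', '01', '22' ]
seeds = [ '01', '01', '20', '22' ]
seeds = [ '01', '32', '20', '22' ]
seeds = [ '50', '32', '20', '22' ]

for a in '01' .. seeds[0]
  start = Time.now  # time one iteration of the outer loop
  puts a
  for b in '01' .. seeds[1]
    puts b
    for c in '01' .. seeds[2]
      puts c
      for d in '01' .. seeds[3]
        print "#{d} "
        # Note: rand(1) always returns 0, and it is evaluated once per
        # inner array, so each value is an array of twenty zeros.
        h1[ "#{a}.#{b}.#{c}.#{d}" ] = Array.new(2){ Array.new(1){
          Array.new( 20, rand(1) ) } }
      end
      puts
    end
    puts
  end
  puts
  stop = Time.now
  puts stop - start
end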
The script is faster with the hash insertion commented out, of course,
and in that case the time per iteration of the outer loop is constant in
both scripts. With the insertion enabled, the test script's iterations
take longer but the time per iteration is STILL constant. In my actual
script, however, with the insertion enabled the time per iteration of the
outer loop gets longer and longer, e.g. from 36 sec to a few minutes,
before I kill it about halfway through . .
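
To make the comparison above more concrete, here is a minimal sketch of the kind of measurement I mean. It times batches of insertions, not the whole outer loop, and logs the hash size and the number of garbage collection runs per batch. The helper name time_insertions and the batch sizes are made up for illustration and are not taken from my production script. It also assumes a Ruby new enough to have GC.count (1.9 or later).

```ruby
# Hypothetical helper: insert `batches` batches of `per_batch` keys into a
# fresh hash and record, per batch, the elapsed time, the hash size
# afterwards, and how many GC runs happened during the batch.
def time_insertions(batches, per_batch)
  h = Hash.new(0)
  timings = []
  batches.times do |i|
    gc_before = GC.count
    start = Time.now
    per_batch.times do |j|
      # Same shape of value as in the test script: 2 x 1 x 20 nested arrays.
      h["#{i}.#{j}"] = Array.new(2) { Array.new(1) { Array.new(20, 0) } }
    end
    timings << { :batch   => i,
                 :size    => h.size,
                 :seconds => Time.now - start,
                 :gc_runs => GC.count - gc_before }
  end
  timings
end

time_insertions(5, 1000).each do |t|
  printf("batch %d  size %d  %.4f s  %d GC runs\n",
         t[:batch], t[:size], t[:seconds], t[:gc_runs])
end
```

If the seconds column stays flat as size grows, the insertion itself is probably not the part that slows down. A rising gc_runs column would point towards memory pressure instead.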
Can anyone suggest a way of working out why the production hash
insertion behaves differently and somewhat unexpectedly?
Thanks,
Phil.
Philip R.
GPO Box 3411
Sydney NSW 2001
Australia