Performance improvement possible?

Is it only me, or is profiling tiresome and arduous?
:slight_smile:

When I did the profile, the array processing was the biggest hit - when
I got rid of the array, I almost halved the time! Ruby arrays are
pretty cool but I think you pay for the convenience . .
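
(For the record, the profiling here was nothing fancy - just the stdlib
profiler, run along the lines of

ruby -rprofile process_results.rb

which dumps a time-per-method table when the script exits; the script
name is only a stand-in. That table is where the array methods showed
up at the top.)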

But it is interesting to read that arrays were one performance bottleneck.

Eleanor,

Philip R. wrote:

The change to a single Ruby script simplified the previous setup

Done - http://pastie.org/222306
I think that this is equivalent to your original. Of course, all those
pointless conversions (and a sample_sz of 999 would 'overflow' the %02d)
are handled:

File.open( output_filename, 'w' ) do |fout|
  ...
end

That looks interesting - why should that be faster?

I tried that mod - it doesn't make any difference to the speed, but it
thrashes the disk because of all the syncing. The output files are very
small (~110 bytes), so it is probably not very efficient for 32,000
small files, but it might be good for big files?
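
Roughly, the two write patterns compare like this - the path and data
here are just stand-ins:

# buffered: the bytes land in the OS page cache and get flushed lazily -
# cheap even for thousands of tiny files
File.open('out/sample_001.dat', 'w') { |f| f.puts data }

# synced: fsync blocks until the drive acknowledges the write - safer
# against crashes, but one sync per file is what thrashes the disk
# across 32,000 small files
File.open('out/sample_001.dat', 'w') do |f|
  f.puts data
  f.fsync
end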

Thanks,

Phil.

Philip R.

Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
E-mail: [email protected]

Eleanor McHugh wrote:

Regardless of what the profiler says, any program that’s opening 32000
files and writing to them is a prime candidate for IO optimisation…

Not to mention directory fragmentation in many filesystems. :slight_smile: I’d be
looking at an ORM and SQLite rather than attempting something involving
32,000 files.

Ellie,

Eleanor McHugh wrote:

require 'benchmark'
nested creation 0.850000 0.000000 0.850000 ( 0.851341)
unrolling 15.240000 0.280000 15.520000 ( 20.927189)

which in each case is manipulating an array of 20,000,000 elements. This
would be equivalent to processing 32000 files where each contained 625
occurrences of the ‘k=’ tag.

I haven’t used BM yet but I can’t see how doing a benchmark on an array
helps? When I had an array it was slow, when I ditched the array it was
fast?

See my other note but it didn’t make much difference.

Regardless of what the profiler says, any program that’s opening 32000
files and writing to them is a prime candidate for IO optimisation…

Yes, you have started me thinking about this and the following
processing scripts now . .

file.puts *x } } }
enumerated 34.200000 2.170000 36.370000 ( 55.614969)
direct+fsync 25.300000 2.250000 27.550000 ( 60.225069)
enum+fsync 34.520000 2.420000 36.940000 ( 98.627757)

I'd be interested in equivalent benchmarking figures for your
configuration, and also an estimate of how close this 32000x625
assumption is to your actual use case, as it would help me make sense
of the doubled execution time you're seeing with my array modification.

I think I am giving up on writing lots of small files to disk - see
following post.

    v = stats06
    stats[[x,y]] = v if v != 0
    ...
end
File.open(output_filename, "a") do |file|
    file.puts *stats.values
end

which should pay off if the number of interesting results is much less
than the cubic search space. Obviously if you need the values to keep a
specific ordering you’d need additional logic…

The problem with using a hash is that I lose the mental picture of the
situation - a two-dimensional array corresponds to the physical
situation, and another array in each of these cells corresponds to the
numbers of organisms. If I have time I might look at this suggestion
though.

Thanks yet again!

Regards,

Phil.

Philip R.

Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
E-mail: [email protected]

On 26 Jun 2008, at 22:51, Philip R. wrote:

When I did the profile, the array processing was the biggest hit -
when I got rid of the array, I almost halved the time! Ruby arrays
are pretty cool but I think you pay for the convenience . .

I’m surprised at the behaviour you’re seeing so I ran a quick
benchmark on what I believe to be an equivalent array handling problem
with my laptop (OS X, Ruby 1.8.6-p111, Core 2 Duo 2GHz, 2GB RAM):

require 'benchmark'
include Benchmark
bm(6) do |y|
  y.report("appending") { x = []; (1..20000000).each { |i| x << i } }
  y.report("nested creation") { x = Array.new(1000) { Array.new(1000) { Array.new(20, 0) } } }
  y.report("unrolling") { x.flatten.flatten.length }
end

                     user     system      total        real
appending        7.350000   0.100000   7.450000 (  7.685458)
nested creation  0.850000   0.000000   0.850000 (  0.851341)
unrolling       15.240000   0.280000  15.520000 ( 20.927189)

which in each case is manipulating an array of 20,000,000 elements.
This would be equivalent to processing 32000 files where each
contained 625 occurrences of the ‘k=’ tag.

See my other note but it didn't make much difference.

Regardless of what the profiler says, any program that’s opening 32000
files and writing to them is a prime candidate for IO optimisation…

Assuming I want to write 625 single-digit numbers to 32000 files, the
difference with file.puts *stats compared to individual writes is
noticeable on my setup and would more than outweigh the cost of
appending these numbers to an array. A slower processor might behave
differently, as might a system with a higher-performance drive than my
laptop's 5400RPM SATA HDD.

x = Array.new(675, 1)
bm(6) do |y|
  y.report("direct") { 32000.times { File.open("test.dat", "a") { |file| file.puts *x } } }
  y.report("enumerated") { 32000.times { File.open("test.dat", "a") { |file| x.each { |i| file.puts i } } } }
  y.report("direct+fsync") { 32000.times { File.open("test.dat", "a") { |file| file.sync = false; file.puts *x; file.fsync } } }
  y.report("enum+fsync") { 32000.times { File.open("test.dat", "a") { |file| file.sync = false; x.each { |i| file.puts i }; file.fsync } } }
end
                    user     system      total        real
direct         25.160000   2.100000  27.260000 ( 38.775792)
enumerated     34.200000   2.170000  36.370000 ( 55.614969)
direct+fsync   25.300000   2.250000  27.550000 ( 60.225069)
enum+fsync     34.520000   2.420000  36.940000 ( 98.627757)

I'd be interested in equivalent benchmarking figures for your
configuration, and also an estimate of how close this 32000x625
assumption is to your actual use case, as it would help me make sense
of the doubled execution time you're seeing with my array modification.

The cubic array was just a direct translation of the C pointer
setup I had - basically it is a rectangular grid of sub-populations
each with an array of allele lengths.

One option is to use a Hash as a Sparse Matrix by using Arrays as keys
for indexing:

stats = Hash.new(0)
lines.each do |line|
  v = stats06
  stats[[x,y]] = v if v != 0
end
File.open(output_filename, "a") do |file|
  file.puts *stats.values
end

which should pay off if the number of interesting results is much less
than the cubic search space. Obviously if you need the values to keep
a specific ordering you’d need additional logic…
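
By way of illustration, with made-up values where only two cells of the
grid are interesting:

stats = Hash.new(0)
stats[[3, 7]] = 42
stats[[9, 1]] = 5
stats[[0, 0]]    # => 0 - the default; nothing is stored for empty cells
stats.size       # => 2, however large the grid

# if ordering matters, sort the keys before dumping
stats.keys.sort.each { |x, y| puts "#{x},#{y} -> #{stats[[x, y]]}" }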

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason

On 27 Jun 2008, at 07:55, Philip R. wrote:

{ Array.new(1000) { Array.new(20, 0) } } }
I haven’t used BM yet but I can’t see how doing a benchmark on an
array helps? When I had an array it was slow, when I ditched the
array it was fast?

Your experience runs counter to all of mine, both in terms of several
years of Ruby hacking and more generally in working on performance
optimisation for large data-manipulation problems. Therefore, being of
a curious disposition, I'd love to figure out why :slight_smile:

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason

Ed, Ellie et al,

M. Edward (Ed) Borasky wrote:

Eleanor McHugh wrote:

Regardless of what the profiler says, any program that’s opening 32000
files and writing to them is a prime candidate for IO optimisation…

Not to mention directory fragmentation in many filesystems. :slight_smile: I’d be
looking at an ORM and SQLite rather than attempting something involving
32,000 files.

The dir organisation is not so bad - 20 files in each dir in a tree of
50x32 dirs - however, this and previous comments from Ellie and others
have caused me to think that more of the processing chain can be
improved. Although I am incurring a 90% time increase on this current
step, the following steps have lots of awks, seds, pastes etc. and it
is all VERY slow. I think if I make use of DataMapper with SQLite (i.e.
keep the DB in memory) and then massage the data with some more Ruby,
I can improve this stage still further and dramatically improve the
next stage as well! (and also keep it in one Ruby program). This is
like a drug . .
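
To make that concrete, the shape I have in mind is something like this -
using the raw sqlite3 gem here rather than DataMapper, and the table
layout is only a guess at this stage:

require 'sqlite3'

# keep the whole DB in memory - nothing touches the disk
db = SQLite3::Database.new(':memory:')
db.execute('CREATE TABLE stats (x INTEGER, y INTEGER, v INTEGER)')

# one transaction around the inserts, otherwise every INSERT commits
# (and syncs) on its own
db.transaction do
  results.each do |x, y, v|   # 'results' stands in for the parsed sim output
    db.execute('INSERT INTO stats VALUES (?, ?, ?)', [x, y, v])
  end
end

# the awk/sed/paste massaging then collapses into queries
db.execute('SELECT x, y, v FROM stats WHERE v > 0 ORDER BY x, y') do |row|
  puts row.join(',')
end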

Converting the first link in the chain, the main simulation C/C++
program, would be a VERY big job though and I can't really justify the
time for that at the moment, although I am thinking about it . .

Thanks to all for your help!

Regards,

Phil.

Philip R.

Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
E-mail: [email protected]

2008/6/27 Eleanor McHugh [email protected]:

y.report("unrolling") { x.flatten.flatten.length }
end

This can’t be the exact code you were benchmarking since “x” is
undefined in the “unrolling” test. Also, why do you flatten twice?
You can’t beat an Array flatter than flatten. :slight_smile:

irb(main):009:0> Array.new(1000) { Array.new(1000) { Array.new(20, 0) } }.flatten.select { |x| Array === x }.empty?
=> true

Regardless of what the profiler says, any program that’s opening 32000 files
and writing to them is a prime candidate for IO optimisation…

… because IO in that case is also likely the slowest bit. Absolutely.

Kind regards

robert

On 27 Jun 2008, at 12:41, Robert K. wrote:

y.report("appending") { x = []; (1..20000000).each { |i| x << i } }
y.report("nested creation") { x = Array.new(1000) { Array.new(1000) { Array.new(20, 0) } } }
y.report("unrolling") { x.flatten.flatten.length }
end

This can’t be the exact code you were benchmarking since “x” is
undefined in the “unrolling” test.

I was using IRB and x was already defined earlier in my session, so
yes and no: for my session it was defined, but for this snippet it
won’t be.

Also, why do you flatten twice?
You can’t beat an Array flatter than flatten. :slight_smile:

I was posting at 2am >;o

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason

Ellie,

Eleanor McHugh wrote:

y.report("nested creation") { x = Array.new(1000) { Array.new(1000)

I haven’t used BM yet but I can’t see how doing a benchmark on an
array helps? When I had an array it was slow, when I ditched the
array it was fast?

Your experience runs counter to all of mine, both in terms of several
years of Ruby hacking and more generally in working on performance
optimisation for large data-manipulation problems. Therefore, being of
a curious disposition, I'd love to figure out why :slight_smile:

I don't think there is any great mystery here - in the first version I
thought it would be efficient to read the entire text file into an
array and then process the array, line by line. In the second version,
I eliminated the array and processed each line directly after reading
it (the same line-by-line processing as in the first case), relying on
the Linux system buffers/cache to be efficient about reading the data
from disk. I think the processing time saved (~50%) is just a function
of not putting the data into an unnecessary structure and not
manipulating that structure?
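
In code terms the two versions boil down to something like this, where
'process' and the filename are stand-ins for the real per-line work:

# version 1: slurp the whole file into an Array, then walk the Array
lines = File.readlines('results.txt')
lines.each { |line| process(line) }

# version 2: stream a line at a time - no intermediate Array, and the
# OS buffer cache takes care of read-ahead
File.foreach('results.txt') { |line| process(line) }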

Regards,

Phil.

Philip R.

Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
E-mail: [email protected]

On 27 Jun 2008, at 16:42, Robert K. wrote:

On 27.06.2008 13:59, Eleanor McHugh wrote:

I was posting at 2am >;o

Ooops, right you are. I missed that bit completely. Hopefully you
got your sleep back by now. :slight_smile:

Mental note to self: do not start fiddling in IRB at bedtime :wink:

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason

On Jun 27, 2008, at 08:48 , Eleanor McHugh wrote:

Mental note to self: do not start fiddling in IRB at bedtime :wink:

nonsense… just don’t post your results. :wink:

On 27.06.2008 13:59, Eleanor McHugh wrote:

I was posting at 2am >;o

Ooops, right you are. I missed that bit completely. Hopefully you got
your sleep back by now. :slight_smile:

Cheers

robert

On 27.06.2008 17:48, Eleanor McHugh wrote:

On 27 Jun 2008, at 16:42, Robert K. wrote:

On 27.06.2008 13:59, Eleanor McHugh wrote:

I was posting at 2am >;o
Ooops, right you are. I missed that bit completely. Hopefully you
got your sleep back by now. :slight_smile:

Mental note to self: do not start fiddling in IRB at bedtime :wink:

:slight_smile:

robert

On 28 Jun 2008, at 10:15, Ryan D. wrote:

On Jun 27, 2008, at 08:48 , Eleanor McHugh wrote:

Mental note to self: do not start fiddling in IRB at bedtime :wink:

nonsense… just don’t post your results. :wink:

lol
like common sense and late-night hacking have ever gone together :wink:

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason