I noticed some strange issues in an application I’m working on where it
works speedy at first, and then gets progressively slower and slower.
After
a little bit of investigation, it seems that Enumerable#find is the
culprit. The more times you run a find statement the slower it gets.
Does
anybody have any ideas why this is?
Here’s a sample that demonstrates the issue:
require ‘mscorlib’
require ‘C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\System.dll’
a = (1…1000).to_a
s = System::Diagnostics::Stopwatch.new()
100.times {
s.start
a.find {|x| x == 1000}
s.stop
puts “find took: #{s.elapsed.ToString}”
}
which outputs:
find took: 00:00:00.0743446
…snip…
find took: 00:00:00.2290894
…snip…
find took: 00:00:00.4942105
I understand you guys aren’t optimizing yet, but I thought you might be
interested to see growth where there shouldn’t be any. Also, if you
have
any ideas off the top of your head, I wouldn’t mind looking into it a
little.
-Eric
Whoops, had the stopwatch setup wrong in that last message
require ‘mscorlib’
require ‘C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\System.dll’
a = (1…1000).to_a
1000.times {
s = System::Diagnostics::Stopwatch.new()
s.start
a.find {|x| x == 1000}
s.stop
puts “find took: #{s.elapsed.ToString}”
}
and the output is:
find took: 00:00:00.0732210
find took: 00:00:00.0048562
find took: 00:00:00.0022469
find took: 00:00:00.0021351
… snip …
find took: 00:00:00.0196016
find took: 00:00:00.0201994
find took: 00:00:00.0218765
… snip …
find took: 00:00:00.0371834
find took: 00:00:00.0367817
find took: 00:00:00.0361911
On Wed, 06 Feb 2008 12:38:46 -0700, Eric N.
[email protected]
wrote:
puts "find took: #{s.elapsed.ToString}"
}
You’re reusing the same Stopwatch instance, so you have to add a call to
the Reset method at the end of each iteration. In doing so you should
find your timings are as they should be.
–
/M:D
M. David P.
Co-Founder & Chief Architect, 3rd&Urban, LLC
Email: [email protected] | [email protected]
Mobile: (206) 418-9027
http://3rdandUrban.com | http://amp.fm |
http://www.oreillynet.com/pub/au/2354
He’s right about the issue however:
find took: 00:00:00.0081136
find took: 00:00:00.0022516
find took: 00:00:00.0021312
[snip]
find took: 00:00:00.0220530
find took: 00:00:00.0245721
find took: 00:00:00.0238102
[snip]
find took: 00:00:00.0377458
find took: 00:00:00.0453373
find took: 00:00:00.0369415
On Feb 6, 2008 3:13 PM, M. David P. [email protected] wrote:
s.start
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core
–
Michael L.
[Polymath Programmer]
http://michaeldotnet.blogspot.com
On Wed, 06 Feb 2008 13:23:10 -0700, Michael L.
[email protected] wrote:
He’s right about the issue however:
…
find took: 00:00:00.0377458
find took: 00:00:00.0453373
find took: 00:00:00.0369415
Whoops. My bad. I was testing via Ruby.NET which, coincidentally,
doesn’t showcase this type of behavior,
find took: 00:00:00.0216940
find took: 00:00:00.0012342
find took: 00:00:00.0012423
…
find took: 00:00:00.0013994
find took: 00:00:00.0013016
find took: 00:00:00.0017697
…
find took: 00:00:00.0012989
find took: 00:00:00.0014228
find took: 00:00:00.0013389
…
I guess maybe Ruby.NET still has its place in this world afterall.
(the modified code in case anyone wants to run the comparison on their
own
machine)
require ‘mscorlib’
require ‘C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\System.dll’
a = (1…1000).to_a
s = System::Diagnostics::Stopwatch.new()
100.times {
s.Start
a.find {|x| x == 1000}
s.Stop
puts “find took: #{s.Elapsed}”
s.Reset
}
–
/M:D
M. David P.
Co-Founder & Chief Architect, 3rd&Urban, LLC
Email: [email protected] | [email protected]
Mobile: (206) 418-9027
http://3rdandUrban.com | http://amp.fm |
http://www.oreillynet.com/pub/au/2354
Michael L.:
m = Proc.new {|x| x == 1000 }
find took: 00:00:00.0012915
find took: 00:00:00.0141006
find took: 00:00:00.0106432
This is a strange; nothing obvious jumps out right now. We’ll have to
look at this down the road when we do our perf push. Clearly this is a
bug. Can someone open up a bug for this?
Thanks,
-John
I have some ideas on what some of the issues may be, primarily related
to how the Enumerable class creates a new Proc object every time
through the loop. Anyway, doing the following seems to help to some
extent:
require ‘mscorlib’
require ‘C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\System.dll’
a = (1…1000).to_a
s = System::Diagnostics::Stopwatch.new()
m = Proc.new {|x| x == 1000 }
1000.times {
s.start
y = a.find &m
s.stop
puts “find took: #{s.elapsed.ToString}”
s.reset
}
find took: 00:00:00.0012998
find took: 00:00:00.0012719
find took: 00:00:00.0012915
[snip]
find took: 00:00:00.0049048
find took: 00:00:00.0052548
find took: 00:00:00.0051403
[snip]
find took: 00:00:00.0127650
find took: 00:00:00.0141006
find took: 00:00:00.0106432
You can see the times still going up (rules are being created based on
Proc.Id?) but not nearly the same pace.
On Feb 6, 2008 3:31 PM, M. David P. [email protected] wrote:
doesn’t showcase this type of behavior,
find took: 00:00:00.0014228
require ‘mscorlib’
s.Reset
http://3rdandUrban.com | http://amp.fm |
http://www.oreillynet.com/pub/au/2354
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core
–
Michael L.
[Polymath Programmer]
http://michaeldotnet.blogspot.com
So “Enumerable#find” is definitely a red herring. I’m guessing it has
something to do with block creation. Here’s a better example without
the
find:
10000.times { | iter |
s.reset
s.start
x = 0
1000.times { x += 1}
s.stop
puts “iteration #{iter} took: #{s.elapsed.ToString}” if iter % 100
== 0
}
Output:
iteration 0 took: 00:00:00.0471160
iteration 100 took: 00:00:00.0024187
iteration 200 took: 00:00:00.0037267
iteration 300 took: 00:00:00.0039901
iteration 400 took: 00:00:00.0054012
iteration 500 took: 00:00:00.0063639
iteration 600 took: 00:00:00.0070567
iteration 700 took: 00:00:00.0103230
iteration 800 took: 00:00:00.0093428
iteration 900 took: 00:00:00.0095257
[snip]
iteration 9700 took: 00:00:00.0892543
iteration 9800 took: 00:00:00.0899337
iteration 9900 took: 00:00:00.0963234
It’s very consistent linear growth. Several iterations took much
longer
presumably due to garbage collection.
Compare that to this version without the inner block:
1000.times do |iter|
s.reset
s.start
x = 0
x+= 1 while x < 1000
s.stop
puts “iteration #{iter} took: #{s.elapsed.ToString}” if iter % 100
== 0
iter += 1
end
Output:
iteration 0 took: 00:00:00.1119421
iteration 100 took: 00:00:00.0001190
iteration 200 took: 00:00:00.0001430
iteration 300 took: 00:00:00.0001326
iteration 400 took: 00:00:00.0001212
iteration 500 took: 00:00:00.0001173
iteration 600 took: 00:00:00.0001190
iteration 700 took: 00:00:00.0001178
iteration 800 took: 00:00:00.0001176
iteration 900 took: 00:00:00.0001178
[snip]
iteration 9900 took: 00:00:00.0001142
So maybe block creation/parsing/execution is a little slower each time?
I don’t know if i’d call it a “red herring” as the performance with
find dropped considerably faster, it’s probably a combination.
On Feb 6, 2008 6:56 PM, Eric N. [email protected] wrote:
puts "iteration #{iter} took: #{s.elapsed.ToString}" if iter % 100 == 0
iteration 700 took: 00:00:00.0103230
Compare that to this version without the inner block:
iteration 800 took: 00:00:00.0001176
Bug has been submitted. I’ll probably poke it with a stick some while
to how the Enumerable class creates a new Proc object every time
s.start
[snip]
[email protected]
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core
–
Michael L.
[Polymath Programmer]
http://michaeldotnet.blogspot.com
Bug has been submitted. I’ll probably poke it with a stick some while
I’m at it.
On Feb 6, 2008 5:58 PM, John L. (DLR) [email protected] wrote:
[snip]
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core
–
Michael L.
[Polymath Programmer]
http://michaeldotnet.blogspot.com
Michael L.:
I don’t know if i’d call it a “red herring” as the performance with
find dropped considerably faster, it’s probably a combination.
Probably because Enumerable#find uses a block internally when it calls
#each, so you get two block dispatches instead of one.
The example with a nested times call also consists of two block
dispatches and doesn’t ramp up nearly as bad, or am I missing
something?
On Feb 6, 2008 7:41 PM, John M. [email protected] wrote:
Ironruby-core mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ironruby-core
–
Michael L.
[Polymath Programmer]
http://michaeldotnet.blogspot.com