Speed comparison: method vs. instance variable?

Hi,

We just had a short IRC discussion.

class Foo
  def initialize
    @use_colours = true
  end
end

Ok. Now how do you access this information?

You can:

(a) Use the instance variable @use_colours directly,
such as in:

show_colourized_result if @use_colours

(b) Or you can use an accessor (reader method) for the
variable like:

def use_colours?
  @use_colours
end

show_colourized_result if use_colours?

Now the explanation was given that the variant with
use_colours? will be slower. This makes sense: the
method has to be looked up and invoked, so reading
the instance variable directly should be much
faster.

But when I asked how much faster, nobody knew
the answer.

Does anyone have an idea for a good speed comparison?
Obviously the code will also be longer when we
use a method.

I will try to write a comparison this evening or
tomorrow evening and then publish it.

class A
  attr_reader :v
  def initialize() @v=33 end
  def g() self.v end
  def gv() @v end
end
$x=A.new

$x.g >> 163 nanoseconds
$x.gv >> 144 nanoseconds

(windows 7, core i7, Ruby 2.0.0p0)

A method call has a non-negligible cost, so this version is
perhaps more accurate:

class A
  attr_reader :v
  def initialize() @v=33 end
  def g()
    self.v
    self.v
    self.v
    self.v
    self.v
    self.v
    self.v
    self.v
    self.v
    self.v
  end
  def gv()
    @v
    @v
    @v
    @v
    @v
    @v
    @v
    @v
    @v
    @v
  end
end
$x=A.new

$x.g >> 367 nanoseconds
$x.gv >> 182 nanoseconds

68% best for @v

(windows 7, core i7, Ruby 2.0.0p0)

Regis d’Aubarede wrote in post #1164862:

$x.g >> 163 nanoseconds
$x.gv >> 144 nanoseconds

How did you measure?

Here’s my approach:

$ ruby x.rb
10000000
99
99
Rehearsal -------------------------------------------
get_var   1.120000   0.000000   1.120000 (  1.120600)
get_acc   1.630000   0.000000   1.630000 (  1.631967)
---------------------------------- total: 2.750000sec

              user     system      total        real
get_var   1.130000   0.000000   1.130000 (  1.128216)
get_acc   1.630000   0.000000   1.630000 (  1.633235)
$ cat -n x.rb
1
2  require 'benchmark'
3
4  class Foo
5    attr_accessor :bar
6    def initialize; self.bar = 99 end
7    def get_var; @bar end
8    def get_acc; bar end
9  end
10
11  REP = 10_000_000
12
13  f = Foo.new
14  puts REP, f.get_var, f.get_acc
15
16  Benchmark.bmbm do |x|
17    x.report 'get_var' do REP.times { f.get_var } end
18    x.report 'get_acc' do REP.times { f.get_acc } end
19  end
20
$ ruby -v
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]

Robert K. wrote in post #1164915:

16  Benchmark.bmbm do |x|
17    x.report 'get_var' do REP.times { f.get_var } end
18    x.report 'get_acc' do REP.times { f.get_acc } end
19  end

In this code, you measure both Fixnum#times and the f.get_var call.

Regis d’Aubarede wrote in post #1164862:

$x.g >> 163 nanoseconds
$x.gv >> 144 nanoseconds

How did you measure?

My own tool, ‘benchi’, from the Ruiby dsl gem (samples directory).
You can’t use it: gtk3 is currently broken with gem 2.2.3.

benchi is built to separate the repetition and the call from the measurement.

Regis d’Aubarede wrote in post #1164918:

Robert K. wrote in post #1164915:

16  Benchmark.bmbm do |x|
17    x.report 'get_var' do REP.times { f.get_var } end
18    x.report 'get_acc' do REP.times { f.get_acc } end
19  end

In this code, you measure both Fixnum#times and the f.get_var call.

Yes, and REP.times costs the same for both approaches, so you can
calculate the overhead of the accessor from that.

I do that to get measurements in a larger time unit and to keep other
effects from influencing the measurement of a single call. For completeness
we can add an empty block measurement and get:

$ ruby x.rb
10000000
99
99
Rehearsal --------------------------------------------
baseline   0.650000   0.000000   0.650000 (  0.653097)
get_var    1.070000   0.000000   1.070000 (  1.065626)
get_acc    1.660000   0.000000   1.660000 (  1.662289)
----------------------------------- total: 3.380000sec

               user     system      total        real
baseline   0.570000   0.000000   0.570000 (  0.572243)
get_var    1.060000   0.000000   1.060000 (  1.060918)
get_acc    1.670000   0.000000   1.670000 (  1.661607)

So it’s 0.49s vs. 1.1s once the baseline is subtracted. That is 110ns per
call in the worst case, and 61ns overhead for the accessor method.
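
The arithmetic behind those figures can be spelled out as a small sketch. It only reuses the "user" column of the second (non-rehearsal) run above; the helper name ns_per_call and the variable names are made up for illustration.

```ruby
# Turn two timings from the table above into a per-call cost in nanoseconds.
# total_s and baseline_s are seconds for `reps` iterations of the loop.
def ns_per_call(total_s, baseline_s, reps)
  (total_s - baseline_s) / reps * 1_000_000_000
end

reps     = 10_000_000
baseline = 0.57 # REP.times { }            ("user" column, second run)
get_var  = 1.06 # REP.times { f.get_var }
get_acc  = 1.67 # REP.times { f.get_acc }

var_ns = ns_per_call(get_var, baseline, reps)
acc_ns = ns_per_call(get_acc, baseline, reps)

puts format("get_var:  %.0f ns per call", var_ns)
puts format("get_acc:  %.0f ns per call", acc_ns)
puts format("accessor overhead: %.0f ns per call", acc_ns - var_ns)
```

Rounded, this gives about 49 ns for the direct read, 110 ns for the accessor and 61 ns of difference, all net of the loop itself.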

benchi is built to separate the repetition and the call from the measurement.

How do you make sure you do not measure the repetition loop?

Regis d’Aubarede wrote in post #1164977:

Robert K. wrote in post #1164946:

benchi is built to separate the repetition and the call from the measurement.

How do you make sure you do not measure the repetition loop?

by repeating the code in a procedure and repeating procedure call in a
loop:

def tst() code ; code ; … end
loop { tst ; tst ; tst … }

Do I understand correctly that you have two levels of repetition
here? And how do you stop the infinite loop?

and we measure the loop with empty code:

def tst() nil ; nil ; nil … end

Then, as a check, we verify that the measurement increases linearly
as the code under test is repeated.

In what ways is that better than my approach? At the moment it just
seems more complicated but maybe I am overlooking something.

Cheers

robert

Robert K. wrote in post #1164946:

benchi is built to separate the repetition and the call from the measurement.

How do you make sure you do not measure the repetition loop?

by repeating the code in a procedure and repeating procedure call in a
loop:

def tst() code ; code ; … end
loop { tst ; tst ; tst … }

and we measure the loop with empty code:

def tst() nil ; nil ; nil … end

Then, as a check, we verify that the measurement increases linearly
as the code under test is repeated.

Regards,
Regis
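
A minimal sketch of such a linearity check, assuming nothing about how benchi itself works (its source is not shown in this thread). The class Probe, its read_N methods and the helper net_seconds_per_call are hypothetical names. Both the baseline and the measured call go through public_send, so that extra dispatch cost should roughly cancel out.

```ruby
require 'benchmark'

# Hypothetical helper class: read_1, read_2, read_4 and read_10 add @bar
# to itself that many times, so every read contributes to the result.
class Probe
  def initialize
    @bar = 99
  end

  def empty; nil end

  [1, 2, 4, 10].each do |k|
    class_eval "def read_#{k}; #{(['@bar'] * k).join(' + ')}; end"
  end
end

# Seconds per call of `name`, with the cost of calling an empty method
# subtracted (the "measure the loop with empty code" step).
def net_seconds_per_call(obj, name, reps)
  base = Benchmark.realtime { reps.times { obj.public_send(:empty) } }
  full = Benchmark.realtime { reps.times { obj.public_send(name) } }
  (full - base) / reps
end

probe = Probe.new
[1, 2, 4, 10].each do |k|
  ns = net_seconds_per_call(probe, "read_#{k}", 200_000) * 1_000_000_000
  puts format("read_%-2d %8.1f ns", k, ns)
end
```

If the measurement were perfectly linear, the net figure for read_10 would be about ten times the one for read_1. On a real machine the numbers are noisy and can even come out negative for very cheap code.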

Robert K. wrote in post #1165068:

In what ways is that better than my approach? At the moment it just
seems more complicated but maybe I am overlooking something.

Cheers

robert

D:\usr>ruby t.rb

require 'benchmark'

class Foo
  attr_accessor :bar
  def initialize; self.bar = 99 end
  def get_var; @bar end
  def get_var2; @bar + @bar end
  def get_var4; @bar + @bar + @bar + @bar end
  def get_var10; @bar + @bar + @bar + @bar + @bar + @bar + @bar + @bar + @bar + @bar end
  def get_acc; bar end
end
puts "======================"
puts File.read($0)
puts "======================"

REP = 10_000_000

f = Foo.new
puts REP, f.get_var, f.get_acc

Benchmark.bmbm do |x|
  x.report 'get_var' do REP.times { f.get_var } end
  x.report 'get_var2' do REP.times { f.get_var2 } end
  x.report 'get_var4' do REP.times { f.get_var4 } end
  x.report 'get_var10' do REP.times { f.get_var10 } end
end

10000000
99
99
Rehearsal ---------------------------------------------
get_var     1.685000   0.000000   1.685000 (  1.684803)
get_var2    1.934000   0.000000   1.934000 (  1.965603)
get_var4    2.839000   0.000000   2.839000 (  2.857806)
get_var10   4.524000   0.000000   4.524000 (  4.544608)
----------------------------------- total: 10.982000sec

                user     system      total        real
get_var     1.763000   0.000000   1.763000 (  1.778403)
get_var2    1.966000   0.000000   1.966000 (  1.999804)
get_var4    2.839000   0.000000   2.839000 (  2.870405)
get_var10   4.539000   0.000000   4.539000 (  4.570808)

D:\usr>

get_var10 is only 3 times slower than get_var !

Regis d’Aubarede wrote in post #1165078:

Robert K. wrote in post #1165068:

In what ways is that better than my approach? At the moment it just
seems more complicated but maybe I am overlooking something.

get_var10 is only 3 times slower than get_var !

I fail to see how your example answers even the question of mine that
you quoted, let alone the other ones. The point is that you seem to
employ a more complicated scheme to measure performance than the simple
one I used, and I would like to know what the advantage is. After all,
Robert is trying to decide which access strategy to member state is more
efficient, and not how many member variable and plus operations shadow
the cost of a method call.

Robert K. wrote in post #1165134:

Regis d’Aubarede wrote in post #1165078:

Robert K. wrote in post #1165068:

In what ways is that better than my approach? At the moment it just
seems more complicated but maybe I am overlooking something.

get_var10 is only 3 times slower than get_var !

I fail to see how your example answers even the question of mine that
you quoted

My example shows that the measurement with a simple n.times { code }
is not linear in the repetition of the code:
if the code is repeated n times, the measurement is not multiplied by n.

The point is that you seem to employ a more complicated scheme to
measure performance than the simple one I used and I would like to know
what the advantage is. After all

With my method, the measurement is linear in the repetition, so I conclude
that it is better…

Robert tries to decide what access strategy to member state is more
efficient

Answers have been given:
your measurements: accessor / @var ==> 163 ns / 112 ns
my measurements: accessor / @var ==> 367 ns / 182 ns

and not how many member variable and plus operations shadow
the cost of a method call.

I added the plus because it seems that Ruby 2.1 has some new optimization:
@v;@v;@v;@v;@v;@v;@v;@v;@v;@v
takes almost the same time as @v

let alone the other ones (questions)…

And how do you stop the infinite loop?
How do you make sure you do not measure the repetition loop?

My code is :

def tst() code;code;… end # 100 times same code

def tst0(call_time)
  t=Time.now.to_f
  target=t+10
  i=0
  while Time.now.to_f<target
    tst();tst(); … # 400 times
    i+=1
  end
  ((Time.now.to_f - t)*1000_000_000 - call_time*i*1000)/(i*400*100)
end

Time.now.to_s takes ~500 ns, and the cost of calling tst() (call_time) has
been measured with empty code.

This solution is complex because I am trying to get an exact measurement.
That is difficult when the code under test is very simple.
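
For readers who want to try the idea of a time-bounded loop, here is a simplified sketch. It is not the benchi code: it replaces the hand-unrolled calls with batch.times { yield }, it uses the monotonic clock (Process.clock_gettime, available from Ruby 2.1), and the names monotonic_now and ns_per_call_for are made up. The cost of the loop and of yield is removed by subtracting a run with an empty block.

```ruby
# Monotonic clock in seconds.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Call the block in batches until `seconds` have elapsed and return the
# average time per call in nanoseconds (loop and yield cost included).
def ns_per_call_for(seconds, batch = 1_000)
  start = monotonic_now
  deadline = start + seconds
  calls = 0
  while monotonic_now < deadline
    batch.times { yield }
    calls += batch
  end
  (monotonic_now - start) * 1_000_000_000 / calls
end

class Foo
  attr_reader :bar
  def initialize; @bar = 99 end
  def get_var; @bar end
  def get_acc; bar end
end

f = Foo.new
empty   = ns_per_call_for(0.2) { }
get_var = ns_per_call_for(0.2) { f.get_var }
get_acc = ns_per_call_for(0.2) { f.get_acc }
puts format("get_var: %.1f ns, get_acc: %.1f ns (net of the empty block)",
            get_var - empty, get_acc - empty)
```

Checking the clock only once per batch keeps the cost of reading the time small relative to the code being measured.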

Then we should use an ARM configuration and an oscilloscope to check
the exact timing :)

Hoping to have answered all your questions…
Regards,

Regis

(sorry for my English, I prefer Ruby language…)

Robert K. wrote in post #1165156:

This gives me an idea for a feature request. :)

Regis d’Aubarede wrote in post #1165145:

Robert K. wrote in post #1165134:

Regis d’Aubarede wrote in post #1165078:

Robert K. wrote in post #1165068:

In what ways is that better than my approach? At the moment it just
seems more complicated but maybe I am overlooking something.

get_var10 is only 3 times slower than get_var !

I fail to see how your example answers even the question of mine that
you quoted

My example shows that the measurement with a simple n.times { code }
is not linear in the repetition of the code:
if the code is repeated n times, the measurement is not multiplied by n.

Maybe I did not get what you meant with “linear” here. I think now you
mean “linear in the number of repetitions”: With

require 'benchmark'

REP = 10.times.map {|i| 10**i}

Benchmark.bmbm do |x|
  REP.each do |rep|
    x.report "%10d" % rep do
      rep.times {}
    end
  end
end

I get

$ ./x.rb
Rehearsal ----------------------------------------------
         1   0.000000   0.000000   0.000000 (  0.000007)
        10   0.000000   0.000000   0.000000 (  0.000005)
       100   0.000000   0.000000   0.000000 (  0.000012)
      1000   0.000000   0.000000   0.000000 (  0.000095)
     10000   0.010000   0.000000   0.010000 (  0.000916)
    100000   0.000000   0.000000   0.000000 (  0.011101)
   1000000   0.060000   0.000000   0.060000 (  0.060148)
  10000000   0.600000   0.000000   0.600000 (  0.596774)
 100000000   5.970000   0.000000   5.970000 (  5.967721)
1000000000  59.540000   0.010000  59.550000 ( 59.558273)
------------------------------------ total: 66.190000sec

                 user     system      total        real
         1   0.000000   0.000000   0.000000 (  0.000003)
        10   0.000000   0.000000   0.000000 (  0.000003)
       100   0.000000   0.000000   0.000000 (  0.000008)
      1000   0.000000   0.000000   0.000000 (  0.000059)
     10000   0.000000   0.000000   0.000000 (  0.000569)
    100000   0.010000   0.000000   0.010000 (  0.005671)
   1000000   0.060000   0.000000   0.060000 (  0.057733)
  10000000   0.570000   0.000000   0.570000 (  0.570455)
 100000000   5.710000   0.000000   5.710000 (  5.716101)
1000000000  57.120000   0.000000  57.120000 ( 57.122374)

That looks pretty linear to me.
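
To make "linear" concrete, the larger rows of the second table can be divided by their repetition count. This sketch only reuses numbers printed above; the helper name ns_per_iteration is made up.

```ruby
# Convert [repetitions, "real" seconds] pairs into nanoseconds per iteration.
def ns_per_iteration(rows)
  rows.map { |reps, seconds| [reps, seconds / reps * 1_000_000_000] }
end

# Rows copied from the second (non-rehearsal) table above.
rows = [
  [1_000_000,     0.057733],
  [10_000_000,    0.570455],
  [100_000_000,   5.716101],
  [1_000_000_000, 57.122374]
]

ns_per_iteration(rows).each do |reps, ns|
  puts format("%10d  %.1f ns per iteration", reps, ns)
end
```

All four rows come out between 57 and 58 ns per iteration, so the cost per iteration stays constant while the repetition count grows by a factor of 1000.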

The point is that you seem to employ a more complicated scheme to
measure performance than the simple one I used and I would like to know
what the advantage is. After all

With my method, the measurement is linear in the repetition, so I conclude
that it is better…

see above

and not how many member variable and plus operations shadow
the cost of a method call.

I added the plus because it seems that Ruby 2.1 has some new optimization:
@v;@v;@v;@v;@v;@v;@v;@v;@v;@v
takes almost the same time as @v

For me that would be one more reason to not do this.

let alone the other ones (questions)…

And how do you stop the infinite loop?
How do you make sure you do not measure the repetition loop?

My code is :

def tst() code;code;… end # 100 times same code

def tst0(call_time)
  t=Time.now.to_f
  target=t+10
  i=0
  while Time.now.to_f<target
    tst();tst(); … # 400 times
    i+=1
  end
  ((Time.now.to_f - t)*1000_000_000 - call_time*i*1000)/(i*400*100)
end

Why do you convert the time to a float? At least for the loop (variable
“target”) that is superfluous because the precision is in the Time
instance already:

irb(main):005:0> 10.times.map {Time.now}.each_cons(2){|a,b| p b-a}
1.955e-06
1.048e-06
9.78e-07
1.327e-06
9.77e-07
9.78e-07
9.78e-07
9.08e-07
9.78e-07
=> nil
irb(main):006:0> 10.times.map {Time.now}.each_cons(2){|a,b| p a<b}
true
true
true
true
true
true
true
true
true

Time.now.to_s takes ~500 ns, and the cost of calling tst() (call_time) has
been measured with empty code.

I achieved the same with REP.times { } (i.e. an empty block): with that I
calculated the overhead of the looping and subtracted that from the
other measurements.

This solution is complex because I am trying to get an exact measurement.
That is difficult when the code under test is very simple.

So with that complexity you are introducing an error of its own. :)

Then we should use an ARM configuration and an oscilloscope to check
the exact timing :)

Any measurement in a GCed language has some level of uncertainty because
one does not control when GC kicks in. Even your oscilloscope won’t
help with that, I guess.
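
One common way to reduce (not remove) that uncertainty is to collect garbage before each run, switch GC off while measuring, and keep the fastest of several runs. A sketch under those assumptions; best_of is a made-up helper name, and memory use can grow while GC is off.

```ruby
require 'benchmark'

# Run the block `runs` times with GC switched off and return the fastest
# wall-clock time in seconds.
def best_of(runs)
  Array.new(runs) do
    GC.start                  # collect garbage left over from earlier work
    was_disabled = GC.disable # returns true if GC was already disabled
    begin
      Benchmark.realtime { yield }
    ensure
      GC.enable unless was_disabled
    end
  end.min
end

reps = 1_000_000
puts format("empty loop, best of 5: %.6f s", best_of(5) { reps.times { } })
```

The minimum is used rather than the mean because interference from the operating system can only make a run slower, never faster.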

Hoping to have answered all your questions…
Regards,

Most of them, I think. I still think the complexity introduced is not
outweighed by the advantages. You have to create a baseline with empty
code as well - the same way as I do.

Btw. you can fairly easily automate that because you can calculate with
Benchmark::Tms:

irb(main):026:0> t1=nil;Benchmark.bm {|x| t1 = x.report("a"){sleep 1}}
       user     system      total        real
a  0.000000   0.000000   0.000000 (  1.000098)
=> [  0.000000   0.000000   0.000000 (  1.000098)
]
irb(main):027:0> t2=nil;Benchmark.bm {|x| t2 = x.report("a"){sleep 1.2}}
       user     system      total        real
a  0.000000   0.000000   0.000000 (  1.200099)
=> [  0.000000   0.000000   0.000000 (  1.200099)
]
irb(main):028:0> t2 - t1
=>   0.000000   0.000000   0.000000 (  0.200001)

irb(main):029:0> (t2 - t1).class
=> Benchmark::Tms
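
Building on that, the baseline subtraction could be wrapped in a small helper. This is only a sketch of the idea: net_measure is a made-up name, and it relies on Benchmark.measure returning a Benchmark::Tms and on Tms#- as shown in the irb session above.

```ruby
require 'benchmark'

# Measure the block inside reps.times and subtract an empty loop of the
# same length. The result is a Benchmark::Tms holding the net times.
def net_measure(reps)
  baseline = Benchmark.measure { reps.times { } }
  full     = Benchmark.measure { reps.times { yield } }
  full - baseline
end

class Foo
  attr_reader :bar
  def initialize; @bar = 99 end
  def get_var; @bar end
  def get_acc; bar end
end

f = Foo.new
reps = 1_000_000
{ 'get_var' => -> { f.get_var }, 'get_acc' => -> { f.get_acc } }.each do |label, code|
  net = net_measure(reps) { code.call }
  puts format("%-8s %.1f ns per call (net, real time)", label, net.real / reps * 1_000_000_000)
end
```

Note that calling through a lambda adds its own cost to both variants, so the absolute figures are higher than with a direct call; only the difference between the two lines is meaningful here.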

This gives me an idea for a feature request. :)

(sorry for my English, I prefer Ruby language…)

:)