Forum: Ruby String += vs <<

Posted by Joshua Ball (joshu)
on 2009-06-17 19:07
(Received via mailing list)
A friend recently sent me this article:
http://blog.metasploit.com/2009/03/blog-post.html

In particular, note the perf difference of += vs << :

framework3 $ time ruby -e 'a = "A"; 100000.times { a << "A" }'
>
> real 0m0.338s
> user 0m0.312s
> sys 0m0.024s
>
> framework3 $ time ruby -e 'a = "A"; 100000.times { a += "A" }'
>
> real 0m15.462s
> user 0m15.321s
> sys 0m0.068s
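
A rough way to see why the gap is so large: each a += "A" builds a complete new string, so the work per step grows with the current length, while a << "A" appends to the existing string. The sketch below only counts bytes written under that simple model. The method name is made up for illustration, and real timings also depend on the allocator and the GC.

```ruby
# Simplified cost model (an assumption, not a measurement): every += writes
# a whole new string whose length is the old length plus the appended piece.
def bytes_written_by_plus_equals(iterations, start_len = 1, piece_len = 1)
  total = 0
  len = start_len
  iterations.times do
    len += piece_len
    total += len # the full new string is written on every step
  end
  total
end

puts bytes_written_by_plus_equals(100_000) # 5000150000
```

Under the same model, << writes only the appended piece each time (plus the occasional buffer regrow), so it stays roughly proportional to the number of iterations.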


Also note:

*Before you run off and change every instance of += to << in your ruby
code*, it's important to note that the two don't perform the same operation.

>> a = "A"
> => "A"
> >> b = a
> => "A"
> >> a << "B"
> => "AB"
> >> b
> => "AB"

>> c = "C"
> => "C"
> >> d = c
> => "C"
> >> c += "D"
> => "CD"
> >> d
> => "C"
>



Thought I would pass it along...
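
The difference in the irb session above comes down to object identity: << changes the string that both names point at, while += builds a new string and rebinds only the left-hand variable. A minimal sketch, with method names invented for illustration and String.new used so that it also runs where string literals are frozen:

```ruby
def append_in_place
  a = String.new("A")
  b = a          # b and a name the same String object
  a << "B"       # mutates that one object
  [a, b, a.equal?(b)]
end

def plus_equals
  c = String.new("C")
  d = c
  c += "D"       # same as c = c + "D": new object, only c is rebound
  [c, d, c.equal?(d)]
end

p append_in_place # ["AB", "AB", true]
p plus_equals     # ["CD", "C", false]
```

So swapping += for << is only safe when nothing else holds a reference to the original string, or when the other holders are meant to see the change.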
Posted by pat eyler (Guest)
on 2009-06-17 19:30
(Received via mailing list)
that's a nice article about some real-world benchmarking.  I wish
more people did things like this.

If you'd like a short tutorial, you can look here:
 http://on-ruby.blogspot.com/2008/12/benchmarking-makes-it-better.html

On Wed, Jun 17, 2009 at 11:06 AM, Joshua Ball<chezball@gmail.com> wrote:
>>
>> it's important to note that the two don't perform the same operation.
>> >> a << "B"
>> >> d
>> => "C"
>>



--
thanks,
-pate
-------------------------
 Don't judge those who choose to sin differently than you do

http://on-ruby.blogspot.com
http://eldersjournal.blogspot.com
Posted by Ftf 3k3 (ftf3k3)
on 2009-06-17 19:53
Joshua Ball wrote:
> A friend recently sent me this article:
> http://blog.metasploit.com/2009/03/blog-post.html
> 
> In particular, note the perf difference of += vs << :

Thanks for the tip.
Posted by Robert Dober (Guest)
on 2009-06-18 10:09
(Received via mailing list)
On Wed, Jun 17, 2009 at 7:29 PM, pat eyler<pat.eyler@gmail.com> wrote:
> that's a nice article about some real-world benchmarking.  I wish
> more people did things like this.
If you search the archives you might find a certain Robert preaching
never to use a += b where sequences are concerned. Do I feel clever
now? No, rather stupid.

Apologies for the lengthy code snippets.

Although I fully acknowledge the value of the post and that it might
be a life saver I would like to add that I pretty much have the
feeling that immutable is preferable over mutable.
And it seems that modern VMs (jruby, 1.9, ???) are more or less written
for that programming style. I am also aware that they make micro
benchmarks like the following even less meaningful, but please
consider it just as a Whack On The Head (nonviolently of course).

---------------------------------------------------------
512/19 > cat strings.rb

N = 10_000
b = "Wassitmean"
require 'benchmark'
Benchmark.bmbm do | bench |
  a = "Ruby Rules Re Rowld"
  bench.report "+=" do
    N.times do
      a += b
    end
  end
  a = "Ruby Rules Re Rowld"
  bench.report "<<" do
    N.times do
      a += b
    end
  end
end

513/20 > jruby -v strings.rb
jruby 1.3.0 (ruby 1.8.6p287) (2009-06-06 6586) (OpenJDK Client VM
1.6.0_0) [i386-java]
Rehearsal --------------------------------------
+=   1.256000   0.000000   1.256000 (  1.191000)
<<   9.384000   0.000000   9.384000 (  9.384000)
---------------------------- total: 10.640000sec

         user     system      total        real
+=  23.397000   0.000000  23.397000 ( 23.397000)
<<  52.953000   0.000000  52.953000 ( 52.953000)

ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]
Rehearsal --------------------------------------
+=   0.360000   0.020000   0.380000 (  0.406038)
<<   1.040000   0.130000   1.170000 (  1.209839)
----------------------------- total: 1.550000sec

         user     system      total        real
+=   1.770000   0.230000   2.000000 (  2.056577)
<<   2.410000   0.240000   2.650000 (  3.456429)


I believe that I hit the GC in JRuby with the default settings, and the
above might be an indication of how well short-lived object allocation
performs nowadays. Ruby1.9 has enough memory on my machine to be that
fast but still += is faster than <<.

Cheers
Robert


--
Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]
Posted by Robert Dober (Guest)
on 2009-06-18 10:13
(Received via mailing list)
On Thu, Jun 18, 2009 at 10:07 AM, Robert Dober<robert.dober@gmail.com> 
wrote:

Very interesting benchmarks indeed ARRRGH
Interesting how you can make happen what you want to happen, here are
the correct results
516/23 > ruby -v strings.rb
ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]
Rehearsal --------------------------------------
+=   0.370000   0.010000   0.380000 (  0.459725)
<<   0.000000   0.000000   0.000000 (  0.002819)
----------------------------- total: 0.380000sec

         user     system      total        real
+=   1.800000   0.230000   2.030000 (  2.145655)
<<   0.010000   0.000000   0.010000 (  0.003220)

518/25 > jruby -v strings.rb
jruby 1.3.0 (ruby 1.8.6p287) (2009-06-06 6586) (OpenJDK Client VM
1.6.0_0) [i386-java]
Rehearsal --------------------------------------
+=   1.350000   0.000000   1.350000 (  1.283000)
<<   0.023000   0.000000   0.023000 (  0.023000)
----------------------------- total: 1.373000sec

         user     system      total        real
+=  25.738000   0.000000  25.738000 ( 25.739000)
<<   0.004000   0.000000   0.004000 (  0.004000)

No happy surprises here, and BTW if you are bored, stop reading my
posts :(

Apologies
Robert

Posted by Marc Heiler (shevegen)
on 2009-06-18 10:52
> Ruby1.9 has enough memory on my machine to be that fast 
> but still += is faster than <<.

How should that be possible when += creates a new object
whereas << does not?
Posted by Robert Dober (Guest)
on 2009-06-18 11:08
(Received via mailing list)
On Thu, Jun 18, 2009 at 10:52 AM, Marc Heiler<shevegen@linuxmail.org> 
wrote:
>> Ruby1.9 has enough memory on my machine to be that fast
>> but still += is faster than <<.
>
> How should that be possible when += creates a new object
> whereas << does not?
Sorry please see my post above, I completely got lost.
But be careful, it could be possible indeed, object allocation in the
short living object pool could be way cheaper than copying into the
long living object pool. But my benchmark was ridiculous I should have
spotted the mistake.

Robert
Posted by Marc Heiler (shevegen)
on 2009-06-18 19:06
> But be careful, it could be possible indeed, object allocation in the
> short living object pool could be way cheaper than copying into the
> long living object pool. 
> But my benchmark was ridiculous I should have spotted the mistake.

But this statement is in conflict with what the pickaxe said years
ago about this - as far as I do remember it was said that << is
faster than +=

Now you say that this is perhaps not the case.

All I would like to know is whether this is the case here:

Is using += faster than << for string objects?

And if it is not, how is your statement about "short living
objects" vs "long living objects" meant to be understood?

Perhaps I am a bit slow, but reading the posts above I got
the impression that you claimed that += is faster than <<
Posted by Robert Dober (Guest)
on 2009-06-18 20:54
(Received via mailing list)
On Thu, Jun 18, 2009 at 7:06 PM, Marc Heiler<shevegen@linuxmail.org> 
wrote:
Sorry for the confusion

<< is much faster than += in JRuby and YARV

All the rest was speculation which was not worth the bandwidth I have
wasted, really bad, sorry.

I will stop speculating about what wonders generational GC might come
up with some day, because I am kind of confusing lots of folks, myself
being my first victim :(.

Cheers
Robert
Posted by Todd Benson (Guest)
on 2009-06-18 23:41
(Received via mailing list)
On 6/18/09, Robert Dober <robert.dober@gmail.com> wrote:
>  being my first victim :(.
>
>  Cheers
>
> Robert

One should only apologize if one did something _truly_ wrong :-)

I have a problem with current benchmarks, because I think the data
they capture covers only a limited set of the factors involved. For
example, if I get 500 nominally identical CPUs, each with one tiny
difference (different firmware, different network/graphics cards,
etc.), the results can differ even if you build with the same options.
Point being, the article pat showed us is enlightening, but not
all-encompassing.

I tend to think many people forget that little last bit.

Todd
Posted by Alexandre Hausen (Guest)
on 2009-06-19 20:16
(Received via mailing list)
Classical Copy&Paste bug ;-)
Lines 8 and 14 both read "a += b" in strings.rb.
However line 14 should read "a << b".


--
Alexandre
Posted by Charles Nutter (headius)
on 2009-07-04 00:04
(Received via mailing list)
On Thu, Jun 18, 2009 at 3:08 AM, Robert Dober<robert.dober@gmail.com> 
wrote:
>      a += b
>    end
>  end
>  a = "Ruby Rules Re Rowld"
>  bench.report "<<" do
>    N.times do
>      a += b
>    end
>  end
> end

Someone else noted the += in the << section, but there's another
issue: the "a" string is initialized only *once* for both rehearsal
and actual runs, since the body of the bmbm block is only executed
once to prepare the reports. If you modify it to put the a
initialization into the report blocks, it behaves more like you'd
expect. Here's a run with JRuby, with the bmbm above, "a" init fix,
"<<" fix, and 5 iterations (only last iteration shown):

Rehearsal --------------------------------------
+=   0.343000   0.000000   0.343000 (  0.343000)
<<   0.001000   0.000000   0.001000 (  0.001000)
----------------------------- total: 0.344000sec

         user     system      total        real
+=   0.343000   0.000000   0.343000 (  0.343000)
<<   0.001000   0.000000   0.001000 (  0.001000)

Here's JRuby all interpreted (no JIT compilation to bytecode):

Rehearsal --------------------------------------
+=   0.345000   0.000000   0.345000 (  0.345000)
<<   0.002000   0.000000   0.002000 (  0.002000)
----------------------------- total: 0.347000sec

         user     system      total        real
+=   0.356000   0.000000   0.356000 (  0.356000)
<<   0.002000   0.000000   0.002000 (  0.002000)

The numbers are basically the same because this bench is almost
completely limited by object allocation/GC and to a lesser extent
String performance for the two operations. But obviously << is faster
because it's growing the backing buffer for a single String rather
than creating a new one each time and copying the contents of the
previous string.
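
One way to make that allocation difference visible without timing anything is to count how often an operation hands back a different object than it was given. This is only an illustrative sketch; the helper name is hypothetical, and s + "A" stands in for s += "A" since that is what += expands to.

```ruby
# Counts how many of n steps return a String that is not the one passed in.
def new_objects_created(n)
  s = String.new("A")
  created = 0
  n.times do
    result = yield(s)
    created += 1 unless result.equal?(s)
    s = result
  end
  created
end

puts new_objects_created(1000) { |s| s << "A" } # 0
puts new_objects_created(1000) { |s| s + "A" }  # 1000
```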

Here's the same in Ruby 1.9:

Rehearsal --------------------------------------
+=   0.260000   0.510000   0.770000 (  0.766618)
<<   0.000000   0.000000   0.000000 (  0.002294)
----------------------------- total: 0.770000sec

         user     system      total        real
+=   0.250000   0.510000   0.760000 (  0.771757)
<<   0.000000   0.000000   0.000000 (  0.002235)

This was JRuby 1.4.0dev on current Apple Java 6.

> <<  52.953000   0.000000  52.953000 ( 52.953000)
The server VM would perform a lot better here than the client VM, but I
suspect the fact that the "a" string was never re-initialized and just
kept getting bigger was the main reason for this peculiar result.

> ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]
> Rehearsal --------------------------------------
> +=   0.360000   0.020000   0.380000 (  0.406038)
> <<   1.040000   0.130000   1.170000 (  1.209839)
> ----------------------------- total: 1.550000sec
>
>         user     system      total        real
> +=   1.770000   0.230000   2.000000 (  2.056577)
> <<   2.410000   0.240000   2.650000 (  3.456429)

I'm not sure why Ruby 1.9 did better here, but it could be that we
grow strings at different rates and so our strings get larger faster.
At any rate, in the fixed benchmark things look a lot better.

- Charlie
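
For reference, the two fixes discussed in this thread (using << in the second report, and initializing "a" inside each report block so every run starts from the same short string) could be applied to strings.rb roughly as follows. This is a sketch reconstructed from the descriptions above, not the exact script anyone in the thread ran; the helper method names are made up.

```ruby
require 'benchmark'

N = 10_000
B = "Wassitmean"

# Each helper starts from a fresh string, so the rehearsal and the real run
# do the same amount of work.
def run_plus_equals(n, suffix)
  a = String.new("Ruby Rules Re Rowld")
  n.times { a += suffix } # new String on every iteration
  a
end

def run_append(n, suffix)
  a = String.new("Ruby Rules Re Rowld")
  n.times { a << suffix } # grows the same String in place
  a
end

Benchmark.bmbm do |bench|
  bench.report("+=") { run_plus_equals(N, B) }
  bench.report("<<") { run_append(N, B) }
end
```

Both helpers build the same final string, so the only thing being compared is how it gets built.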