Forum: Ruby Benchmark obsession?

Posted by Jan E. (jacques1)
on 2012-06-20 16:43
Hi,

After having read this list for a while, I wonder why some of you put so
much weight on speed optimizations. I'm not talking about big things
that really make sense but small stuff like "Don't use symbols, they
can't be garbage collected", "Don't concatenate strings, use string
interpolation instead", "Don't use Enumerable#inject to build up
objects" etc. etc.

In my opinion, this is like trying to get a classic car faster. It just
makes no sense. Ruby isn't about speed, it's about elegance and clarity.
If you're looking for speed, you've got the wrong language. Use C or
whatever.

I'd always prefer an elegant solution over a fast one. For example, I
love functional style programming with Enumerable#inject. I don't care
if it's some milliseconds slower than assigning the values to a
variable.
Posted by Bartosz Dziewoński (matmarex)
on 2012-06-20 17:23
(Received via mailing list)
Speed does sometimes matter, even in these kinds of micro-benchmarks.
Maybe you're writing a JSON processor, maybe a parser, maybe a math
library. Of course, most times you don't, and you shouldn't care about
being a millisecond faster.

And now I can't not comment on the three examples you brought up.

"Don't use symbols, they can't be garbage collected" – this sounds
like something someone who doesn't know what the hell they are doing
would say. If there is *ever* a case where you have created enough
symbols to create a visible memory footprint, you are doing something
extremely wrong. Using symbols where you would use magic constants or
enums in a language like C, or as "static" (as in, not dynamically
generated, or limited to a certain number – for example columns of a
table in a database) keys of a hash is perfectly okay and it is near
impossible for this to cause problems with GC (a symbol's text is
internally only stored *once*, and symbols are special-cased to avoid
memory indirection in C code and passed around as fake-pointers; the
only other class treated like this is Fixnum). In fact, replacing
constant strings with symbols can lead to faster code. (Disclaimer: I
didn't benchmark this.)
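
A minimal sketch of the "stored only once" point, using nothing beyond core Ruby. The name `status` is just an illustration. It only shows object identity and says nothing about speed. (On Ruby 2.2 and later, symbols created dynamically, for example via String#to_sym, can be garbage collected, so the original worry applies mostly to older interpreters.)

```ruby
# Two occurrences of the same symbol literal refer to one and the same object.
first  = :status
second = :status
puts "symbols identical: #{first.equal?(second)}"

# Two separately built strings with equal content are distinct objects.
a = String.new("status")
b = String.new("status")
puts "strings identical: #{a.equal?(b)}"
puts "strings equal:     #{a == b}"
```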

"Don't concatenate strings, use string interpolation instead" – I'd
say that in most cases string interpolation is clearer than
concatenating, especially when a lot of ".to_s" calls would have to be
used.

"Don't use Enumerable#inject to build up objects" – I am morally
opposed to using #inject for anything other than actually folding an
array into a single value, which is what it was intended for.
Injecting with an object, as clever as it is, is not clear code.

-- Matma Rex
Posted by Robert Klemme (robert_k78)
on 2012-06-20 18:02
(Received via mailing list)
On Wed, Jun 20, 2012 at 4:43 PM, Jan E. <lists@ruby-forum.com> wrote:
> After having read this list for a while, I wonder why some of you put so
> much weight on speed optimizations.

I can't say I have observed an _obsession_ with things like these.
Maybe it's just the fun or that it is so easy to Benchmark. :-)

> I'm not talking about big things
> that really make sense but small stuff like "Don't use symbols, they
> can't be garbage collected",

That is certainly a stupid general rule because in some situations
this is exactly what you want: identifiers which do not need to be
created and which are not GC'ed.

> "Don't concatenate strings, use string
> interpolation instead", "Don't use Enumerable#inject to build up
> objects" etc. etc.

As I said, I haven't made the same observation you apparently have.
I do not read most of the traffic here, but if it really were
obsessive I think I would have noticed.  Strange how perceptions can
be so different.

Cheers

robert
Posted by Bill Paulson (wpaulson)
on 2012-06-20 18:43
(Received via mailing list)
I'd say more like a "common fallacy" than an obsession. Indeed, it's a 
common problem across all languages, not just Ruby. I recall advice in 
Basic that said "don't use new lines unless you need a label." In thirty 
years of coding and performance evaluation, I can't recall a case where 
micro-tuning was sufficient to solve a performance issue, yet there are 
many times I've seen it used.

Sent from my iPhone
Posted by Tony Arcieri (Guest)
on 2012-06-20 19:01
(Received via mailing list)
On Wed, Jun 20, 2012 at 7:43 AM, Jan E. <lists@ruby-forum.com> wrote:

> After having read this list for a while, I wonder why some of you put so
> much weight on speed optimizations. I'm not talking about big things
> that really make sense but small stuff like "Don't use symbols, they
> can't be garbage collected", "Don't concatenate strings, use string
> interpolation instead", "Don't use Enumerable#inject to build up
> objects" etc. etc.
>

See http://twitter.com/roflscaletips

> In my opinion, this is like trying to get a classic car faster. It just
> makes no sense. Ruby isn't about speed, it's about elegance and clarity.
> If you're looking for speed, you've got the wrong language. Use C or
> whatever.


There are some very egregious things that libraries can do which they
shouldn't and will significantly affect the performance of all running
code, like altering the class hierarchy at runtime and thus invalidating
all method caches at all call sites. This is bad and people should call
that thing out ("DCI" people, I'm looking at you...)

However, when it comes to microoptimizing your Ruby code like that, you
should probably be using something like perftools to measure. Code has
different performance characteristics in different scenarios, so unless
you have some real-world code you're trying to make faster, it's kind of
a pointless exercise. If you do have said code, you should optimize it
in a data-driven way. The best speedups you get will probably be from
using code with better algorithmic properties and not from
microoptimizing minutiae.
Posted by Dan Connelly (djconnel)
on 2012-06-20 19:04
The real question is: how many microtunes does it take for the advantage 
to offset the cost of one additional bug?

A: a lot.

Clarity and simplicity is almost always the best approach, I feel.
Posted by pat eyler (Guest)
on 2012-06-20 21:08
(Received via mailing list)
On Wed, Jun 20, 2012 at 11:04 AM, Dan Connelly <lists@ruby-forum.com> 
wrote:
> The real question is: how many microtunes does it take for the advantage
> to offset the cost of one additional bug?
>
> A: a lot.

or a micro-optimization that makes something that happens many, many
times faster.

if performance is not acceptable, and profiling indicates a spot that
needs fixing, a micro-optimization could be the right thing to do.

>
> Clarity and simplicity is almost always the best approach, I feel.

Clarity, simplicity, and profiling if/when you run into problems.
Posted by Ryan Davis (Guest)
on 2012-06-20 23:57
(Received via mailing list)
On Jun 20, 2012, at 07:43 , Jan E. wrote:

I kinda feel like I'm being called out as I'm on record (many times) for 
2/3rds of your examples so I'll address them specifically:

> "Don't concatenate strings, use string interpolation instead"

Using a recent real example from this list where I suggested 
interpolation:

    m.ClassMethodString + " " + m.ClassMethodString + ": " +
      m.ClassMethodInteger.to_s

mmmmm java code.

1) slower        -- several more method calls
2) wasteful      -- creates much more garbage
3) longer/uglier -- I'd argue that it is much less elegant

vs

    "#{m.ClassMethodString} #{m.ClassMethodString}: #{m.ClassMethodInteger}"

1) clarity   -- it's just a string.
2) elegance  -- it's JUST a string AND you don't need those stupid #to_s calls.
3) efficient -- takes less time, uses less memory, makes less garbage, and even easier to read.
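
A rough way to check the "slower" claim on your own interpreter, as a sketch only. `Record`, its attribute names and the `time_it` helper are made up for this example and stand in for `m` above. The timings depend heavily on Ruby version and workload, so no numbers are claimed here.

```ruby
# Hypothetical stand-in for `m`; attribute names are invented for this sketch.
Record = Struct.new(:label, :category, :quantity)
M = Record.new("alpha", "beta", 42)

def by_concatenation(m)
  m.label + " " + m.category + ": " + m.quantity.to_s
end

def by_interpolation(m)
  "#{m.label} #{m.category}: #{m.quantity}"
end

# Wall-clock seconds taken by n calls of the given block.
def time_it(n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

n = 200_000
concat_time = time_it(n) { by_concatenation(M) }
interp_time = time_it(n) { by_interpolation(M) }

puts "same result:   #{by_concatenation(M) == by_interpolation(M)}"
puts "concatenation: #{concat_time}s"
puts "interpolation: #{interp_time}s"
```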

> "Don't use Enumerable#inject to build up objects" etc. etc.

Given #inject's other alias, #reduce, it is obvious that you don't use 
#inject for building up other objects. Even in a functional style of 
programming you'd _never_ see it building up anything. You'd see it 
REDUCING (folding) an object. If #inject is applied in a non-folding 
manner, it isn't functional, it is just dumb. Don't pretend otherwise 
(and if you do pretend otherwise, go read more books on lisp--start with 
SICP). The second I see a semicolon (or return) in an inject, I 
immediately suspect that someone is writing clevar/stupid code.

I don't have any recent examples from the list, but I'm on record in 
multiple mediums ranting against people who use #inject improperly. I'll 
make up one based on examples I've seen time and time again:

    return im_a_lazy_bastard.inject(Hash.new 0) { |h, o| h[o.really_really_lazy] += 1; h }

vs

    counter = Hash.new 0
    thingies.each do |o|
      counter[o.key] += 1
    end
    return counter

1) I use #each because it adds CLARITY. I want to enumerate each element. I'm not folding anything.
2) Yes, it's faster. I don't actually care about that nearly as much as #1.
3) Yes, it is more lines:
   1) but only if you write the inject version that way.
   2) I use the Weirich Method [1][2] of choosing {} vs do/end. INCREASING clarity and intent.
   3) each line is a stand-alone concept that helps increase clarity.

Here is a perfect example of an actual folding application of #inject:

    classname.split(/::/).inject(Object) { |k, n| k.const_get n }

vs:

    k = Object
    classname.split(/::/).each { |n| k = k.const_get n }
    k

As you can see, the #inject version is incredibly clear and concise. The 
second example takes longer to figure out. That is what the natural fit 
of a well designed method is supposed to do.
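
To try the folding version against real constants, here is a small sketch. The method name `resolve` is invented for this example, and `File::Stat` and `Process::Status` are simply nested classes that ship with core Ruby. (On Ruby 2.0 and later, `Object.const_get` should also accept a namespaced name such as "File::Stat" directly.)

```ruby
# Fold a "A::B::C" name into the constant it names, starting from Object.
def resolve(classname)
  classname.split(/::/).inject(Object) { |k, n| k.const_get n }
end

p resolve("File::Stat")
p resolve("Process::Status")
```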

---

Come to think of it (!!!) I DO have a real world example of inject that 
I used in my Ruby Sadism talk:

if MODELS.keys.inject(true) {|b, klass|
     b and klass.constantize.columns.map(&:name).include? association.options[:foreign_key]
   } then
  # ...
end

Have fun with that... It's probably the most egregious use of inject 
I've ever found. The original author actually argued that he wrote it 
that way "for maintainability".
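
For comparison, a sketch of what that expression boils down to: "and"-ing a condition over every element is what Enumerable#all? does. The `COLUMNS` hash and both method names are hypothetical stand-ins, since the original depends on Rails models that are not shown here.

```ruby
# Hypothetical stand-in for the Rails models: model name => column names.
COLUMNS = {
  "Order"   => %w[id user_id total],
  "Invoice" => %w[id user_id due_on]
}

# The shape of the original: fold a boolean through every key.
def all_have_column_inject?(column)
  COLUMNS.keys.inject(true) { |ok, name| ok and COLUMNS[name].include?(column) }
end

# The same question asked directly.
def all_have_column?(column)
  COLUMNS.values.all? { |names| names.include?(column) }
end

p all_have_column_inject?("user_id")
p all_have_column?("user_id")
p all_have_column?("total")
```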

[1]: http://onestepback.org/index.cgi/Tech/Ruby/BraceVsDoEnd.rdoc
[2]: http://talklikeaduck.denhaven2.com/2007/10/02/ruby...
Posted by Jan E. (jacques1)
on 2012-06-21 00:51
Ryan Davis wrote in post #1065406:
> I kinda feel like I'm being called out as I'm on record (many times) for
> 2/3rds of your examples so I'll address them specifically:

Well, it seems those examples were a bit ambiguous. I'm *not* arguing
against string interpolation etc. I'm against using them for the sole
purpose of saving some bytes and CPU cycles.

If you use string interpolation for clarity, that's perfect. I fully
agree with you. The same goes for inject (with "building up an object" I
actually meant "building up the aggregate value" -- so there's no
disagreement on that).

My point is that we should focus on readability, clarity, elegance etc.
rather than do everything to make our programs run a bit faster. That's
just not what Ruby is for (at least to my understanding).
Posted by Dan Connelly (djconnel)
on 2012-06-21 01:44
Jan E. wrote in post #1065412:

> If you use string interpolation for clarity, that's perfect. I fully
> agree with you. The same goes for inject (with "building up an object" I
> actually meant "building up the aggregate value" -- so there's no
> disagreement on that).

If I have a choice between writing:

puts "a = #{a}\n"\
     "b = #{b}"

and

$stdout << "a = " << a << "\nb = " << b << "\n"

which is clearer?  C++ fans might prefer the second, while I prefer the 
first.  In any case, I'm glad to hear the first happens to be faster, as 
well :).
Posted by Avdi Grimm (Guest)
on 2012-06-21 02:50
(Received via mailing list)
On Jun 20, 2012 5:57 PM, "Ryan Davis" <ryand-ruby@zenspider.com> wrote:
>
>
>
>    counter = Hash.new 0
>    thingies.each do |o|
>      counter[o.key] += 1
>    end
>    return counter
>

I'm curious if you consider #each_with_object a reasonable choice for 
this.
Posted by Sam Duncan (Guest)
on 2012-06-21 03:08
(Received via mailing list)
On 21/06/12 12:49, Avdi Grimm wrote:
> >    end
> >    return counter
> >
>
> I'm curious if you consider #each_with_object a reasonable choice for
> this.
>
> --
> Avdi
>
Is that basically the same thing wrapped in another method so that
counter and o are yielded to a block?

def each_with_object(memo)
   return to_enum :each_with_object, memo unless block_given?
   each do |element|
     yield element, memo
   end
   memo
end
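
Applied to the counting example, the built-in Enumerable#each_with_object would look roughly like this. `Thing`, `THINGIES` and `count_by_key` are invented for the sketch. Note that the block receives the element first and the memo second, the reverse of #inject, and that the memo is returned without the block having to hand it back.

```ruby
# Hypothetical element type with a `key` attribute, as in the example above.
Thing = Struct.new(:key)
THINGIES = [Thing.new(:a), Thing.new(:b), Thing.new(:a)]

def count_by_key(things)
  things.each_with_object(Hash.new(0)) { |o, counter| counter[o.key] += 1 }
end

p count_by_key(THINGIES)
```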

Sam
Posted by Matthew Kerwin (mattyk)
on 2012-06-21 04:43
(Received via mailing list)
I'll add to the string interpolation issue with an anecdote: I've had
real world examples (in projects about which I'm forbidden to talk for
legal reasons) where a refactoring from:

  foo = "a" + b + "c"

type string assembly to:

  foo = ""; foo << "a" << b << "c"

caused an immense speedup (we're talking tens of minutes here),
reduced the memory footprint dramatically, and generally made our
lives on the floor that little bit easier.  Of course for something
like my abc example above I'd definitely use "a#{b}c" because it's
more readable (as well as everything else); but with large document
generation sometimes interpolation is just not feasible.

Ryan mentioned Java; the concatenation optimisation is exactly
analogous to a previous time in the same company I achieved a very
similar improvement by converting Java Strings to StringBuilders.

It's still not interpolation, but it can have a genuine, measurable
effect.  Knowing that + creates all those new instances while <<
doesn't can be useful and practical knowledge.
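
A small sketch of that difference, for anyone who wants to see it rather than take my word for it. The two `build_*` method names are made up for this example, and it demonstrates only the object identity, not the size of the speedup, which will depend on the workload.

```ruby
buffer = String.new("a")

plus_result   = buffer + "b"   # allocates a new String; buffer is untouched
shovel_result = buffer << "c"  # appends in place and returns buffer itself

p plus_result.equal?(buffer)
p shovel_result.equal?(buffer)
p buffer

# Building a document from many parts: + makes a fresh String per step,
# while << keeps growing a single buffer.
def build_with_plus(parts)
  out = ""
  parts.each { |part| out = out + part }
  out
end

def build_with_shovel(parts)
  out = String.new("")
  parts.each { |part| out << part }
  out
end

p build_with_plus(%w[a b c]) == build_with_shovel(%w[a b c])
```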

Caveat: I'm pretty damned sure Ruby was not the right language to be
using on that project. One makes do with what one is given.

--
 Matthew Kerwin, B.Sc (CompSci) (Hons)
 http://matthew.kerwin.net.au/
 ABN: 59-013-727-651

 "You'll never find a programming language that frees
 you from the burden of clarifying your ideas." - xkcd
Posted by botp (Guest)
on 2012-06-21 05:23
(Received via mailing list)
On Thu, Jun 21, 2012 at 8:49 AM, Avdi Grimm <groups@inbox.avdi.org> 
wrote:
>>
>
> I'm curious if you consider #each_with_object a reasonable choice for this.

indeed. i have 3 solns for this using 1) tap, 2) inject, 3)
each_with_object (or .each.with_object).

(Hash.new(0)).tap{|h| thingies.each{|i| h[i] += 1} }

thingies.inject(Hash.new(0)){|h,i| h[i] += 1; h}

thingies.each.with_object(Hash.new(0)){|i,h| h[i] += 1}

looking at inject.. hmm, not sure. it does not seem so bad.. unless i
be so dogmatic.. nah, i've multiple religion.. it's more fun :)

best regards -botp
Posted by Avdi Grimm (Guest)
on 2012-06-21 06:30
(Received via mailing list)
On Wed, Jun 20, 2012 at 9:07 PM, Sam Duncan <sduncan@wetafx.co.nz> 
wrote:

> Is that basically the same thing wrapped in another method so that counter
> and o are yielded to a block?
>

Yes, and it's part of Enumerable:
http://ruby-doc.org/core-1.9.3/Enumerable.html#met...
Posted by Robert Klemme (robert_k78)
on 2012-06-21 08:21
(Received via mailing list)
On Thu, Jun 21, 2012 at 6:28 AM, Avdi Grimm <groups@inbox.avdi.org> 
wrote:
> On Wed, Jun 20, 2012 at 9:07 PM, Sam Duncan <sduncan@wetafx.co.nz> wrote:
>>
>> Is that basically the same thing wrapped in another method so that counter
>> and o are yielded to a block?
>
> Yes, and it's part of Enumerable:
> http://ruby-doc.org/core-1.9.3/Enumerable.html#met...

Enumerable also has #group_by:
http://ruby-doc.org/core-1.9.3/Enumerable.html#met...

thingies.group_by(&:key)
thingies.group_by(&:key).map {|o,y| [o,y.length]}
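
Note that the second line yields an array of [key, count] pairs rather than a Hash. A sketch of how it lines up with the counter version, with `Thing`, `THINGIES` and both method names invented for illustration:

```ruby
# Hypothetical element type with a `key` attribute.
Thing = Struct.new(:key)
THINGIES = [Thing.new(:a), Thing.new(:b), Thing.new(:a)]

# group_by, then turn the [key, count] pairs into a Hash.
def counts_via_group_by(things)
  Hash[things.group_by(&:key).map { |key, group| [key, group.length] }]
end

# The explicit counter from earlier in the thread.
def counts_via_each(things)
  counter = Hash.new 0
  things.each { |o| counter[o.key] += 1 }
  counter
end

p counts_via_group_by(THINGIES)
p counts_via_each(THINGIES)
```

The group_by version builds an intermediate array per key, so it trades some memory for brevity.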

Kind regards

robert
Posted by Henry Maddocks (Guest)
on 2012-06-22 01:31
(Received via mailing list)
On 21/06/2012, at 9:50 AM, Ryan Davis wrote:

> Given #inject's other alias, #reduce, it is obvious that you don't use
> #inject for building up other objects. Even in a functional style of
> programming you'd _never_ see it building up anything. You'd see it
> REDUCING (folding) an object. If #inject is applied in a non-folding
> manner, it isn't functional, it is just dumb. Don't pretend otherwise
> (and if you do pretend otherwise, go read more books on lisp--start with
> SICP). The second I see a semicolon (or return) in an inject, I
> immediately suspect that someone is writing clevar/stupid code.
>
> I don't have any recent examples from the list, but I'm on record in
> multiple mediums ranting against people who use #inject improperly. I'll
> make up one based on examples I've seen time and time again:

I come across this quite often, especially in Rails apps.

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

c.inject([]) {|memo, run| memo + run[:list] }

I always cringe when I see it but I haven't found an alternative that is
as clear and concise. collect and flatten looks ugly. I'd love to be
able to do...

c.collect {|run| *run[:list]}


Henry
Posted by Justin Collins (Guest)
on 2012-06-22 02:32
(Received via mailing list)
On 06/21/2012 04:30 PM, Henry Maddocks wrote:
>> return) in an inject, I immediately suspect that someone is writing
>
>
> Henry

You could move the array outside:

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

all = []

c.each { |h| all.concat h[:list] }

Saves a little memory, too?

-Justin
Posted by Jan E. (jacques1)
on 2012-06-22 07:42
Justin Collins wrote in post #1065607:
> Saves a little memory, too?

Quod erat demonstrandum. :-D
Posted by botp (Guest)
on 2012-06-22 09:37
(Received via mailing list)
On Fri, Jun 22, 2012 at 7:30 AM, Henry Maddocks <hmaddocks@me.com> 
wrote:
> clear and concise.
> collect and flatten looks ugly. I'd love to be able to do...
>
> c.collect {|run| *run[:list]}

c.collect {|run| run[:list]} . flatten

or if there are only few elements,

a[:list] + b[:list]

kind regards -botp
Posted by Lars Haugseth (Guest)
on 2012-06-22 09:59
(Received via mailing list)
On 06/22/2012 01:30 AM, Henry Maddocks wrote:
> I always cringe when I see it but I haven't found an alternative that is
> as clear and concise.
> collect and flatten looks ugly. I'd love to be able to do...
>
> c.collect {|run| *run[:list]}

c.flat_map {|run| run[:list]}
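
A quick sketch comparing the variants from this thread on Henry's data. The `nested` array at the end is an extra example added here to show one difference: flat_map removes a single level of nesting, while flatten without an argument flattens all the way down.

```ruby
a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}
c = [a, b]

via_inject   = c.inject([]) { |memo, run| memo + run[:list] }
via_flat_map = c.flat_map { |run| run[:list] }
via_flatten  = c.collect { |run| run[:list] }.flatten

p via_flat_map
p via_inject == via_flat_map
p via_flatten == via_flat_map

# Hypothetical nested data: the two approaches diverge here.
nested = [{:list => [[1, 2], [3]]}]
p nested.flat_map { |run| run[:list] }
p nested.collect { |run| run[:list] }.flatten
```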
Posted by Henry Maddocks (Guest)
on 2012-06-22 11:15
(Received via mailing list)
On 22/06/2012, at 7:58 PM, Lars Haugseth <ruby-talk@larshaugseth.com> 
wrote:

> c.flat_map {|run| run[:list]}

Excellent.  Never heard of it.

Henry
Posted by Avdi Grimm (Guest)
on 2012-06-24 20:12
(Received via mailing list)
On Jun 22, 2012 3:59 AM, "Lars Haugseth" <ruby-talk@larshaugseth.com> 
wrote:
> c.flat_map {|run| run[:list]}

Dude. If I knew that one I'd forgotten it. Thanks!