For or each?

tekwiz · September 20, 2008, 9:59pm

I just used the new roodi gem to check out some of my code that has a
lot of algorithmic code. It gave me a number of issues with the
phrase “Don’t use ‘for’ loops. Use Enumerable.each instead.” I prefer
for loops as opposed to using each simply because it’s what I’m used
to coming from C-style languages.

Example:

This is what I do:

for i in 0...str.size
…
end

This is what roodi would have me do

(0...str.size).each do |i|
…
end

Is there a real, substantive reason to use each instead of for? Or is
it simply just a preference issue?

Thanks,

tekwiz · September 20, 2008, 10:41pm

tekwiz wrote:

This is what roodi would have me do

(0...str.size).each do |i|
…
end

Is there a real, substantive reason to use each instead of for? Or is
it simply just a preference issue?

It leaves you closer to a refactor to .map or .inject or .select or
.reject or .delete_if or .each_index or .each_with_index or …

tekwiz · September 20, 2008, 10:45pm

Phlip wrote:

tekwiz wrote:

This is what roodi would have me do

(0...str.size).each do |i|

It leaves you closer to a refactor to .map or .inject or .select or
.reject or .delete_if or .each_index or .each_with_index or …

It also hints at:

str.each do |ch|

…

tekwiz · September 20, 2008, 10:53pm

Phlip wrote:

str.each do |ch|

That’d be each_char, I suppose. String#each is each_line.
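
For readers on a current Ruby, here is a minimal sketch of the three
String iterators named in this exchange. It assumes Ruby 1.9 or later,
where String no longer defines a plain #each at all; the sample string
and the collecting arrays are made up for illustration.

```ruby
# Assumption: Ruby 1.9+, where String has no plain #each.
str = "ab\ncd"

chars = []
str.each_char { |ch| chars << ch }      # one yield per character

lines = []
str.each_line { |line| lines << line }  # one yield per line, newline kept

bytes = []
str.each_byte { |b| bytes << b }        # one yield per byte, as an Integer
```

Each iterator walks the same string but yields a different unit, which
is why the name has to say which one is meant.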

tekwiz · September 20, 2008, 11:16pm

On Sep 20, 3:32 pm, Phlip wrote:

str.each do |ch|

So, it’s a code-readability issue and not a functional or complexity
issue?

tekwiz · September 20, 2008, 11:18pm

Phlip wrote:

That’d be each_char, I suppose. String#each is each_line.

IOW:

… or .each_char or .each_line or .each_byte or …

What?

Confused,
Sebastian

tekwiz · September 21, 2008, 12:05am

… or .each_char or .each_line or .each_byte or …

What?

I illustrated that you augmented the “or” list from my first post.

tekwiz · September 21, 2008, 12:16am

It’s interesting that array access using ‘each’ seems to be much
faster on my machine. In C, index-based for-loops are slow. It’s
faster to increment pointers. Maybe it’s similar under Ruby’s hood.

ruby 1.8.6 (OS X PPC)
Rehearsal --------------------------------------------------
for loop   1.200000   0.000000   1.200000 (  1.206533)
each       0.510000   0.000000   0.510000 (  0.511994)
----------------------------------------- total: 1.710000sec

               user     system      total        real
for loop   1.190000   0.000000   1.190000 (  1.190023)
each       0.500000   0.000000   0.500000 (  0.508636)

ruby 1.9 (Same OS X PPC)
Rehearsal --------------------------------------------------
for loop   2.370000   0.010000   2.380000 (  2.376402)
each       1.770000   0.000000   1.770000 (  1.775798)
----------------------------------------- total: 4.150000sec

               user     system      total        real
for loop   2.310000   0.010000   2.320000 (  2.316495)
each       1.780000   0.000000   1.780000 (  1.775958)

Joe

tekwiz · September 21, 2008, 12:20am

tekwiz wrote:

It leaves you closer to a refactor to .map or .inject or .select or
.reject or .delete_if or .each_index or .each_with_index or …

So, it’s a code-readability issue and not a functional or complexity
issue?

‘for’ is arguably more readable. And it’s not a performance issue - I
suspect the opcodes will be the same. It is very much a technical issue.

Good code is readable, minimal, and maintainable. Maintaining code
requires adding new features. Code should always be as ready for change
as possible, so many of our design rules (such as “object orientation”)
are really rubrics for improving the odds that the next change comes
easy.

This is an easier change…

array.each{|x| … } -> array.map{|x| … }

…than this:

for x in array … -> array.map{|x| … }

Further, your original example was very C-like. The iteration variable
was the array’s index. Most iteration directly addresses each element of
the array, without regard to its index. So ‘for i in 0...str.size’ is
often more excessive than ‘for x in str’.

Using .each leads to the correct mindset. Put another way, ‘for’ is an
obsolete concept - a legacy of languages without true iteration.
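
To make the refactoring argument concrete, here is a small sketch with
made-up data (the words array and the variable names are assumptions,
not code from the thread). It shows the same computation written three
ways, so the size of each rewrite can be compared.

```ruby
words = %w[one two three]

# Starting point: each, with the result built up by side effect.
lengths = []
words.each { |w| lengths << w.size }

# Moving to map: change the method name and let the block's value
# be collected, so the accumulator disappears.
lengths_via_map = words.map { |w| w.size }

# The for form gives the same result, but turning it into a map
# means replacing the whole statement rather than one method name.
lengths_via_for = []
for w in words
  lengths_via_for << w.size
end
```

All three arrays end up identical; only the distance to the map version
differs.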

tekwiz · September 21, 2008, 12:25am

Joe Wölfel wrote:

It’s interesting that array access using ‘each’ seems to be much
faster on my machine.

Does ‘for’ reevaluate its range after each tick? That would give ‘for’ a
single technical advantage over .each, in the very rare chance you need
that.

Otherwise, where does the time go?

A quick experiment with RubyNode just showed Ruby generates different
opcodes for ‘each’ and ‘for’. (By contrast, the notorious ternary
operator, ? :, generates the same opcodes as an equivalent ‘if then else
end’ construction.)

tekwiz · September 20, 2008, 11:05pm

str.each do |ch|

That’d be each_char, I suppose. String#each is each_line.

IOW:

… or .each_char or .each_line or .each_byte or …

tekwiz · September 21, 2008, 12:31am

I don’t think the minimal editing distance between #each and #map and
friends has anything to do with it. It is so unlikely that an #each
becomes an #inject that I don’t think that’s a good explanation for why
people prefer it over for. If an #each becomes an #inject you just go
and edit whatever you need. Nah, I don’t think so.

The analogous for loop is written like this

for item in collection
# do something with item
end

I think #each has become the common for-idiom in Ruby because, you
know, the community has converged to that choice by themselves. That’s
not very tangible, and perhaps may be due to the fact that blocks are
ubiquitous in Ruby and #each is syntactically closer in your head to a
lot of other stuff in Ruby.

I’d say #each is not that much favoured in ERb templates.

tekwiz · September 21, 2008, 12:34am

You can try my test if you like. I haven’t checked it carefully.
But it does seem like ‘each’ is much faster under both ruby 1.8.6 and
ruby 1.9. The code is below.

require 'rubygems'
require 'benchmark'

a = (1...10000).to_a
Benchmark.bmbm 15 do |bench|

  # Index-based loop: every element is looked up with a[i].
  bench.report "for loop" do
    x = 0
    100.times do
      for i in 0...(a.size)
        x += a[i]
      end
    end
  end

  # Iterator: every element is yielded straight to the block.
  bench.report "each" do
    x = 0
    100.times do
      a.each do |i|
        x += i
      end
    end
  end

end

Joe

tekwiz · September 21, 2008, 12:59am

On Sun, Sep 21, 2008 at 12:26 AM, Joe Wölfel wrote:

                   for i in 0...(a.size)
                           x += a[i]
                   end

Hey that’s not the for-version of #each, nobody iterates by index in
dynamic languages unless he needs to.

The message to the OP is: for versus each is a stylistic discussion in
the sense that they are equivalent if used the way I showed. A
regular loop over a collection using indexes is never used in dynamic
languages because it is much slower than array indexing in C. That’s
the price you pay for such flexible structures. In Ruby an array has
nothing to do behind the scenes with a C array.

That’s why VHLLs provide specific efficient and concise collection
iterators:

Perl

foreach my $user (@users) {
…
}

Ruby

for user in users
…
end

or

users.each do |user|
…
end

Python

for user in users:
…

Lisp

(dolist (user users)
(…))
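
One way to compare like with like is to extend the timing test posted
earlier in the thread with a third case that uses ‘for’ directly over
the items. This is only a sketch: the report labels and the sums hash
are additions for illustration, and the timings will vary by Ruby
version and machine, so no numbers are claimed here.

```ruby
require 'benchmark'

a = (1..10_000).to_a
sums = {}  # records each variant's result so they can be checked afterwards

Benchmark.bmbm(15) do |bench|
  # Index-based for loop, as in the original test.
  bench.report("for by index") do
    x = 0
    100.times do
      for i in 0...a.size
        x += a[i]
      end
    end
    sums[:for_index] = x
  end

  # for over the items themselves: the direct counterpart of #each.
  bench.report("for over items") do
    x = 0
    100.times do
      for item in a
        x += item
      end
    end
    sums[:for_items] = x
  end

  # Plain #each with a block.
  bench.report("each") do
    x = 0
    100.times do
      a.each { |item| x += item }
    end
    sums[:each] = x
  end
end
```

Comparing the second and third reports isolates ‘for’ against #each,
while the first report also pays for the a[i] lookups.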

tekwiz · September 21, 2008, 1:05am

On Saturday 20 September 2008 16:07:20 tekwiz wrote:

It also hints at:

str.each do |ch|

So, it’s a code-readability issue and not a functional or complexity
issue?

These things are not entirely separate – readable code is more likely
to be functional and maintainable.

Or, maybe a better way of saying it is, the code should not merely be
readable, it should be expressing your intent.

Each is far more abstract than for. Take a simple array:

a = ['one', 'two', 'three']
for i in 0...a.size
  puts "Give me a #{a[i]}"
end

That’s more prone to not work, as there are more visible moving parts,
which means more for you to think about, and more that can go wrong –
you might type the wrong variable name in the a[i], for example, or type
0..a.size instead of 0...a.size.

It also doesn’t express your intent. You don’t really need to know or
care where you are in that array, in this particular algorithm. You only
need to know which item you’re going to print right now. So:

['one', 'two', 'three'].each do |x|
  puts "Give me a #{x}"
end

Shorter, more readable (to me), quicker to type, and has the added
benefit that in the above example, that array falls out of scope as soon
as the loop ends, so it can be collected sooner.

It also gives you a bit more flexibility. Suppose you’re iterating over
something that someone passed in. With the index-based loop, that means
I have to pass in something that behaves like an array. It needs to have
an accurate [] method, probably supporting random access, even if you’ll
only access it sequentially. It needs to have a length method, etc.

Which makes things quite a bit more awkward. What if I’m reading lines
from a file? Your way would force me to count every line in the file
before I even get started. What if it’s a complex computation, like all
the primes less than a given number? I have to calculate them all out
ahead of time, and either store that array (wasting memory), or
recalculate them from the beginning.
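
The primes case can be sketched as follows. PrimesBelow and announce are
hypothetical names invented for this illustration, and trial division is
used only to keep the example short. The point is that the object
supplies nothing but #each: no [] method, no length, and no stored
array.

```ruby
# Hypothetical class: yields the primes below a limit one at a time.
class PrimesBelow
  include Enumerable

  def initialize(limit)
    @limit = limit
  end

  def each
    return enum_for(:each) unless block_given?

    (2...@limit).each do |n|
      # Trial division up to the integer square root of n.
      yield n if (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
    end
  end
end

# Hypothetical consumer: it only ever calls #each on its argument.
def announce(items)
  lines = []
  items.each { |x| lines << "Give me a #{x}" }
  lines
end

from_array  = announce(['one', 'two', 'three'])
from_primes = announce(PrimesBelow.new(10))
```

The same announce method accepts the array and the generated sequence,
which an index-based loop over 0...items.size could not do without
extra methods on PrimesBelow.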

There’s more to it, of course – you could imagine an each method which
runs in parallel threads, and I’m sure someone has written such a thing.

None of these will apply to every situation. It’s entirely possible it’s
an internal data structure. Even internal data structures could benefit
from some flexibility, but maybe you’ll never touch this algorithm
again.

But then, I don’t really see a downside to doing it with ‘each’, instead
of ‘for’, other than that ‘for’ is what you’re used to.

tekwiz · September 21, 2008, 1:06am

On Sun, Sep 21, 2008 at 12:52 AM, Phlip wrote:

just an /ad populum/ thing…)

Another reason to use .each is your collection-like class might override it
to do something cool…

There’s a technical difference in scopes between them, but for indeed
calls #each. A for loop

for user in users
…
end

assumes that users is whatever object that responds to #each, calls
the iterator on users and yields the value.
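
A short sketch of that delegation, using a made-up Countdown class (the
class and variable names are assumptions for illustration). The class
defines only #each, with no [] and no size, and a for loop still works
over it.

```ruby
# Hypothetical collection: the only iteration method it defines is #each.
class Countdown
  def initialize(from)
    @from = from
  end

  def each
    @from.downto(1) { |n| yield n }
  end
end

seen = []
for n in Countdown.new(3)   # for calls Countdown#each behind the scenes
  seen << n
end
```

After the loop, seen holds the values that #each yielded, in the order
it yielded them.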

tekwiz · September 21, 2008, 1:16am

Philip, I hadn’t thought about your refactoring arguments. I think
I’m swayed by them. I do change an each to a map, etc., on
occasion. Also, it seems simpler and less confusing to have one
simple grammatical construction that does so many things.

tekwiz · September 21, 2008, 1:00am

Joe Wölfel wrote:

You can try my test if you like. I haven’t checked it carefully.
But it does seem like ‘each’ is much faster under both ruby 1.8.6 and
ruby 1.9. The code is below.

I just experimented with ‘for’ and found it does not reevaluate its
header each time it runs. That’s the only thing that could have
explained the time difference, so maybe Matz & Co. have simply neglected
‘for’ while optimizing .each, which everyone uses. (Though it’s still
technically superior; not just an /ad populum/ thing…)

Another reason to use .each is your collection-like class might override
it to do something cool…

tekwiz · September 21, 2008, 2:19am

On Sun, Sep 21, 2008 at 07:04:49AM +0900, Joe Wölfel wrote:

It’s interesting that array access using ‘each’ seems to be much faster on
my machine. In C, indexed-based for-loops are slow. It’s faster to
increment pointers. Maybe it’s similar under Ruby’s hood.

Actually for loops are faster than each. Since it doesn’t
introduce a block, there’s no extra scope created. Not much faster,
but they are used in the computer language shootout, for instance.

You folks can argue all you want about the look of the for but
you’re forgetting the utility of having two nice choices. One which
creates scope and one that doesn’t. Don’t let this Roodi lib boss
you around! You can make up your own mind about things.
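
The scope difference can be seen in a small sketch like this one. The
method name scope_demo and the variable names are invented for the
example; the behaviour shown is that of Ruby 1.9 and later.

```ruby
def scope_demo
  # for does not open a new scope: i and from_for belong to the method.
  for i in 1..3
    from_for = i * 10
  end

  # The block passed to each has its own scope: j and from_each stay inside.
  (1..3).each do |j|
    from_each = j * 10
  end

  {
    i: i,                                  # last value of the loop variable
    from_for: from_for,                    # still readable after the loop
    j_defined: defined?(j),                # nil, the block parameter is gone
    from_each_defined: defined?(from_each) # nil, the block local is gone
  }
end

result = scope_demo
```

Which behaviour is preferable depends on whether you want the loop's
variables afterwards, which is the choice being described here.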

_why

tekwiz · September 21, 2008, 2:13am

You cannot rationalize a convention. Conventions happen, it is
difficult to explain why things are the way they are when they are
mostly stylistic.

You use two spaces in Ruby because you want your code to be idiomatic.
Can it be said that two spaces are obviously better than four or
eight? I don’t think so, it is just a convention. And when you write
Perl or Java you use four. That’s it.

In my opinion you use #each in your Ruby code because that’s what
people use. That’s what the first book you read uses, that’s what
everybody writes. Your code is supposed to use #each, you learn that
when you learn Ruby and probably force a change in your mind a priori
if you come from almost any other language. Just to follow the
conventions and write code that resembles what the community has
converged on.

You can argue that the convention has converged because iterators blah
and yielding blah, but

for user in users
…
end

is crystal clear, readable, has all the benefits of #each because it
uses #each, and what not.

As a counterargument, in ERb templates for-loops are not perceived as
“funny” and they are commonly used.