Ruby 1.9 and collect

Under ruby 1.8.6, running

[1, 2, 3].collect

results in an array of [1, 2, 3]. In 1.9, however, running

[1, 2, 3].collect

results in an Enumerator, which I then have to call .to_a on.
Alternately, I can supply a block

[1, 2, 3].collect {|x| x }

and end up with an array. So I guess that means that the default block
of {|x| x} was removed from the function. I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me. I really liked the elegance of opening
up a file and just calling collect to turn it into an array.
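
For reference, the behaviour described above can be reproduced on Ruby 1.9 or later with a short sketch. The variable names are illustrative only and are not from the original post.

```ruby
# Ruby 1.9+: collect without a block returns an Enumerator, not an Array.
numbers = [1, 2, 3]

enum = numbers.collect
enum_class = enum.class

# Either convert the enumerator explicitly...
from_enum = enum.to_a

# ...or pass the identity block that 1.8 effectively assumed.
from_block = numbers.collect { |x| x }
```

Both from_enum and from_block end up as plain arrays holding the original elements.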

On Sat, Dec 5, 2009 at 12:40 PM, Raul J. wrote:

[1, 2, 3].collect {|x| x }

and end up with an array. So I guess that means that the default block
of {|x| x} was removed from the function. I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me. I really liked the elegance of opening
up a file and just calling collect to turn it into an array.

Actually I find it counterintuitive that Enumerable#collect works
without a block in Ruby 1.8. In fact, that little quirk isn’t
documented in the 1.8 version of the pickaxe.

And I would suggest that even more elegant (and intention revealing)
than calling collect to turn something into an array would be to use
to_a, which will work on any Enumerable, including files, since File
is a subclass of IO and IO mixes in Enumerable.
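
A minimal sketch of that point, assuming a throwaway temporary file (the file name and contents are made up for the demonstration):

```ruby
require 'tempfile'

# File inherits from IO, and IO mixes in Enumerable.
file_is_enumerable = File.ancestors.include?(Enumerable)

# to_a works the same way on any Enumerable.
range_items = (1..3).to_a
hash_pairs  = { a: 1 }.to_a

# Hypothetical sample file, created only for this demonstration.
file_lines = Tempfile.create('collect_demo') do |tmp|
  tmp.write("first\nsecond\n")
  tmp.flush
  File.open(tmp.path) { |f| f.to_a }
end
```

Because to_a comes from Enumerable, the same call reads as "give me an array" whether the receiver is a range, a hash or an open file.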


Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

That was exactly the explanation I was looking for. Thanks.

On Saturday 05 December 2009 11:40:54 am Raul J. wrote:

Under ruby 1.8.6, running

[1, 2, 3].collect

results in an array of [1, 2, 3].

Huh. That’s odd.

[1, 2, 3].collect

results in an Enumerator, which I then have to call .to_a on.

Given that the original is an array, you probably wouldn’t actually do
this.
Let’s see…

I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me.

I agree with Rick – the basic idea here is that all enumerator methods
will turn into Enumerators when called without a block. I especially
like doing it with variants of each.

I really liked the elegance of opening
up a file and just calling collect to turn it into an array.

An array of what? I wouldn’t call that elegant, I’d call that
confusingly magical. I think this is much easier to understand:

lines = open('foo') {|file| file.each_line.to_a}

Now I know exactly what’s happening – make an array of each line in the
file. You could argue that this should be the default, but if you just
did ‘each’, it’s not obvious. I could, after all, be doing something
like this:

chars = open('foo') {|file| file.each_char.to_a}

Or, if it’s a binary file, something like this:

bytes = open('foo') {|file| file.each_byte.to_a}

You could also argue that the to_a should be unnecessary, but I don’t
agree. For one, it’s really very useful to have that kind of shorthand
for an Enumerator, for those few times it makes sense. For example:

open 'foo' do |file|
  lines = file.each_line
  begin
    loop do
      line = lines.next.chomp
      # A trailing backslash marks a continued line; join it with the next.
      while line =~ /\\$/
        line = line.chop + lines.next.chomp
      end
      # do something with each continued line
    end
  rescue StopIteration
  end
end

I’m sure I could come up with other examples, too, including silly ones
like:

(1...10).each_cons(2).map{|a,b| a*b}

Basically, I’m trying to say that Enumerators are cool – I can think of
all kinds of crazy uses for them, including one-liners like the above
that rely on this behavior. I can’t really think of a good reason for
them to return arrays – after all, you can always do a to_a at the end
of a chain like that, if you actually need an array.

And anyway, if it’s just something that someone will be calling ‘each’
on at some point, you probably don’t need an array, and it’s more
efficient to use an enumerator anyway. After all, here’s another
contrived example:

char_pairs = open('foo') {|file| file.each_char.each_slice(2).to_a }
line_pairs = open('foo') {|file| file.each_line.each_slice(2).to_a }

Both would work if each_char/each_line made an array, but the above is
much more efficient.
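
The efficiency argument can be sketched as follows. StringIO stands in for an open file so that the example is self-contained; that substitution and the sample data are assumptions, not part of the original post.

```ruby
require 'stringio'

# StringIO stands in for an open file here (an assumption for the demo).
io = StringIO.new("a\nb\nc\nd\n")

# Chaining enumerators: each_slice groups lines as they are read, so no
# intermediate array of all lines is built first.
line_pairs = io.each_line.each_slice(2).to_a

# An enumerator only reads as much as the consumer asks for.
io.rewind
first_two = io.each_line.first(2)
stopped_early = !io.eof?
```

After first(2) the stream has not reached its end, which is the saving an array-returning each_line could not offer.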

On Sat, Dec 5, 2009 at 3:19 PM, David M. wrote:

On Saturday 05 December 2009 11:40:54 am Raul J. wrote:

I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me.

I agree with Rick – the basic idea here is that all enumerator methods will
turn into Enumerators when called without a block.

Well, in 1.9 most, but not all, Enumerable methods which take a block
will return an enumerator if no block is given.

Some, like all? and any?, take an optional block which overrides the
default behavior of acting as if an identity block had been given, so
that collection.all? returns true if none of the elements are nil or
false, and any? returns true if any element is not nil or false.
Another is grep, which uses a block to transform the matching elements
in the result. Other methods like this include sort, min and max. Some
of this is for compatibility with 1.8, for methods which already had
optional block arguments and so don’t return enumerators when the
block isn’t given.

And Ruby 1.9 even added some more methods which work in a similar
fashion, e.g. count, minmax, one? and none?

But, indeed, most, if not all, of the Enumerable methods which required
blocks in Ruby 1.8 now return enumerators if no block is given in 1.9.
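
The distinction can be illustrated with a short sketch. The sample data is made up, and the list of methods shown is not exhaustive.

```ruby
values = [1, nil, 3]

# Optional-block methods: without a block they test truthiness directly.
all_truthy = values.all?
any_truthy = values.any?
# With a block, the block's result is tested instead.
all_custom = values.all? { |v| v.nil? || v.odd? }

fruits = %w[apple banana cherry]
# grep filters by pattern; an optional block transforms each match.
matched     = fruits.grep(/an/)
transformed = fruits.grep(/an/) { |s| s.upcase }

# sort uses <=> by default; a block overrides the comparison.
ascending  = [3, 1, 2].sort
descending = [3, 1, 2].sort { |a, b| b <=> a }

# Methods whose block is essential return an Enumerator when it is missing.
blockless = [[1, 2, 3].map, [1, 2, 3].select, [1, 2, 3].each_slice(2)]
```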


Rick DeNatale


Hi –

On Sun, 6 Dec 2009, Raul J. wrote:

[1, 2, 3].collect {|x| x }

and end up with an array. So I guess that means that the default block
of {|x| x} was removed from the function. I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me. I really liked the elegance of opening
up a file and just calling collect to turn it into an array.

As others have said, there are better ways (mainly File.readlines). As
for map/collect in general: it’s hard to come up with a useful value
when it’s called without a block. I don’t know of any use cases, in
fact, no matter what it returns. There was, briefly, a time in 1.9
when you could do:

enum = array.enum_for(:map, &some_lambda)

and then run enumerable operations, pre-mapped so to speak, off of
enum. But you can’t do that any more, so I think map without a block
is essentially useless. (I’d be interested if anyone knows of any
remaining use cases.)

David

Where you have been using
File.open("some_file").collect # in 1.8

most Ruby programmers would probably have used:
File.readlines("some_file")

It is shorter (by 3 whole characters), does the same thing as your
code, and has the added benefit of not using collect in a quirky way.
Array#collect is for sending data into a block and getting the result
of the block back collected into an array.
When used without a block the usage is opaque. And though terse it is
hard to understand the implied {|x| x }.

Try File.readlines("some_file"). Try it and you may I say. Try it, try
it and you may…
Tim

On Dec 5, 9:40 am, Raul J. wrote:

[1, 2, 3].collect {|x| x }

and end up with an array. So I guess that means that the default block
of {|x| x} was removed from the function. I was just wondering if
anyone had any insight into why that change was made, because it feels
very counter intuitive to me. I really liked the elegance of opening
up a file and just calling collect to turn it into an array.

On Sun, Dec 6, 2009 at 2:34 PM, David A. Black wrote:

As others have said, there are better ways (mainly File.readlines). As
for map/collect in general: it’s hard to come up with a useful value
when it’s called without a block.

The use case I can see is to turn an Enumerable into an Array. Which
means that you should be using to_a instead, and checking,
File.open("foo").to_a does indeed do the right thing.

martin

Hi –

On Sun, 6 Dec 2009, Martin DeMello wrote:

On Sun, Dec 6, 2009 at 2:34 PM, David A. Black wrote:

As others have said, there are better ways (mainly File.readlines). As
for map/collect in general: it’s hard to come up with a useful value
when it’s called without a block.

The use case I can see is to turn an Enumerable into an Array. Which
means that you should be using to_a instead, and checking,
File.open("foo").to_a does indeed do the right thing.

It will work (using map instead of to_a in 1.8), but I don’t think
it’s a real use case, in the sense that it would never be the best
way. In other words, there’s a use case for what map without a block
does, but not for using map to do it.

I’m not sure there are any real uses for map returning an enumerator,
as in 1.9, either. You can do:

a = [1,2,3]
e = a.map
e.each {|x| x * 10 } # [10, 20, 30]

but that’s just a long way of writing map.

David

On Dec 6, 6:08 pm, Ken B. wrote:

when it’s called without a block.
I’m not sure there are any real uses for map returning an enumerator, as
["foo","bar","baz"].map.each_with_index{|val,idx| "#{idx} #{val}"}
=> ["0 foo","1 bar","2 baz"]


Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology. http://www.iit.edu/~kbloom1/

Not sure how that makes more sense than
irb(main):005:0> ["foo", "bar", "baz"].each_with_index.map {|val, idx| "#{idx} #{val}"}
=> ["0 foo", "1 bar", "2 baz"]

On Sun, 06 Dec 2009 22:09:53 +0900, David A. Black wrote:

in 1.9, either. You can do:

a = [1,2,3]
e = a.map
e.each {|x| x * 10 } # [10, 20, 30]

but that’s just a long way of writing map.

David

["foo","bar","baz"].map.each_with_index{|val,idx| "#{idx} #{val}"}
=> ["0 foo","1 bar","2 baz"]

2009/12/6 Martin DeMello:

On Sun, Dec 6, 2009 at 2:34 PM, David A. Black wrote:

As others have said, there are better ways (mainly File.readlines). As
for map/collect in general: it’s hard to come up with a useful value
when it’s called without a block.

The use case I can see is to turn an Enumerable into an Array. Which
means that you should be using to_a instead, and checking,
File.open("foo").to_a does indeed do the right thing.

Only that it misses closing the file properly. Rather do

File.readlines "foo"

which has been mentioned elsewhere in this thread.
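
The difference can be checked with a small sketch. The temporary file and all variable names here are assumptions made for the demonstration.

```ruby
require 'tempfile'

# Hypothetical sample file, created only for this demonstration.
sample = Tempfile.new('readlines_demo')
sample.write("one\ntwo\n")
sample.close
path = sample.path

# Without a block, File.open leaves the handle open until it is closed
# explicitly (or garbage collected).
leaked = File.open(path)
lines_from_open = leaked.to_a
still_open = !leaked.closed?
leaked.close

# The block form closes the file when the block returns.
block_file = nil
lines_from_block = File.open(path) { |f| block_file = f; f.to_a }
closed_by_block = block_file.closed?

# File.readlines opens, reads and closes in a single call.
lines_from_readlines = File.readlines(path)

sample.unlink
```

All three approaches return the same array of lines; only the first leaves an open handle behind.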

Cheers

robert

Hi –

On Mon, 7 Dec 2009, Ken B. wrote:

for map/collect in general: it’s hard to come up with a useful value

David

["foo","bar","baz"].map.each_with_index{|val,idx| "#{idx} #{val}"}
=> ["0 foo","1 bar","2 baz"]

Sort of an ironic example, in that I spent nine years lobbying for
map_with_index :) You can do it this way though:

array.map.with_index {|e,i| … }

with_index being one of the few methods that Enumerator has that
aren’t from Enumerable.
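
A brief sketch of with_index next to the each_with_index variants from earlier in the thread (the sample data and variable names are illustrative):

```ruby
words = %w[foo bar baz]

# map without a block returns an Enumerator, which responds to with_index.
via_with_index = words.map.with_index { |w, i| "#{i} #{w}" }

# The equivalent spelling that starts from each_with_index.
via_each_with_index = words.each_with_index.map { |w, i| "#{i} #{w}" }

# with_index also accepts a starting offset.
one_based = words.map.with_index(1) { |w, i| "#{i} #{w}" }

# with_index is defined on Enumerator, not on Enumerable.
on_enumerator = Enumerator.method_defined?(:with_index)
on_enumerable = Enumerable.method_defined?(:with_index)
```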

My main problem with all this is the fact that you can’t do this any
more:

enum = array.enum_for(:map, &some_lambda)

but I guess the mourning period for that should be officially declared
over :)

David