Each by arity

I’ve always wondered, why?

c = []
[1,2,3,4].each{ |x,y| c << [x, y] }
c

gives us

[ [1,nil], [2,nil], [3,nil], [4,nil] ]

Why not allow it to look at the arity of the block? And thus produce

[ [1,2], [3,4] ]

I don’t see how the former is ever of any use, but the latter certainly
is. Am I missing something obvious?

T.
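
For reference, a minimal sketch of the two behaviors being compared, using only core methods. The helper names `pairs_from_each` and `pairs_from_slices` are made up for illustration; `each_slice` is the existing way to get the grouped result.

```ruby
# Collect what a two-parameter block receives from plain #each.
def pairs_from_each(list)
  collected = []
  list.each { |x, y| collected << [x, y] }
  collected
end

# Collect what the same block receives when the grouping is explicit.
def pairs_from_slices(list)
  collected = []
  list.each_slice(2) { |x, y| collected << [x, y] }
  collected
end

p pairs_from_each([1, 2, 3, 4])   # [[1, nil], [2, nil], [3, nil], [4, nil]]
p pairs_from_slices([1, 2, 3, 4]) # [[1, 2], [3, 4]]
```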

Trans wrote:

why not allow it to look at the arity of the block? And thus produce

[ [1,2], [3,4] ]

I don’t see how the former is ever of any use, but the latter certainly
is. Am I missing something obvious?

What would it do with

[ [1,2], [3,4] ].each {|x,y| … }

On Sun, Jun 14, 2009 at 1:29 PM, Trans [email protected] wrote:

I don’t see how the former is ever of any use, but the latter certainly
is. Am I missing something obvious?

The main use I’ve seen is for iterating hashes, e.g.:

{:a => 1, :b => 2, :c => 3}.map { |k, v| "#{k}=#{v}" }.join(",")
=> "b=2,c=3,a=1"

Of course you could always massign against an incoming array, so this
case is also covered by:

{:a => 1, :b => 2, :c => 3}.map { |(k, v)| "#{k}=#{v}" }.join(",")
=> "b=2,c=3,a=1"

The massign approach is nice with methods where you want to explode an
array as arguments alongside other arguments:

{:a => 1, :b => 2, :c => 3}.inject(0) { |n, (k, v)| n + v }
=> 6

Personally I’m not a fan of this mode of arity handling and think
massign and splats cover all the cases where it’s useful.
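
As one more illustration of the parenthesized form, here is a sketch that assumes Ruby 1.9 or later, where `each_with_index` without a block returns an enumerator and hashes keep insertion order. `label_pairs` is a hypothetical name.

```ruby
# Each element yielded here is [[key, value], index]; the parentheses
# unpack the inner pair while the index stays a separate parameter.
def label_pairs(hash)
  hash.each_with_index.map { |(k, v), i| "#{i}:#{k}=#{v}" }
end

puts label_pairs({ :a => 1, :b => 2, :c => 3 }).join(",")
# prints 0:a=1,1:b=2,2:c=3
```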

On Jun 14, 3:32 pm, Tony A. [email protected] wrote:

This could be expanded out with:

[ [1,2], [3,4] ].each {|(a,b),(c,d)| … }

and

[ [1,2], [3,4] ].each {|(x,y)| … }

still provides the old behavior.

And by “the old behavior” I presume you mean two iterations:

x = 1
y = 2

x = 3
y = 4

And how is that going to happen based on inspecting arity? At least
with 1.8.6, Proc.new {|x, y| } and Proc.new {|(x, y)| } both have an
arity of 2.

each_slice is a lot clearer.

Yes. Yes, it is.
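
A quick way to check this on whatever interpreter is at hand. The figures quoted above are for 1.8.6; the comments below follow the `Proc#arity` documentation for current Ruby, where the parenthesized form counts as a single parameter. `block_arity` is a hypothetical helper.

```ruby
# Return the arity Ruby reports for the block it was given.
def block_arity(&block)
  block.arity
end

p(block_arity { |x, y| })    # 2
p(block_arity { |(x, y)| })  # 1 on current Ruby (the post above reports 2 on 1.8.6)
p(block_arity { |*rest| })   # -1
```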

On Sun, Jun 14, 2009 at 1:37 PM, Joel VanderWerf [email protected] wrote:

What would it do with

[ [1,2], [3,4] ].each {|x,y| … }

One iteration, with:

x = [1,2]
y = [3,4]

This could be expanded out with:

[ [1,2], [3,4] ].each {|(a,b),(c,d)| … }

and

[ [1,2], [3,4] ].each {|(x,y)| … }

still provides the old behavior.

This effectively provides the same behavior as each_slice, without the
need for a separate function, and having the “n” argument of each_slice
“implied” in the arity of the block.

These are the cool little bits of syntactic sugar I love to see in Ruby,
but I often find trying to do things with subtle little idiosyncrasies can
make code confusing.

each_slice is a lot clearer.
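
To make the proposal concrete, here is a rough sketch of slicing by block arity built on top of `each_slice`. `each_by_arity` is a hypothetical helper, not a core method, and it only sees the top-level parameter count, so it cannot tell `|x, y|` from a parenthesized pattern on interpreters where both report the same arity.

```ruby
# Hypothetical helper: group elements according to the block's arity.
def each_by_arity(enum, &block)
  # Blocks with no parameters or only a splat report 0 or a negative
  # arity; treat those as "one element at a time".
  size = [block.arity, 1].max
  enum.each_slice(size) { |slice| block.call(*slice) }
  enum
end

each_by_arity([1, 2, 3, 4]) { |x, y| p [x, y] }
# prints [1, 2] then [3, 4]
```

A trailing short group is padded with nil by the usual block rules, so three elements with a two-parameter block end with `[3, nil]`.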

On Sun, Jun 14, 2009 at 2:43 PM, Yossef M. [email protected] wrote:

And by “the old behavior” I presume you mean two iterations:

Correct

And how is that going to happen based on inspecting arity? At least
with 1.8.6, Proc.new {|x, y| } and Proc.new {|(x, y)| } both have an
arity of 2.

So it does. Unfortunate.

On Jun 14, 4:43 pm, Yossef M. [email protected] wrote:

One iteration, with:
[ [1,2], [3,4] ].each {|(x,y)| … }

And how is that going to happen based on inspecting arity? At least
with 1.8.6, Proc.new {|x, y| } and Proc.new {|(x, y)| } both have an
arity of 2.

The underlying system has to see the difference regardless. In fact
the current implementation has to do more, b/c it has to look at the
receiver itself and see that it contains arrays as elements in order
to know how to treat it, which is rather inefficient. (Also, I think
one could argue that either this arity is wrong, or the concept of
arity needs to be expanded with an added dimension.)

each_slice is a lot clearer.

Yes. Yes, it is.

Slice by arity seems pretty clear to me. Moreover, where is
#map_slice? It starts to look a lot clearer when we think of adding
“_slice” to every Enumerable method.

Go ahead and call out the Enumerator now.

T.

On Jun 14, 2009, at 9:22 PM, trans wrote:

What would it do with
[ [1,2], [3,4] ].each {|(a,b),(c,d)| … }
y = 2
receiver itself and see that it contains arrays as elements in order
to know how to treat it, which is rather inefficient. (Also, I think
one could argue that either this arity is wrong, or the concept of
arity needs to be expanded with an added dimension.)

At this point, it sounds more like we’re talking about OCaml style
pattern matching than proc arity. I, for one, would love to see Ruby
gain some sort of pattern matching to work with elements, but I fear
this would be difficult to implement on top of Ruby’s dynamism.

- Josh

On Sun, Jun 14, 2009 at 11:22 PM, trans [email protected] wrote:

On Jun 14, 4:43 pm, Yossef M. [email protected] wrote:

And how is that going to happen based on inspecting arity? At least
with 1.8.6, Proc.new {|x, y| } and Proc.new {|(x, y)| } both have an
arity of 2.

The underlying system has to see the difference regardless.

This is true. I was just pointing out that it’s not arity that stores
the difference, and that makes this at least a little more difficult.

(Also, I think one could argue that either this arity is wrong, or the
concept of arity needs to be expanded with an added dimension.)

You can argue a lot of things, and I’m sure you want to. Personally, I
have yet to find the Ruby concept of block arity lacking, but then again I
don’t try to do anything clever with it.

each_slice is a lot clearer.

Yes. Yes, it is.

Slice by arity seems pretty clear to me. Moreover, where is
#map_slice? It starts to look a lot clearer when we think of adding
“_slice” to every Enumerable method.

Why is map_slice needed? What’s the problem with calling each_slice and
then map on the result? Is it a case of optimization? Are you worried about
the resources you’ll be using? If so, it’s quite possible to use each_slice
to iterate over one collection while appending results to an array. But
maybe it’s not functional enough for you. Or core enough.

Go ahead and call out the Enumerator now.

I have no trouble with requiring ‘enumerator’ and explicitly using
exactly the methods I want. I’d rather be clear than clever.

On 15.06.2009 20:17, Yossef M. wrote:

On Sun, Jun 14, 2009 at 11:22 PM, trans [email protected] wrote:

Go ahead and call out the Enumerator now.

I have no trouble with requiring ‘enumerator’ and explicitly using exactly
the methods I want. I’d rather be clear than clever.

Question is whether it is clever to duplicate all methods with another
one that has “_slice” appended to the name. I personally favor the
approach with Enumerator, e.g. depending on version

enum.enum_for(:each_slice, 2).map {|a,b| …}
enum.each_slice(2).map {|a,b| …}

... especially since it works just as easily for #each_cons and others. In other
words: this is much more modular.

Kind regards

robert
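
Both spellings side by side, as a sketch. The method names are hypothetical; the second form relies on a blockless `each_slice` returning an enumerator, which is the 1.9 behavior referred to above.

```ruby
# Explicit enumerator, the form available wherever enum_for exists.
def sums_via_enum_for(enum)
  enum.enum_for(:each_slice, 2).map { |a, b| a + b }
end

# Blockless each_slice hands back an equivalent enumerator.
def sums_via_each_slice(enum)
  enum.each_slice(2).map { |a, b| a + b }
end

# The same chaining works for each_cons.
def gaps_via_each_cons(enum)
  enum.each_cons(2).map { |a, b| b - a }
end

p sums_via_enum_for([1, 2, 3, 4])   # [3, 7]
p sums_via_each_slice([1, 2, 3, 4]) # [3, 7]
p gaps_via_each_cons([1, 4, 9, 16]) # [3, 5, 7]
```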

On 15.06.2009 07:44, Joshua B. wrote:

What would it do with

the current implementation has to do more, b/c it has to look at the
receiver itself and see that it contains arrays as elements in order
to know how to treat it, which is rather inefficient. (Also, I think
one could argue that either this arity is wrong, or the concept of
arity needs to be expanded with an added dimension.)

But the problem is that we have two steps here:

  1. invocation of yield
  2. distributing arguments across block parameters

If I followed the thread properly then step 1 would have to be
influenced by knowledge about the block while currently only step 2
does. The code invoking yield needs to be in charge yet at the moment
it only has arity as information. I believe more information about the
block needs to be provided otherwise this cannot work because #each
needs to know this or else it cannot decide how many elements from the
current Enumerable must be handed off to the block.
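
The two steps can be seen separately in a small sketch. The method names are hypothetical, and the comments describe Ruby 1.9 and later, where a non-lambda block drops extra arguments and unpacks a single yielded array when it declares more than one parameter.

```ruby
# Step 1 lives in the method: it alone decides what gets yielded.
def yield_one_array
  yield [1, 2]
end

def yield_two_values
  yield 1, 2
end

# Step 2 lives in the block: it spreads whatever arrived over its parameters.
p(yield_one_array  { |a| a })          # [1, 2]
p(yield_one_array  { |a, b| [a, b] })  # [1, 2], the array was unpacked
p(yield_two_values { |a| a })          # 1, the second value was dropped
p(yield_two_values { |a, b| [a, b] })  # [1, 2]
```

Nothing in step 1 consults the block here, which is the gap a by-arity `each` would have to close.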

At this point, it sounds more like we’re talking about OCaml style
pattern matching than proc arity. I, for one, would love to see Ruby
gain some sort of pattern matching to work with elements, but I fear
this would be difficult to implement on top of Ruby’s dynamism.

1.9 is actually moving into that direction:

irb(main):001:0> f = lambda {|a,*b,c| p a,b,c}
=> #<Proc:0x100d08bc@(irb):1 (lambda)>
irb(main):002:0> f[1,2]
1
[]
2
=> [1, [], 2]
irb(main):003:0> f[1,2,3]
1
[2]
3
=> [1, [2], 3]
irb(main):004:0> f[1,2,3,4]
1
[2, 3]
4
=> [1, [2, 3], 4]
irb(main):005:0>

And even in 1.8 you had some level of pattern matching, e.g.

irb(main):001:0> h={1=>2,3=>4}
=> {1=>2, 3=>4}
irb(main):002:0> h.inject(0) {|s,(k,v)| s + k + v}
=> 10

Kind regards

robert
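
The parenthesized patterns also nest and accept splats, as this sketch for 1.9 or later shows. `flatten_rows` is a hypothetical name.

```ruby
# Each row looks like [id, [first, *others]]; the block parameters
# mirror that shape.
def flatten_rows(rows)
  rows.map { |id, (first, *others)| [id, first, others.length] }
end

p flatten_rows([[1, [:a, :b, :c]], [2, [:d]]])
# [[1, :a, 2], [2, :d, 0]]
```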

On Jun 16, 2:10 am, Robert K. [email protected] wrote:

Question is whether it is clever to duplicate all methods with another
one that has “_slice” appended to the name. I personally favor the
approach with Enumerator, e.g. depending on version

Not clever… facetious. Of course Enumerator is useful, and with 1.9
fairly elegant. But rather, the question is why we must do

enum.each_slice(2).map {|a,b| …}

when

enum.map {|a,b| …}

could be made to work. This is especially interesting b/c it seems the
underlying implementation stands to be simpler and faster by doing so.

T.

On Jun 16, 7:43 pm, Robert D. [email protected] wrote:

just right to write it. Strange, strange.

Simply b/c cons is uncommon. We want to iterate over elements once and
only once, whereas cons hits on the same elements multiple times. Of
course, I see no reason cons could not look at arity too.

With this idea, I’ve been wondering about possibilities for passing
blocks that might have one, two or more block arguments (the arity
unknown to the method processing them), and how that might or might
not be useful. One obvious issue is that arity > 1 produces Array
elements, whereas arity of 1 would not necessarily.

Now I am not so fond of enumerator, because it is really crazy.

x=%w{a b c d}
y = x.enum_for( :rand )
y.each do | a | p a end
=> 0.680141319345242

Do you agree with me that enum_for should use #public_send instead of #send?

Oh that’s nice. I did not think you could use it for anything but
Enumerable methods. I’m not sure it matters if they are public or not.
It still can be crazy.

y = x.enum_for(:to_s)
=> #<Enumerable::Enumerator:0x7f64cd403c68>

y.each{ |a| a }
=> "abcd"
y.each{ |a| a+'!' }
=> "abcd"
y.to_a
=> []

T.

On Tue, Jun 16, 2009 at 12:33 PM, trans [email protected] wrote:

enum.each_slice(2).map {|a,b| …}

when

enum.map {|a,b| …}

could be made to work. This is especially interesting b/c it seems the
underlying implementation stands to be simpler and faster by doing so.

That is intriguing and I am surprised.

However, why should

enum.map{ |a,b| }

have the semantics of

enum.each_slice( 2 )…

and not

enum.each_cons( 2 )… ?
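
For comparison, the two candidate semantics differ as follows. This is a sketch; `grouped_both_ways` is a hypothetical helper, and the blockless calls assume Ruby 1.9 or later.

```ruby
# Return the groups each method would hand to a two-parameter block.
def grouped_both_ways(enum)
  { :slice => enum.each_slice(2).to_a, :cons => enum.each_cons(2).to_a }
end

result = grouped_both_ways([1, 2, 3, 4])
p result[:slice] # [[1, 2], [3, 4]]
p result[:cons]  # [[1, 2], [2, 3], [3, 4]]
```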

Maybe we need to push the abstraction even higher. Hopefully you
understand what that sentence means, because I do not, but it felt
just right to write it. Strange, strange.

Now I am not so fond of enumerator, because it is really crazy.

x=%w{a b c d}
y = x.enum_for( :rand )
y.each do | a | p a end
=> 0.680141319345242

Do you agree with me that enum_for should use #public_send instead of
#send?

Cheers
Robert


Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]

On Jun 17, 6:22 am, Robert D. [email protected] wrote:

underlying implementation stands to be simpler and faster by doing so.

coll.cons.each{ |a,b| …
coll.slice.each { |a,b|…

Hmm… I’ll point out that Array#slice is already a method.

I like this because I am used to “waiting” in proxy objects for what
they are really to be used later, but it might be troubling for
others.
Do you think it is a good idea to implement it in Facets first and ask
for inclusion into the core after having some more feedback from
users?

Perhaps, but I must think about it some more. Presently one would do
(in 1.9):

coll.each_slice(n){ … }
coll.each_cons(n){ … }

If one wanted to map over that:

coll.map.each_slice(n){ … }
coll.map.each_cons(n){ … }

correct?

So how do we best achieve “by arity” without stepping on present toes?
It would be nice to have a coherent idea about it as well. It would be
interesting to look at this kind of thing in a variety of languages
actually. OCaml for instance has already been mentioned. I’ll have to
look that up.

T.

On Wed, Jun 17, 2009 at 6:49 AM, trans [email protected] wrote:

underlying implementation stands to be simpler and faster by doing so.

Maybe we need to push the abstraction even higher. Hopefully you
understand what that sentence means, because I do not, but it felt
just right to write it. Strange, strange.

Simply b/c cons is uncommon. We want to iterate over elements once and
only once, whereas cons hits on the same elements multiple times. Of
course, I see no reason cons could not look at arity too.

Maybe we should leave map alone and use the enumerable approach

coll.cons.each{ |a,b| …
coll.slice.each { |a,b|…

I like this because I am used to “waiting” in proxy objects for what
they are really to be used later, but it might be troubling for
others.
Do you think it is a good idea to implement it in Facets first and ask
for inclusion into the core after having some more feedback from
users?

y = x.enum_for( :rand )
y.each do | a | p a end
=> 0.680141319345242

Do you agree with me that enum_for should use #public_send instead of #send?

Oh that’s nice. I did not think you could use it for anything but
Enumerable methods. I’m not sure it matters if they are public or not.
It still can be crazy.

Yeah, as long as it just sends the symbol we cannot avoid that, but I
guess this is the price to pay for having it flexible, or maybe not? I
am not an expert on Enumerator internals. But I definitely want to
enumerate over my custom iterators.

Cheers
R.
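
Enumerating over a custom iterator is the case where that flexibility pays off. A sketch with a made-up `Countdown` class and `each_step` method:

```ruby
class Countdown
  def initialize(from)
    @from = from
  end

  # A custom iterator; without a block it returns an enumerator over itself.
  def each_step
    return enum_for(:each_step) unless block_given?
    @from.downto(1) { |n| yield n }
  end
end

p Countdown.new(3).each_step.map { |n| n * 2 } # [6, 4, 2]
```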


On Fri, Jun 19, 2009 at 8:35 AM, Robert Klemme [email protected] wrote:

2009/6/19 trans [email protected]:

Perhaps, but I must think about it some more. Presently one would do
(in 1.9):

coll.each_slice(n){ … }
coll.each_cons(n){ … }

No, it would be

coll.each_cons # this is the proxy; it does not know about the slice size yet
.map{ |a,b| … } # the proxy will look at the arity.

coll.each_slice(n).map{ … }
coll.each_cons(n).map{ … }

So how do we best achieve “by arity” without stepping on present toes?

I would simply stick with the existing behavior. :)

I will not let the absence of use cases ruin my work ;) Now I have
always liked Tom’s reach for perfection, but on a practical basis I
agree with Robert: it is probably not worth it.

But on a theoretical ground I find it intriguing that the “proxy”
would do something similar to what an enumerator does.
An enumerator prepares a view of an enumerable for an actual method
delivering the behavior in a block.
One could say it “waits” for a message with a block.
The proxy will wait for a message too, and a block too, and will just
create the Enumerator on the fly with the information of the arity.
I even wonder if Enumerator could be the proxy with a little monkey
patch? And that might as well lead to a “default” behavior for map,
without the “proxy method” which would meet Tom’s original requirement
for Enumerators only

Thus [1,2,3].map{|a,b| [a,b]} would still be [[1, nil]…]
but
[1,2,3].to_enum.map{|a,b| [a,b]} would be [[1,2],[3,nil]]

Cheers
Robert

2009/6/19 trans [email protected]:

correct?

For mapping in 1.9 I would do

coll.each_slice(n).map{ … }
coll.each_cons(n).map{ … }

So how do we best achieve “by arity” without stepping on present toes?

I would simply stick with the existing behavior. :)

Kind regards

robert

On Jun 19, 4:57 am, Robert D. [email protected] wrote:

coll.map.each_slice(n){ … }

I would simply stick with the existing behavior. :)

I will not let the absence of use cases ruin my work ;) Now I have
always liked Tom’s reach for perfection, but on a practical basis I
agree with Robert: it is probably not worth it.

:) Robert… someone understands me!

for Enumerators only

Thus [1,2,3].map{|a,b| [a,b]} would still be [[1, nil]…]
but
[1,2,3].to_enum.map{|a,b| [a,b]} would be [[1,2],[3,nil]]

Though I think that would be kind of confusing, if not problematic.

As others have pointed out there is no way to query the parenthetical
patterns, eg. |(a,b)| so this will be somewhat more limited but I was
thinking of something like:

[1,2,3].each.by_arity{ |a,b| … }

or conversely

[1,2,3].by_arity.each{ |a,b| … }

But we’d still need a way to do #each_cons by arity. To do this I
think there would need to be an equivalence. #each_slice is to #each
as #each_cons is to ___ ? So we define a method #cons that normally
means each_cons(1). Then…

[1,2,3].cons.by_arity{ |a,b| … }

or conversely

[1,2,3].by_arity.cons{ |a,b| … }

Though honestly this makes me wonder why #each just doesn’t take an
optional slice number. Why have a completely separate method for
the special case of n=1, when an optional argument would do? Indeed
Facets has #each_by that does this.

P.S. I am half tempted to alias #each_by as #air, as in “air your
arguments”, and then name the cons version #conair ;)

T.
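
One way the `by_arity` idea might look as a proxy object; a rough sketch only. `ByArity` and its constructor arguments are invented for illustration, only `each` and `map` are handled, and as noted above it sees just the top-level parameter count, not parenthesized patterns.

```ruby
# Hypothetical proxy: waits for a block, then groups by that block's arity.
class ByArity
  def initialize(source, grouper = :each_slice)
    @source = source
    @grouper = grouper # :each_slice or :each_cons
  end

  def each(&block)
    groups_for(block).each { |group| block.call(*group) }
    @source
  end

  def map(&block)
    groups_for(block).map { |group| block.call(*group) }
  end

  private

  # Build the enumerator only once the block, and so the arity, is known.
  def groups_for(block)
    size = [block.arity, 1].max
    @source.public_send(@grouper, size)
  end
end

p ByArity.new([1, 2, 3]).map { |a, b| [a, b] }             # [[1, 2], [3, nil]]
p ByArity.new([1, 2, 3], :each_cons).map { |a, b| [a, b] } # [[1, 2], [2, 3]]
```

Passing the grouping method as a symbol keeps the slice and cons variants in one class, which sidesteps the naming question for a separate #cons method.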

2009/6/19 Robert D. [email protected]:

But on a theoretical ground I find it intriguing that the “proxy”
would do something similar to what an enumerator does.
An enumerator prepares a view of an enumerable for an actual method
delivering the behavior in a block.
One could say it “waits” for a message with a block.
The proxy will wait for a message too, and a block too, and will just
create the Enumerator on the fly with the information of the arity.
I even wonder if Enumerator could be the proxy with a little monkey
patch?

???

irb(main):002:0> %w{foo bar}.each_cons 2
=> #<Enumerator:0x1016a55c>

And that might as well lead to a “default” behavior for map,
without the “proxy method” which would meet Tom’s original requirement
for Enumerators only

Are you mixing up Enumerable and Enumerator?

Kind regards

robert