Proposal/RFQ: Hash#values/keys with block

Dirk_T · January 29, 2008, 2:11pm

Hi!

Hash#values and Hash#keys return the values/keys of a hash as an array.
Wouldn’t it be nice if you could just add a block to specify which
values/keys you want returned?

hash = { 'a' => 1, 'bb' => 2, 'c' => 33, 'dd' => 44 }

hash.keys #=> ["a", "bb", "c", "dd"]
hash.values #=> [1, 2, 33, 44]

hash.values {|k,v| k > 'a' } #=> [2, 33, 44]
hash.values {|k,v| v % 2 == 0 } #=> [2, 44]
hash.keys {|k,v| k.size == 1 } #=> ["a", "c"]

Alternatively, they could be implemented as new methods, called for
example
hash.values_if {}
hash.keys_if {}
analogous to the already defined Hash#values_at.
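
A minimal sketch of how this alternative could look as a user-level extension. The names keys_if and values_if are taken from the proposal; the bodies are only an assumption about the intended behaviour and are not part of Ruby's core library.

```ruby
# Hypothetical helpers named after the proposal; not part of Ruby's core.
class Hash
  # Returns the keys of all entries for which the block returns true.
  def keys_if
    result = []
    each { |k, v| result << k if yield(k, v) }
    result
  end

  # Returns the values of all entries for which the block returns true.
  def values_if
    result = []
    each { |k, v| result << v if yield(k, v) }
    result
  end
end

hash = { 'a' => 1, 'bb' => 2, 'c' => 33, 'dd' => 44 }

p hash.values_if { |k, v| k > 'a' }     #=> [2, 33, 44]
p hash.values_if { |k, v| v % 2 == 0 }  #=> [2, 44]
p hash.keys_if   { |k, v| k.size == 1 } #=> ["a", "c"]
```

Each helper walks the hash once and builds a single result array, which is the behaviour the proposal asks of Hash#keys/values with a block.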

Well, I would prefer the extension of Hash#keys/values with blocks.
What do you think?

Dirk

Dirk_T · January 29, 2008, 2:32pm

On Jan 29, 2008 2:10 PM, Dirk T. [email protected] wrote:

hash = { 'a' => 1, 'bb' => 2, 'c' => 33, 'dd' => 44 }

hash.keys #=> ["a", "bb", "c", "dd"]
hash.values #=> [1, 2, 33, 44]

hash.values {|k,v| k > 'a' } #=> [2, 33, 44]

I am sure you could use external iterators for that and extend your Hash
class.
(I’ll try it later)

hash.values {|k,v| v % 2 == 0 } #=> [2, 44]

This one goes easily with a select call:
hash.values.select {|el| el % 2 == 0 } #=> [2, 44]

hash.keys {|k,v| k.size == 1 } #=> ["a", "c"]

Same here:
hash.keys.select {|el| el.size == 1 }

Alternatively, they could be implemented as new methods, called for
example
hash.values_if {}
hash.keys_if {}
analogous to the already defined Hash#values_at.

Well, I would prefer the extension of Hash#keys/values with blocks.
What do you think?

I think Array#select does a fine job. You can still write your own
helper methods to implement the behaviour which you need.

Dirk_T · January 29, 2008, 2:51pm

2008/1/29, Thomas W. [email protected]:

On Jan 29, 2008 2:10 PM, Dirk T. [email protected] wrote:

hash.keys {|k,v| k.size == 1 } #=> ["a", "c"]

Same here:
hash.keys.select {|el| el.size == 1 }

Additional note: If selection needs to be done on key and value you can
do this:

hash.select {|k,v| k.size == 1 }.map {|k,v| k}

or

irb(main):006:0> hash.inject([]) {|ks,(k,v)| ks << k if k.size == 1; ks}
=> ["a", "c"]

Well, I would prefer the extension of Hash#keys/values with blocks.
What do you think?

I am waiting to see whether this is generally needed. As far as I
remember this has not been suggested before. If there is no
overwhelming request for this I opt for not making this standard.

Kind regards

robert

Dirk_T · January 29, 2008, 6:30pm

Am 29 Jan 2008 um 22:50 hat Robert K. geschrieben:

I am waiting to see whether this is generally needed. As far as I
remember this has not been suggested before. If there is no
overwhelming request for this I opt for not making this standard.

Why make it standard?

No regression
Until now, a block after Hash#keys/values is just ignored.
So, there would be no regression for old code.
No clutter
As it is just an additional behaviour of existing methods, there would
be no new methods cluttering the standard library.
Speed
I think that this will not slow down ruby as a whole, but I’m sure that

hash.values {|k,v| v % 2 == 0 }

would be definitely faster than

hash.select {|k,v| v % 2 == 0 }.map {|k,v| v}

especially with big hashes, because you build only one array directly,
not two, and don’t have to call map additionally.
And let us not begin about inject and speed…

Rubyness
This is difficult to describe, but one point I really, really like
about Ruby is blocks. There are a lot of methods in the standard
library which use blocks to ‘finetune’ the method.
Use them pure and you get all of it.
Or use them with a block and you get a subset or a modified result.
I got so used to it, that I first wrote Hash#keys with a block without
looking in ri before.
I think this would fit perfectly in, as it is, at least for me, typical
Ruby syntax.
Beauty
I really like inject. (Though maybe not as much as you do, Robert. )
But if I compare

hash.values {|k,v| k.size == 1 }

with

hash.select {|k,v| k.size == 1 }.map {|k,v| v}
hash.inject([]) {|ks,(k,v)| ks << v if k.size == 1; ks}

then I think it is obvious, why I really prefer the first variant.
It is direct, short and clear. Fewer characters, but more obvious.
That’s what I love about Ruby

=======

Robert, don’t you think it would fit nicely in the standard lib?

What do the others think?
Please tell me your opinion and give your +1/-1 to the extension of
Hash#keys/values in the standard lib!

Dirk

Dirk_T · January 29, 2008, 8:21pm

On 29.01.2008 18:29, Dirk T. wrote:

So, there would be no regression for old code.
True.
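
This is easy to check. A short sketch (the variable names are made up for illustration) showing that the stock Hash#keys never calls a block passed to it:

```ruby
hash = { 'a' => 1, 'bb' => 2 }

# If Hash#keys called the block, the flag would flip to true.
called = false
keys = hash.keys { |k, v| called = true; false }

p keys   #=> ["a", "bb"]
p called #=> false
```

So existing code that happens to pass a block gets all keys back today, and would only behave differently if the block were ever given a meaning.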

No clutter
As it is just an additional behaviour of existing methods, there would
be no new methods cluttering the standard library.

That’s true. However, there are also other things to consider: often it
is more flexible to spread functionality across different methods. So
having fewer methods is not a value per se.

not two and don’t have to call map additionally.
You would be surprised to see how fast seemingly more complex bits of
code often are. I myself have been surprised in the past.

And let us not begin about inject and speed…

See? #inject can often be used to avoid multiple traversals but yet
often a combination with #map is faster.

Rubyness
This is difficult to describe, but one point I really, really like
about Ruby is blocks. There are a lot of methods in the standard
library which use blocks to ‘finetune’ the method.
Use them pure and you get all of it.
Or use them with a block and you get a subset or a modified result.

I am not sure I agree here. If you take #each or #select, these do not
make any sense without a block so the block is certainly not fine tuning
the method. What examples do you have in mind?

My soft point here is, that it somehow feels odd to combine two
completely orthogonal things (key extraction and entry selection) in a
single method to me.

I got so used to it, that I first wrote Hash#keys with a block without
looking in ri before.

This is interesting: it never occurred to me to use #keys with a block.
Off the top of my head I cannot remember a situation where I needed
selection and key extraction at the same time (which does not mean I
never did it).

I think this would fit perfectly in, as it is, at least for me, typical
Ruby syntax.

Beauty
I really like inject. (Though maybe not as much as you do, Robert. )

LOL

It is direct, short and clear. Less characters, but more obvious.
I find the #select / #map solution more obvious. If I was writing the
functionality of #keys with a block, I’d probably name the method
#select_keys.

That’s what I love about Ruby

=======

Robert, don’t you think it would fit nicely in the standard lib?

I would not say that it does not fit but I am not totally convinced. I
am trying to employ public debate to find out the best solution.

What do the others think?
Please tell me your opinion and give your +1/-1 to the extension of
Hash#keys/values in the standard lib!

Dirk, I am glad that once again there is an interesting discussion.
Thanks for that!

Kind regards

robert

Dirk_T · January 30, 2008, 2:09pm

Am 30 Jan 2008 um 4:20 hat Robert K. geschrieben:

On 29.01.2008 18:29, Dirk T. wrote:

Am 29 Jan 2008 um 22:50 hat Robert K. geschrieben:

I am waiting to see whether this is generally needed. As far as
I remember this has not been suggested before. If there is no
overwhelming request for this I opt for not making this standard.

Why make it standard?

(…)

directly, not two, and don’t have to call map additionally.

You would be surprised to see how fast seemingly more complex bits of
code often are. I myself have been surprised in the past.

And let us not begin about inject and speed…

See? #inject can often be used to avoid multiple traversals but yet
often a combination with #map is faster.

That’s true. So this has to be proven by implementation.
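
One way to get a first impression without touching hash.c is a rough benchmark in plain Ruby. This is only a sketch: the data set, the iteration count and the lambda names are made up, and a single-pass loop written in Ruby is merely a stand-in for what a C implementation of the proposal might do.

```ruby
require 'benchmark'

# Hypothetical test data: 100_000 integer keys mapped to integer values.
hash = {}
100_000.times { |i| hash[i] = i }

# Two passes: select the matching entries, then map them to their values.
two_pass = lambda { hash.select { |k, v| v % 2 == 0 }.map { |k, v| v } }

# One pass: build the result array directly while iterating.
one_pass = lambda do
  result = []
  hash.each { |k, v| result << v if v % 2 == 0 }
  result
end

Benchmark.bm(10) do |x|
  x.report('select+map') { 20.times { two_pass.call } }
  x.report('each')       { 20.times { one_pass.call } }
end
```

The timings depend on the Ruby version and the machine, so no numbers are given here; both variants return the same array.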

tuning the method. What examples do you have in mind?
Below I put a few examples. I’m sure if I would really dive into the
standard lib, there might be better ones, but these came to my mind.

h1 = { "a" => 20, "b" => 10 }
h2 = { "b" => 25, "c" => 30 }

# Enumerable#grep
p h1.grep(Array) #=> [["a", 20], ["b", 10]]
p h1.grep(Array) {|(k,v)| v } #=> [20, 10]

# Hash#merge (the block must return the value to keep for a duplicate key)
p h1.merge(h2) #=> {"a"=>20, "b"=>25, "c"=>30}
p h1.merge(h2) {|k,ov,nv| ov < nv ? ov : nv } #=> {"a"=>20, "b"=>10, "c"=>30}

# Hash#fetch
p h1.fetch("z") rescue p $! #=> #<IndexError: key not found>
p h1.fetch("z") {|k| 'wrong key'} #=> "wrong key"

# Hash#sort
p h1.sort #=> [["a", 20], ["b", 10]]
p h1.sort {|(k,v),(k2,v2)| v <=> v2} #=> [["b", 10], ["a", 20]]

My soft point here is, that it somehow feels odd to combine two
completely orthogonal things (key extraction and entry selection) in a
single method to me.

It does not feel orthogonal for me. This is a combination, true, but I
would not call it orthogonal.
In real life most tasks are combinatory.
Ruby has a lot of constructs which mimic human thinking and use this
for a more intuitive access to programming.
Intuitively for me
hash.keys {|k,v| v > 200 }
was the same as
hash.select {|k,v| v > 200 }
Just the returned values differ.
Hash#select gives you the chosen key-value pairs in an array and
Hash#keys/values with a block only the chosen keys or values.
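
A note for readers on newer Rubies: since Ruby 1.9, Hash#select returns a Hash rather than an array of pairs, so the selection can be chained with the existing #keys and #values. A small sketch with made-up data:

```ruby
# Made-up example data.
hash = { 'x' => 100, 'y' => 250, 'z' => 300 }

selected = hash.select { |k, v| v > 200 }

p selected.class  #=> Hash
p selected.keys   #=> ["y", "z"]
p selected.values #=> [250, 300]
```

This still builds an intermediate Hash, so it does not answer the speed argument, but it reads close to the proposed hash.keys {|k,v| v > 200 }.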

(…)

Robert, don’t you think it would fit nicely in the standard lib?

I would not say that it does not fit but I am not totally convinced.
I am trying to employ public debate to find out the best solution.

Well, it seems it is more of a private debate. I’m a little
disappointed by this lack of interest. Do you think I should have
posted it to ruby-core?

Dirk, I am glad that once again there is an interesting discussion.
Thanks for that!

Thanks a lot for your comments.

Kind regards
Dirk

Dirk_T · January 30, 2008, 4:13pm

2008/1/30, Dirk T. [email protected]:

Am 30 Jan 2008 um 4:20 hat Robert K. geschrieben:

I am not sure I agree here. If you take #each or #select, these do
not make any sense without a block so the block is certainly not fine
tuning the method. What examples do you have in mind?

Below I put a few examples. I’m sure if I would really dive into the
standard lib, there might be better ones, but these came to my mind.

Nice list. Yes, indeed, I see what you mean.

My soft point here is, that it somehow feels odd to combine two
completely orthogonal things (key extraction and entry selection) in a
single method to me.

It does not feel orthogonal for me. This is a combination, true, but I
would not call it orthogonal.

Why not? IMHO the term fits the concepts involved well:

“The term is used loosely to mean mutually independent or well
separated.”

“Mutually independent; well separated;”

In real life most tasks are combinatory.

Yeah, but in software engineering the cost of combining different
tasks is decreased modularity. I do not say that it should not be
done - often there are good reasons, but I have seen more often than
not things lumped together despite the lack of good reasons.

Robert, don’t you think it would fit nicely in the standard lib?

I would not say that it does not fit but I am not totally convinced.
I am trying to employ public debate to find out the best solution.

Well, it seems it is more of a private debate.

Right you are.

I’m a little
disappointed by this lack of interest. Do you think I should have
posted it to ruby-core?

No, I think the place is perfectly OK: although we are talking
about a core lib change, the motivation for it is usage. So first those
who would be using the new idiom should have their say, IMHO.

There was a time when we had more people chiming into
discussions like this. Maybe people turned away because they found
the content of the list has changed in a way that it makes the list
less valuable or interesting for them. Or they have a hard time
finding those gems under a pile of other threads.

Dirk, I am glad that once again there is an interesting discussion.
Thanks for that!

Thanks a lot for your comments.

You’re welcome!

Kind regards

robert

Dirk_T · January 30, 2008, 8:25pm

Am 30 Jan 2008 um 23:20 hat tho_mica_l geschrieben:

Intuitively for me
hash.keys {|k,v| v > 200 }

#keys should return the hash’s keys, IMHO it shouldn’t iterate over
these keys.

How could you return the keys without iterating over them?

If you want to reject certain keys, call reject on these keys
as returned. The phrase “reject certain keys” put into the right
order would be in ruby lingo: hash.keys.reject {|k| hash[k] > 200}

Yes and there are some other possibilities to achieve the same.
But this is not the point.

If you think these methods are valuable, Enumerable is an open
module and you can add new methods at any time.

I know that! How can one program in Ruby and not know that classes
and modules are open?

This proposal is about the extension of two methods that are already in
the standard lib (in Hash, not Enumerable).
I’m arguing in favour of this extension of these Hash methods because
for me, it would be the Ruby way to solve for example the following
problem:

“Give me the keys of the hash with a value over 200.”

hash.keys {|k,v| v > 200 }

Sure, this can be solved in several other ways, but I think this would
be the most direct and intuitive way. For me, the Ruby way.
Therefore, and because of the other reasons I gave in another mail in
this thread, I think this would be a valuable addition to the standard
lib.

Please, everybody post your opinions. I would especially be happy if
somebody would express some support to my position…

Dirk

Dirk_T · January 30, 2008, 3:21pm

Intuitively for me
hash.keys {|k,v| v > 200 }

#keys should return the hash’s keys, IMHO it shouldn’t iterate over
these keys. If you want to reject certain keys, call reject on these
keys as returned. The phrase “reject certain keys” put into the right
order would be in ruby lingo: hash.keys.reject {|k| hash[k] > 200}

If you think these methods are valuable, Enumerable is an open module
and you can add new methods at any time.

Thomas.

Dirk_T · January 30, 2008, 9:30pm

“Give me the keys of the hash with a value over 200.”

hash.keys {|k,v| v > 200 }

Well, from my POV this is rather a special case of a more general method
somewhere between inject and find_all that IMHO is missing somehow, i.e.
a find_all that allows users to define the value to be stored - so it’s
actually inject()?
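
Later Ruby versions did grow something close to this: Enumerable#filter_map (Ruby 2.7 and newer) collects whatever truthy value the block returns and skips the rest. A sketch with made-up data:

```ruby
# Made-up example data.
hash = { 'x' => 100, 'y' => 250, 'z' => 300 }

# The block decides both whether an entry is kept and what is stored.
keys   = hash.filter_map { |k, v| k if v > 200 }
values = hash.filter_map { |k, v| v if v > 200 }

p keys   #=> ["y", "z"]
p values #=> [250, 300]
```

One caveat: filter_map drops nil and false results, so it cannot collect entries whose stored value is itself nil or false.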

Sure, this can be solved in several other ways, but I think this would
be the most direct and intuitive way.

Apart from the fact that my opinion on this doesn’t matter, I’m slightly
concerned about a core/stdlib coded in C growing too large/complex.

Please, everybody post your opinions. I would especially be happy if
somebody would express some support to my position…

After all, you only have to convince one person, since there is no Ruby
committee that I know of.

How could you return the keys without iterating over them?

Uhm … well … BTW the code you’re looking for is in hash.c:

static VALUE
rb_hash_keys(VALUE hash)
{
    VALUE ary;

    ary = rb_ary_new();
    rb_hash_foreach(hash, keys_i, ary);

    return ary;
}

Regards,
Thomas.