Hash.keys and hash.values

Mage · August 15, 2006, 2:19pm

Hello,

I know the order of the keys of a hash is arbitrary. However, do
hash.keys and hash.values have the same order? Will they be consistent with
each other?

I found nothing in the manual about this. The “Agile Web Developing”
book uses the form below:

Users.update(params[:user].keys,params[:user].values)

So, I think, the order of the keys and values should be the same. If it is,
then it should be documented, too.

If I missed something, excuse me.

   Mage

Mage · August 15, 2006, 2:19pm

Mage wrote:

So, I think, the order of the keys and values should be the same. If it is,
then it should be documented, too.

If I missed something, excuse me.

They’ll certainly be the same order. It would be crazy otherwise.

This perhaps should be mentioned somewhere, but maybe people
take it for granted.

Hal

Mage · August 15, 2006, 2:19pm

On Mon, Aug 14, 2006 at 07:31:05AM +0900, Hal F. wrote:

They’ll certainly be the same order. It would be crazy otherwise.

Not necessarily. It could be that, when called separately,
they are accessed in a different order, and only when using one to
reference the other do their associations come to light. It’s
conceivable that a time might come when someone decides, for
implementation reasons, that this is a better way to handle it.
It’s still less surprising for them to be accessed in the same order,
though, so it’s nice that it’s implemented this way in the absence of a
compelling reason to do otherwise.

This perhaps should be mentioned somewhere, but maybe people
take it for granted.

I suspect people don’t even get as far as taking it for granted, since
they probably tend to access one from the other in almost all cases.

Mage · August 15, 2006, 2:19pm

On 8/13/06, Chad P. [email protected] wrote:

On Mon, Aug 14, 2006 at 07:31:05AM +0900, Hal F. wrote:

They’ll certainly be the same order. It would be crazy otherwise.

Not necessarily. It could be that, when called separately,
they are accessed in a different order, and only when using one to
reference the other do their associations come to light. It’s
conceivable that a time might come when someone decides, for
implementation reasons, that this is a better way to handle it.
It’s still less surprising for them to be accessed in the same order,
though, so it’s nice that it’s implemented this way in the absence of a
compelling reason to do otherwise.

They would be the same order so long as no changes have been made to
the internal hash table. So, “yes unless you modify the hash or
you’re in threaded code which might modify the hash.”

-austin
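
The claim above can be checked directly. What follows is only a minimal sketch, not a guarantee taken from the documentation: it assumes the hash is not modified between the two calls, and the sample data is made up for illustration.

```ruby
# Hypothetical sample data, for illustration only.
hash = { "name" => "Mage", "lang" => "Ruby", "year" => 2006 }

keys   = hash.keys
values = hash.values

# Pair the two arrays position by position and compare the result
# with the key/value pairs the hash itself yields.
same_order = keys.zip(values) == hash.to_a

# Whatever the iteration order is, each key must line up with its value.
aligned = keys.each_with_index.all? { |k, i| hash[k] == values[i] }

puts "same order: #{same_order}, aligned: #{aligned}"
```

A check like this only shows what one interpreter does with one hash; it says nothing about a hash that another thread modifies between the two calls.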

Mage · August 15, 2006, 2:19pm

Erik V. wrote:

If you really want to be sure that they’re in the same order:

keys, values = hsh.to_a.transpose

Thank you, I like to be 100% sure, so it’s comfortable.

      Mage

Mage · August 15, 2006, 2:19pm

Chad P. wrote:

I suspect people don’t even get as far as taking it for granted, since
they probably tend to access one from the other in almost all cases.

In the second week of my Ruby “experience” I found myself writing a
method (for a very basic db layer) where hash.keys and hash.values could
come into play.
So I think it’s a natural need. After a short Google session I found
others dealing with this; most of them assumed that the two have the same
order.

Based on the answers in this thread I believe they are; however, until it
becomes documented I will use hash.to_a.transpose in production
environments.

      Mage

Mage · August 15, 2006, 2:19pm

On 14.08.2006 17:06, Mage wrote:

Based on the answers in this thread I believe they are; however, until it
becomes documented I will use hash.to_a.transpose in production
environments.

Here’s another solution - maybe it’s even more efficient as it doesn’t
need the transposing:

hash={1=>2,3=>4}
=> {1=>2, 3=>4}

keys,vals = hash.inject([[],[]]) {|(ks,vs),(k,v)| [ks << k, vs << v]}
=> [[1, 3], [2, 4]]

keys
=> [1, 3]

vals
=> [2, 4]

… of course using #inject.

Kind regards

robert

Mage · August 15, 2006, 2:19pm

If you really want to be sure that they’re in the same order:

keys, values = hsh.to_a.transpose

gegroet,
Erik V. - http://www.erikveen.dds.nl/
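
One caveat with this idiom, shown here as a small sketch: for an empty hash, to_a.transpose returns an empty array, so both variables end up nil. The helper name split_pairs below is hypothetical and not part of Ruby's Hash API.

```ruby
# Split a hash into parallel key and value arrays via to_a.transpose.
# split_pairs is a hypothetical helper name used for illustration.
def split_pairs(hsh)
  keys, values = hsh.to_a.transpose
  # {}.to_a.transpose is [], which would leave both variables nil,
  # so fall back to empty arrays in that case.
  [keys || [], values || []]
end

keys, values = split_pairs({ 1 => 2, 3 => 4 })
p keys
p values

# The empty hash yields two empty arrays instead of two nils.
p split_pairs({})
```

Whether the nil fallback is wanted depends on the caller; code that iterates over the results usually prefers empty arrays.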

Mage · August 15, 2006, 2:19pm

Robert K. wrote:

hash={1=>2,3=>4}
=> {1=>2, 3=>4}

keys,vals = hash.inject([[],[]]) {|(ks,vs),(k,v)| [ks << k, vs << v]}
=> [[1, 3], [2, 4]]

keys
=> [1, 3]

vals
=> [2, 4]

Thank you, maybe I will benchmark this at home against the transpose
solution.

      Mage

Mage · August 15, 2006, 2:20pm

On Tue, Aug 15, 2006 at 12:06:51AM +0900, Mage wrote:

Chad P. wrote:

I suspect people don’t even get as far as taking it for granted, since
they probably tend to access one from the other in almost all cases.

In the second week of my Ruby “experience” I found myself writing a
method (for a very basic db layer) where hash.keys and hash.values could
come into play.
So I think it’s a natural need. After a short Google session I found
others dealing with this; most of them assumed that the two have the same order.

I stand corrected.

Based on the answers in this thread I believe they are; however, until it
becomes documented I will use hash.to_a.transpose in production environments.

That’s probably an excellent policy.

Mage · August 15, 2006, 2:20pm

Erik V. wrote:

Here’s another solution - maybe it’s even more efficient as
it doesn’t need the transposing:

Well, a quick benchmark… No, it isn’t more efficient. Neither
time-wise, nor memory-wise.

Your test doesn’t prove higher memory usage for the inject version. The
problem with the transpose approach is that you need an additional copy
of the complete hash in memory, which is not needed for inject. However,
the inject version is likely to cause more GC because of all the two-element
arrays it creates. You could retest with this:

keys, vals = [], []
hash.each {|k,v| keys << k; vals << v}

This should be the most efficient approach, both memory- and
performance-wise.

Kind regards

robert

Mage · August 15, 2006, 2:20pm

Here’s another solution - maybe it’s even more efficient as
it doesn’t need the transposing:

Well, a quick benchmark… No, it isn’t more efficient. Neither
time-wise, nor memory-wise.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

$ cat test.rb
require "ev/ruby"

GC.disable

hash = {}

1000.times do |n|
  hash[n] = n*n
end

case ARGV.shift
when "transpose"
  bm do
    1000.times do
      keys, values = hash.to_a.transpose
    end
  end
when "inject"
  bm do
    1000.times do
      keys, values = hash.inject([[],[]]){|(ks,vs),(k,v)| [ks << k, vs << v]}
    end
  end
else
  raise "uh?"
end

puts meminfo

$ ruby test.rb transpose
VmSize:    58284 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
1.020000  1.153397      1   1.020000  "test.rb:13"

$ ruby test.rb inject
VmSize:   146676 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
2.700000  2.826171      1   2.700000  "test.rb:19"

Mage · August 15, 2006, 2:20pm

performance wise.

Hihi… ;]

Remember, both to_a and transpose are implemented in C.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

PS: The circumstances are a bit different, compared to my
previous post. Different environment, different numbers.

$ cat test.rb
require "ev/ruby"

GC.disable

hash = {}

1000.times do |n|
  hash[n] = n*n
end

case ARGV.shift
when "traditional"
  bm do
    1000.times do
      keys = hash.keys
      values = hash.values
    end
  end
when "transpose"
  bm do
    1000.times do
      keys, values = hash.to_a.transpose
    end
  end
when "inject"
  bm do
    1000.times do
      keys, values = hash.inject([[],[]]){|(ks,vs),(k,v)| [ks << k, vs << v]}
    end
  end
when "inject2"
  bm do
    1000.times do
      keys, vals = [], []
      hash.each {|k,v| keys << k; vals << v}
    end
  end
else
  raise "uh?"
end

puts meminfo

$ ruby test.rb traditional
VmSize:    14296 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
0.300000  0.310413      1   0.300000  "test.rb:13"

$ ruby test.rb transpose
VmSize:    58280 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
0.760000  0.861536      1   0.760000  "test.rb:20"

$ ruby test.rb inject
VmSize:   146676 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
3.190000  3.331201      1   3.190000  "test.rb:26"

$ ruby test.rb inject2
VmSize:    55244 kB
     CPU   ELAPSED  COUNT  CPU/COUNT  LABEL
1.690000  1.819222      1   1.690000  "test.rb:32"

Mage · August 15, 2006, 2:20pm

On 14.08.2006 21:01, Erik V. wrote:

performance wise.

Hihi… ;]

Remember, both to_a and transpose are implemented in C.

Yes, right. Still, some remarks:

- By disabling GC you see the total memory used, but this does
  not give a realistic result because it omits GC time. It also makes a
  difference whether you only allocate small chunks and release them again
  or whether you need a copy of the whole data structure in memory, so
  memory-wise the “each” approach is the most efficient.
- “traditional” is not semantically identical to the other approaches
  because it’s not guaranteed that both are in the same order.

Some more figures:

08:58:59 [Temp]: ./test2.rb
Rehearsal -----------------------------------------------
traditional   1.375000   0.000000   1.375000 (  1.375000)
transpose    11.203000   0.000000  11.203000 ( 11.203000)
inject       45.188000   0.000000  45.188000 ( 45.187000)
each         20.140000   0.000000  20.140000 ( 20.157000)
------------------------------------- total: 77.906000sec

                  user     system      total        real
traditional   1.391000   0.000000   1.391000 (  1.390000)
transpose    11.281000   0.000000  11.281000 ( 11.282000)
inject       45.219000   0.000000  45.219000 ( 45.234000)
each         19.813000   0.000000  19.813000 ( 19.812000)
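
The script behind these figures (test2.rb) is not shown in the thread. The output format matches that of Benchmark.bmbm from Ruby's standard library, so a comparable script might look like the sketch below. The hash size, the iteration count, and the APPROACHES table are assumptions chosen for illustration, and the timings it prints will differ from the ones above.

```ruby
require "benchmark"

# Assumed sizes; the thread does not state the ones used for test2.rb.
SIZE       = 1_000
ITERATIONS = 200

hash = {}
SIZE.times { |n| hash[n] = n * n }

# The four approaches discussed in the thread; each returns [keys, values].
APPROACHES = {
  "traditional" => lambda { |h| [h.keys, h.values] },
  "transpose"   => lambda { |h| h.to_a.transpose },
  "inject"      => lambda { |h|
    h.inject([[], []]) { |(ks, vs), (k, v)| [ks << k, vs << v] }
  },
  "each"        => lambda { |h|
    ks, vs = [], []
    h.each { |k, v| ks << k; vs << v }
    [ks, vs]
  }
}

# bmbm runs a rehearsal pass first and then the measured pass.
Benchmark.bmbm do |bm|
  APPROACHES.each do |label, split|
    bm.report(label) { ITERATIONS.times { split.call(hash) } }
  end
end
```

Unlike the earlier test.rb runs, this sketch leaves the garbage collector enabled, which is closer to the point made in the remarks above about GC time.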

Cheers

robert

Mage · August 15, 2006, 10:45pm

Well, I like Ruby’s community.

Thank you all for your time.

   Mage