Hash with array as value type


#1

Hello all,

I would like to store arrays in a hash, indexed by a string key. I
would like to have the hash create an empty array as the default value,
when it sees a new key, something like the code involving “hash1” below.
However, this code is giving some strange results – it claims that the
hash is empty, even though there is an array stored in it, and I can
then retrieve that array.

I am new to ruby, so maybe I am just doing something stupid (… I am
not sure about that “Hash.new( [] )” for example… )

Can anyone explain these results?

Thanks.

CODE:

#!/usr/bin/ruby

hash1 = Hash.new( [] )
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
$stderr.write "hash1.size() = #{hash1.size()}\n"
$stderr.write "hash1.empty() = #{hash1.empty?()}\n"
$stderr.write "hash1[\"hello\"].size() = #{hash1["hello"].size()}\n"
$stderr.write "hash1[\"hello\"] = #{hash1["hello"]}\n"

hash2 = {"hello" => [1.0,2.0]}
$stderr.write "\nhash2.size() = #{hash2.size()}\n"
$stderr.write "hash2.empty() = #{hash2.empty?()}\n"
$stderr.write "hash2[\"hello\"].size() = #{hash2["hello"].size()}\n"
$stderr.write "hash2[\"hello\"] = #{hash2["hello"]}\n"

OUTPUT:

hash1.size() = 0
hash1.empty() = true
hash1[“hello”].size() = 2
hash1[“hello”] = 1.02.0

hash2.size() = 1
hash2.empty() = false
hash2[“hello”].size() = 2
hash2[“hello”] = 1.02.0

VERSION:
ruby 1.8.3 (2005-09-21) [i686-linux]


#2

Michael McGreevy wrote:

[…]

hash1 = Hash.new( [] )

This makes a new Hash with the default value being the Array reference
you’ve passed it. This is only executed once, so each new key/value
pair points to the same Array instance. You want

hash1 = Hash.new { |h,k| Array.new }

That’ll make a new Array instance each time it needs a new default value
for a new key.


#3

Michael McGreevy wrote:

not sure about that “Hash.new( [] )” for example… )

Can anyone explain these results?

Yes. You ran into the typical Hash pitfall: the default value is the one
returned if something is not found for the given key. But it never
changes the hash and there is just this single instance. Consider this:

h=Hash.new([])
=> {}

h[0]<<1
=> [1]

h[0]
=> [1]

h["foo"]
=> [1]

h["bar"] << 2
=> [1, 2]

h[:x]
=> [1, 2]

h.default
=> [1, 2]

It works for numeric values because then one usually assigns:

h=Hash.new(0)
=> {}

h[:foo] += 1
=> 1

h[:bar] += 10
=> 10

h
=> {:bar=>10, :foo=>1}

Note the “+=” contains an assignment and it’s equivalent to

h[:foo] = h[:foo] + 1
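
The same distinction can be seen with arrays. The following is only a sketch, and the variable names `shared` and `assigned` are made up for illustration: `<<` mutates the single default object and never stores anything, whereas `+=` assigns a new array under the key, because `Array#+` returns a new array instead of modifying its receiver.

```ruby
# Hash.new([]) combined with << mutates the shared default and stores nothing.
shared = Hash.new([])
shared[:a] << 1

# Using += instead performs an assignment: Array#+ builds a new array,
# and that new array is stored under the key.
assigned = Hash.new([])
assigned[:a] += [1]
assigned[:b] += [2]

p shared.size        # 0, nothing was ever stored
p shared.default     # [1], the default object itself was modified
p assigned.size      # 2, one entry per key
p assigned.default   # [], the default was never touched
```

This works, but it allocates a new array on every append, so it is shown here only to illustrate the role of the assignment.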

You on the other hand want the block form because that can do arbitrary
things when a key is not found:

h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}
=> {}

h[:foo] << "foo"
missing foo
=> ["foo"]

h[:bar] << "foo"
missing bar
=> ["foo"]

h[:foo] << "foo end"
=> ["foo", "foo end"]

h[:foo] << "foo more"
=> ["foo", "foo end", "foo more"]

h
=> {:bar=>["foo"], :foo=>["foo", "foo end", "foo more"]}

Kind regards

robert

#4

Robert K. wrote:

You on the other hand want the block form because that can do arbitrary
things when a key is not found:

h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}

Thanks a lot :-). I had actually already tried the block form, but in a
way similar to what Mike F. posted. I now see that simply
returning something from the block does not modify the hash… you have
to explicitly modify the hash within the block.
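
A small side-by-side sketch of that difference (the names `returned_only` and `stored` are hypothetical, chosen only for this illustration):

```ruby
# The block returns a new array but never stores it in the hash.
returned_only = Hash.new { |hash, key| Array.new }
returned_only["hello"].push(1.0)   # pushed onto a throwaway array

# The block stores the new array under the key before returning it.
stored = Hash.new { |hash, key| hash[key] = Array.new }
stored["hello"].push(1.0)
stored["hello"].push(2.0)

p returned_only.size      # 0
p returned_only["hello"]  # [], a fresh empty array on every miss
p stored.size             # 1
p stored["hello"]         # [1.0, 2.0]
```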

Thanks to both of you for responding.

Michael.


#5

Mike F. wrote:
[…]

hash1 = Hash.new { |h,k| Array.new }

That’ll make a new Array instance each time it needs a new default value
for a new key.

GAH, that should be

hash1 = Hash.new { | h, k | h[k] = Array.new }

MEMO TO SELF: Don’t post in the mornings before the caffeine has had a
chance to take effect.


#6

On Jan 26, 2006, at 6:22 AM, Michael McGreevy wrote:

I am new to ruby, so maybe I am just doing something stupid (… I am
not sure about that “Hash.new( [] )” for example… )
Can anyone explain these results?

It is a bit confusing.

Hash.new([]) tucks away the newly created array as the default value:

d = []
puts d.object_id
hash1 = Hash.new(d)
puts hash1.default.object_id

You can see from this that Hash has saved a reference to the default
value. So when you call:

hash1["hello"].push(1.0)

on the empty hash, a reference to the default value is returned because
the key is not found. The float, 1.0, is pushed onto that default
value. You still haven’t entered anything into the Hash itself, but you
have pushed a value into the default array:

puts hash1.default

This same default object will be returned each time a lookup fails in
the Hash.

The solution is to use the block form of the Hash constructor.

hash1 = Hash.new { |h,k| h[k] = [] }

In this form, every lookup miss causes the block to be called with the
hash and the key as the two arguments. The block allocates a new array
and then stores it back into the hash using the key, so that the next
lookup finds the newly allocated array and doesn’t call the block.
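
To make the "called only on a miss" behaviour visible, here is a rough sketch that counts block invocations while grouping words by their first letter. The counter `calls`, the hash `groups`, and the word list are all invented for this example.

```ruby
calls = 0

# The block runs only when a key is missing; it stores a new array
# under that key, so later lookups for the same key skip the block.
groups = Hash.new do |hash, key|
  calls += 1
  hash[key] = []
end

%w[apple avocado banana blueberry cherry].each do |word|
  groups[word[0, 1]] << word   # word[0, 1] is the first character as a String
end

p calls            # 3, one call per distinct first letter
p groups["a"]      # ["apple", "avocado"]
p groups.size      # 3
```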

Here is your example rewritten with that and adjusted to follow the
usual ruby coding styles:

hash1 = Hash.new { |h,k| h[k] = [] }
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
warn("hash1.size = #{hash1.size}")
warn("hash1.empty? = #{hash1.empty?}")
warn("hash1[\"hello\"].size = #{hash1["hello"].size}")
warn("hash1[\"hello\"] = #{hash1["hello"]}")
warn("")

hash2 = {"hello" => [1.0,2.0]}
warn("hash2.size = #{hash2.size}")
warn("hash2.empty? = #{hash2.empty?}")
warn("hash2[\"hello\"].size = #{hash2["hello"].size}")
warn("hash2[\"hello\"] = #{hash2["hello"]}")

Gary W.


#7

Robert K. wrote:

I am new to ruby, so maybe I am just doing something stupid (… I am
not sure about that “Hash.new( [] )” for example… )

Can anyone explain these results?

Yes. You ran into the typical Hash pitfall: the default value is the one
returned if something is not found for the given key. But it never
changes the hash and there is just this single instance. Consider this:

I find it instructive to think of the default as a #key_missing method,
analogous to #method_missing.
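
As a loose illustration of that analogy (note that #key_missing is not a real method; Ruby's actual mechanism is the default block), the sketch below records every miss in a hypothetical `log` array. It also shows that `key?` and `fetch` do not go through the default block at all.

```ruby
log = []

# The default block acts as the "missing key" hook: it sees the hash
# and the key, and may store a value for that key.
h = Hash.new do |hash, key|
  log << key
  hash[key] = []
end

h[:foo] << 1          # miss: the block runs and stores a new array
h[:foo] << 2          # hit: the block is skipped
h.key?(:bar)          # membership test, the block is not consulted
h.fetch(:bar, nil)    # fetch uses its own fallback, not the default block

p log                 # [:foo]
p h.keys              # [:foo]
```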

martin