Forum: Ruby Hash with array as value type

Michael McGreevy (Guest)
on 2006-01-26 12:22
Hello all,

I would like to store arrays in a hash, indexed by a string key.  I
would like to have the hash create an empty array as the default value,
when it sees a new key, something like the code involving "hash1" below.
However, this code is giving some strange results -- it claims that the
hash is empty, even though there is an array stored in it, and I can
then retrieve that array.

I am new to ruby, so maybe I am just doing something stupid (... I am
not sure about that "Hash.new( [] )" for example... )

Can anyone explain these results?

Thanks.

CODE:

#!/usr/bin/ruby


hash1 = Hash.new( [] )
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
$stderr.write "hash1.size() = #{hash1.size()}\n"
$stderr.write "hash1.empty() = #{hash1.empty?()}\n"
$stderr.write "hash1[\"hello\"].size() = #{hash1["hello"].size()}\n"
$stderr.write "hash1[\"hello\"] = #{hash1["hello"]}\n"

hash2 = {"hello" => [1.0,2.0]}
$stderr.write "\nhash2.size() = #{hash2.size()}\n"
$stderr.write "hash2.empty() = #{hash2.empty?()}\n"
$stderr.write "hash2[\"hello\"].size() = #{hash2["hello"].size()}\n"
$stderr.write "hash2[\"hello\"] = #{hash2["hello"]}\n"


OUTPUT:

hash1.size() = 0
hash1.empty() = true
hash1["hello"].size() = 2
hash1["hello"] = 1.02.0

hash2.size() = 1
hash2.empty() = false
hash2["hello"].size() = 2
hash2["hello"] = 1.02.0




VERSION:
ruby 1.8.3 (2005-09-21) [i686-linux]
Mike Fletcher (fletch)
on 2006-01-26 14:19
Michael McGreevy wrote:
> Hello all,
>
> I would like to store arrays in a hash, indexed by a string key.  I
> would like to have the hash create an empty array as the default value,
> when it sees a new key, something like the code involving "hash1" below.
> However, this code is giving some strange results -- it claims that the
> hash is empty, even though there is an array stored in it, and I can
> then retrieve that array.
>
> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
>
> Can anyone explain these results?
>
> Thanks.
>
> CODE:
>
> #!/usr/bin/ruby
>
>
> hash1 = Hash.new( [] )

This makes a new Hash with the default value being the Array reference
you've passed it.  This is only executed *once*, so each new key/value
pair points to the *same* Array instance.  You want

hash1 = Hash.new { |h,k| Array.new }

That'll make a new Array instance each time it needs a new default value
for a new key.
Robert Klemme (Guest)
on 2006-01-26 14:19
(Received via mailing list)
Michael McGreevy wrote:
> not sure about that "Hash.new( [] )" for example... )
>
> Can anyone explain these results?

Yes.  You ran into the typical Hash pitfall: the default value is the one
returned if something is not found for the given key.  But it never
changes the hash and there is just this single instance.  Consider this:

>> h=Hash.new([])
=> {}
>> h[0]<<1
=> [1]
>> h[0]
=> [1]
>> h["foo"]
=> [1]
>> h["bar"] << 2
=> [1, 2]
>> h[:x]
=> [1, 2]
>> h.default
=> [1, 2]

It works for numeric values because then one usually assigns:

>> h=Hash.new(0)
=> {}
>> h[:foo] += 1
=> 1
>> h[:bar] += 10
=> 10
>> h
=> {:bar=>10, :foo=>1}

Note the "+=" contains an assignment and it's equivalent to

h[:foo] = h[:foo] + 1
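
The same reasoning applies to arrays. As a minimal sketch (the keys :a and :b are just illustrative), using "+=" with a shared default array does store entries, because Array#+ builds a new array and the assignment puts it into the hash. The shared default itself is never mutated here:

```ruby
# One shared default array, as in the original Hash.new([]) example.
h = Hash.new([])

# "+=" expands to h[:a] = h[:a] + [1]. Array#+ returns a new array,
# and the assignment stores that new array under the key.
h[:a] += [1]
h[:b] += [2]

p h          # both keys are now real entries in the hash
p h.default  # the shared default array was never mutated
```

This allocates a fresh array on every "+=", so it is more of an illustration of the assignment rule than a recommended idiom.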

You on the other hand want the block form because that can do arbitrary
things when a key is not found:

>> h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}
=> {}
>> h[:foo] << "foo"
missing foo
=> ["foo"]
>> h[:bar] << "foo"
missing bar
=> ["foo"]
>> h[:foo] << "foo end"
=> ["foo", "foo end"]
>> h[:foo] << "foo more"
=> ["foo", "foo end", "foo more"]
>> h
=> {:bar=>["foo"], :foo=>["foo", "foo end", "foo more"]}

Kind regards

    robert
Michael McGreevy (Guest)
on 2006-01-26 14:33
Robert Klemme wrote:

> You on the other hand want the block form because that can do arbitrary
> things when a key is not found:
>
>>> h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}

Thanks a lot :-).  I had actually already tried the block form, but in a
way similar to what Mike Fletcher posted.  I now see that simply
returning something from the block does not modify the hash.. you have
to explicitly modify the hash within the block.
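
A minimal sketch of that difference, assuming nothing beyond the two block forms quoted in this thread (the variable names returns_only and stores are made up for illustration):

```ruby
# The block only returns a new array; nothing is stored in the hash,
# so every miss hands back a fresh throwaway array.
returns_only = Hash.new { |hash, key| Array.new }
returns_only["hello"].push(1.0)

# The block assigns the new array to the key before returning it,
# so the next lookup finds the stored array and skips the block.
stores = Hash.new { |hash, key| hash[key] = Array.new }
stores["hello"].push(1.0)

p returns_only.size  # the push above was lost
p stores.size        # the key was created by the block
p stores["hello"]
```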


Thanks to both of you for responding.

Michael.
Mike Fletcher (fletch)
on 2006-01-26 15:00
Mike Fletcher wrote:
[...]
> hash1 = Hash.new { |h,k| Array.new }
>
> That'll make a new Array instance each time it needs a new default value
> for a new key.

GAH, that should be

hash1 = Hash.new { | h, k | h[k] = Array.new }

MEMO TO SELF: Don't post in the mornings before caffeine has a chance to
take effect.
unknown (Guest)
on 2006-01-26 15:05
(Received via mailing list)
On Jan 26, 2006, at 6:22 AM, Michael McGreevy wrote:
> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
> Can anyone explain these results?

It is a bit confusing.

Hash.new([]) tucks away the newly created array as the default value:

d = []
puts d.object_id
hash1 = Hash.new(d)
puts hash1.default.object_id

You can see from this that Hash has saved a reference to the default
value.  So when you call:

   hash1["hello"].push(1.0)

on the empty hash, a reference to the default value is returned because
the key is not found.  The float, 1.0, is pushed onto that default
value.  You still haven't entered anything into the Hash itself but you
have pushed a value into the default array:

puts hash1.default

This same default object will be returned each time a lookup fails in
the Hash.

The solution is to use the block form of the Hash constructor.

   hash1 = Hash.new { |h,k| h[k] = [] }

In this form, every lookup miss causes the block to be called with the
hash and the key as the two arguments.  The block allocates a new array
and then stores it back into the hash using the key so that the next
lookup finds the newly allocated array and doesn't call the block.

Here is your example rewritten with that and adjusted to follow the
usual Ruby coding style:

hash1 = Hash.new { |h,k| h[k] = [] }
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
warn("hash1.size = #{hash1.size}")
warn("hash1.empty? = #{hash1.empty?}")
warn("hash1[\"hello\"].size = #{hash1["hello"].size}")
warn("hash1[\"hello\"] = #{hash1["hello"]}")
warn("")

hash2 = {"hello" => [1.0,2.0]}
warn("hash2.size = #{hash2.size}")
warn("hash2.empty? = #{hash2.empty?}")
warn("hash2[\"hello\"].size = #{hash2["hello"].size}")
warn("hash2[\"hello\"] = #{hash2["hello"]}")


Gary Wright
Martin DeMello (Guest)
on 2006-01-26 15:50
(Received via mailing list)
Robert Klemme <bob.news@gmx.net> wrote:
> > I am new to ruby, so maybe I am just doing something stupid (... I am
> > not sure about that "Hash.new( [] )" for example... )
> >
> > Can anyone explain these results?
>
> Yes.  You ran into the typical Hash pitfall: the default value is the one
> returned if something is not found for the given key.  But it never
> changes the hash and there is just this single instance.  Consider this:

I find it instructive to think of the default as a #key_missing method,
analogous to #method_missing.
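
One way to picture that analogy is the rough sketch below. KeyMissingHash and key_missing are hypothetical names invented for this illustration; Ruby's Hash has no such hook, so the sketch simply routes the default block to an ordinary method:

```ruby
# Hypothetical illustration only: Hash has no built-in key_missing hook.
class KeyMissingHash < Hash
  def initialize
    # Send every lookup miss to key_missing, much as an unknown
    # method call is sent to method_missing.
    super { |hash, key| hash.key_missing(key) }
  end

  # Decide what a missing key means. Here: store and return a new array.
  def key_missing(key)
    self[key] = []
  end
end

h = KeyMissingHash.new
h["hello"] << 1.0
h["hello"] << 2.0
p h.size
p h["hello"]
```

Seen this way, Hash.new([]) behaves like a key_missing that always returns the same object and never stores anything.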

martin