Forum: Ruby Hash keys don't work as expected

DK (Guest)
on 2007-03-02 19:55
(Received via mailing list)
Hello.  I am new to Ruby.  I am more familiar with Python, whose hash
keys are immutable, so I was surprised that Ruby's were not.  When I
try to retrieve a value from a hash after manipulating the associated
key, it returns nil, even though it admits that the key in the hash
and the key I'm passing into the hash's [] method are the same:

irb(main):001:0> a=[1,2,3]
=> [1, 2, 3]
irb(main):002:0> b={a=>'test'}
=> {[1, 2, 3]=>"test"}
irb(main):003:0> b[a]
=> "test"
irb(main):004:0> a[0]=7
=> 7
irb(main):005:0> b[a]
=> nil
irb(main):006:0> b.keys[0] == a
=> true
irb(main):007:0> b
=> {[7, 2, 3]=>"test"}

# This is doubly weird:

irb(main):008:0> b[b.keys[0]]
=> nil

I'm sure someone has run into this before, but I was unable to find
anything on it.  I am using ruby 1.8.5.

Thanks,

DK
Tim H. (Guest)
on 2007-03-02 19:58
DK wrote:
> Hello.  I am new to Ruby.  I am more familiar with Python, whose hash
> keys are immutable, so I was surprised that Ruby's were not.  When I
> try to retrieve a value from a hash after manipulating the associated
> key, it returns nil, even though it admits that the key in the hash
> and the key I'm passing into the hash's [] method are the same:
>
> irb(main):001:0> a=[1,2,3]
> => [1, 2, 3]
> irb(main):002:0> b={a=>'test'}
> => {[1, 2, 3]=>"test"}
> irb(main):003:0> b[a]
> => "test"
> irb(main):004:0> a[0]=7
> => 7
> irb(main):005:0> b[a]
> => nil
> irb(main):006:0> b.keys[0] == a
> => true
> irb(main):007:0> b
> => {[7, 2, 3]=>"test"}
>
> # This is doubly weird:
>
> irb(main):008:0> b[b.keys[0]]
> => nil
>
> I'm sure someone has run into this before, but I was unable to find
> anything on it.  I am using ruby 1.8.5.
>
> Thanks,
>
> DK

ri Hash#rehash
------------------------------------------------------------ Hash#rehash
     hsh.rehash -> hsh
------------------------------------------------------------------------
     Rebuilds the hash based on the current hash values for each key. If
     values of key objects have changed since they were inserted, this
     method will reindex _hsh_. If +Hash#rehash+ is called while an
     iterator is traversing the hash, an +IndexError+ will be raised in
     the iterator.

        a = [ "a", "b" ]
        c = [ "c", "d" ]
        h = { a => 100, c => 300 }
        h[a]       #=> 100
        a[0] = "z"
        h[a]       #=> nil
        h.rehash   #=> {["z", "b"]=>100, ["c", "d"]=>300}
        h[a]       #=> 100
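
Applied to the session from the original post, this might look like the following minimal sketch. The variable names follow DK's irb transcript; the three result variables are only there for illustration.

```ruby
a = [1, 2, 3]
b = { a => "test" }

a[0] = 7      # mutate the key in place; b still files the entry under the old hash value
b.rehash      # re-index every entry under its key's current hash value

after_rehash = b[a]          # "test": the mutated key is found again
by_value     = b[[7, 2, 3]]  # "test": any array with equal contents works too
old_value    = b[[1, 2, 3]]  # nil: nothing is stored under the old contents
```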
DK (Guest)
on 2007-03-02 20:11
(Received via mailing list)
Ah, thank you.  Why is this necessary?  Isn't the lookup just a matter
of matching object_id's?
Tim H. (Guest)
on 2007-03-02 20:28
DK wrote:
> Ah, thank you.  Why is this necessary?  Isn't the lookup just a matter
> of matching object_id's?

Remember that hash values are distributed into buckets based on their
key. Change the key, change the bucket.
Robert K. (Guest)
on 2007-03-02 20:56
(Received via mailing list)
On 02.03.2007 19:28, Tim H. wrote:
> DK wrote:
>> Ah, thank you.  Why is this necessary?  Isn't the lookup just a matter
>> of matching object_id's?
>
> Remember that hash values are distributed into buckets based on their
> key. Change the key, change the bucket.

More precisely, if the hash of the key changes, the bucket changes.  And
since Array calculates its hash based on its content, changing the
content means changing the hash.  And, to the OP, lookups in Hashes are
not done via #object_id but via #hash (followed by #eql?):
http://en.wikipedia.org/wiki/Hashtable
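
A small sketch of the difference between identity and hash value, using only core Array methods. The object stays the same across the mutation, while the value that Hash uses to pick a bucket follows the contents. The variable names are made up for the example.

```ruby
a = [1, 2, 3]
id_before   = a.object_id
hash_before = a.hash

a[0] = 7

same_object  = (a.object_id == id_before)  # true: still the very same Array
hash_changed = (a.hash != hash_before)     # true unless the two contents happen to collide

twin      = [7, 2, 3]                      # a different object with equal contents
same_hash = (twin.hash == a.hash)          # true: Array#hash depends only on the contents
same_key  = twin.eql?(a)                   # true: the comparison Hash makes once #hash matches
```

So a Hash that was indexed while the array was [1, 2, 3] is searched in the wrong place once the array has become [7, 2, 3], even though the stored key and the lookup key are one and the same object.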

Kind regards

  robert
Robert D. (Guest)
on 2007-03-02 21:10
(Received via mailing list)
On 3/2/07, Robert K. <removed_email_address@domain.invalid> wrote:
> > key. Change the key, change the bucket.
>
> More precisely if the hash of the key changes the bucket changes.  And
> since Array calculates its hash based on the content changing the
> content means changing the key.  And, to the OP, lookups in Hashes are
> not done via #object_id but via #hash:
> http://en.wikipedia.org/wiki/Hashtable
All of which is necessary in order to have mutable keys.  I guess it
is very difficult to think about this in the beginning, when you are
not yet used to it.

BTW, I try to avoid mutating hash keys; I know my limits ;)

Cheers
Robert
DK (Guest)
on 2007-03-02 22:15
(Received via mailing list)
> > More precisely if the hash of the key changes the bucket changes.  And
> > since Array calculates its hash based on the content changing the
> > content means changing the key.  And, to the OP, lookups in Hashes are
> > not done via #object_id but via #hash:
> >http://en.wikipedia.org/wiki/Hashtable
>
> which all is necessary in order to have mutable keys, I guess that it
> is very difficult to think about that in the beginning when not yet
> used to it.

I understand now.  I can see how looking up keys by their hash rather
than their object_id is more appropriate.

> BTW. I try to avoid to mutate hash keys, I know my limits ;)

Agreed.  I was just curious about how things worked underneath.

Thanks for all the responses.

-DK
Rick D. (Guest)
on 2007-03-03 00:38
(Received via mailing list)
On 3/2/07, Robert D. <removed_email_address@domain.invalid> wrote:
> On 3/2/07, Robert K. <removed_email_address@domain.invalid> wrote:
> > More precisely if the hash of the key changes the bucket changes.  And
> > since Array calculates its hash based on the content changing the
> > content means changing the key.  And, to the OP, lookups in Hashes are
> > not done via #object_id but via #hash:
> > http://en.wikipedia.org/wiki/Hashtable
> which all is necessary in order to have mutable keys, I guess that it
> is very difficult to think about that in the beginning when not yet
> used to it.

Actually it doesn't spring directly from supporting mutable keys, but
from wanting to have the keys match based on equality rather than
identity.

If Hash used identity instead of hash then

a = "key".freeze  # make the key immutable

h = {a => 1}

h["key"] ==> nil

even though nothing got mutated.
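
For comparison, later Ruby versions (1.9 and up, so not the 1.8.5 used in this thread) let you opt in to identity-based lookup with Hash#compare_by_identity. A hedged sketch of both behaviours side by side; the variable names are invented for the example:

```ruby
a     = "key".freeze
other = String.new("key")   # equal contents, but a distinct object

by_equality = { a => 1 }    # the default: keys match via #hash and #eql?

by_identity = {}.compare_by_identity   # keys match only if they are the same object
by_identity[a] = 1

equal_hit     = by_equality[other]  # 1: equal contents are enough
identity_miss = by_identity[other]  # nil: same contents, different object
identity_hit  = by_identity[a]      # 1: the very same object
```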

> BTW. I try to avoid to mutate hash keys, I know my limits ;)

Yes, if you really want to be safe you can always clone and freeze keys
when inserting into a hash.  But that's probably overkill.
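
A minimal sketch of that idea. The helper name store_with_frozen_key is hypothetical, not anything built into Ruby, and dup only makes a shallow copy, so nested mutable elements of a key could still change.

```ruby
# Hypothetical helper: store a frozen shallow copy of the key, so later
# changes to the caller's object cannot disturb the hash.
def store_with_frozen_key(hash, key, value)
  hash[key.dup.freeze] = value
end

h = {}
a = [1, 2, 3]
store_with_frozen_key(h, a, "test")

a[0] = 7                         # only the caller's array changes

original_hit = h[[1, 2, 3]]      # "test": the stored key is untouched
mutated_miss = h[a]              # nil: a is now [7, 2, 3]
key_frozen   = h.keys[0].frozen? # true
```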
--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
Robert D. (Guest)
on 2007-03-03 01:13
(Received via mailing list)
On 3/2/07, Rick DeNatale <removed_email_address@domain.invalid> wrote:
I missed that one too :)
Good observation. How is this done in Python, then?
Robert K. (Guest)
on 2007-03-03 11:11
(Received via mailing list)
On 02.03.2007 23:37, Rick DeNatale wrote:

Btw, there is an optimization going on under the hood: unfrozen Strings
are duped on insertion:

irb(main):006:0> k="xx"
=> "xx"
irb(main):007:0> h={k=>1}
=> {"xx"=>1}
irb(main):008:0> k.object_id
=> 1881810
irb(main):009:0> h.keys[0].object_id
=> 1881830

irb(main):010:0> k.freeze
=> "xx"
irb(main):011:0> h={k=>1}
=> {"xx"=>1}
irb(main):012:0> k.object_id
=> 1881810
irb(main):013:0> h.keys[0].object_id
=> 1881810

>> BTW. I try to avoid to mutate hash keys, I know my limits ;)
>
> Yes if you really want to be safe you can always clone and freeze keys
> when inserting into a hash.  But that's probably overkill.

Cloning is not necessary for Strings (see above).
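
The same check can be written without comparing object ids by eye. This is a sketch with made-up variable names; String.new is used only to be sure the strings start out unfrozen.

```ruby
k = String.new("xx")           # an unfrozen String
h = {}
h[k] = 1

stored        = h.keys[0]
copied        = !stored.equal?(k)  # true: the hash holds its own copy
stored_frozen = stored.frozen?     # true: and that copy is frozen

k << "yy"                          # mutating the original afterwards...
still_found = h["xx"]              # 1: ...does not disturb the stored key
new_miss    = h[k]                 # nil: "xxyy" was never inserted

f = String.new("xx").freeze        # an already frozen String
h2 = {}
h2[f] = 1
kept_as_is = h2.keys[0].equal?(f)  # true: frozen Strings are stored without copying
```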

Kind regards

  robert
DK (Guest)
on 2007-03-05 23:51
(Received via mailing list)
> Good observation. How is this done in Python, then?

I don't know what goes on under the hood.  Because the keys are
immutable, I don't imagine things like freezing and rehashing are
necessary, but I really don't know how the keys are looked up.

-DK