The reason behind introducing insertion order in hashes in 1.9.0


I’m doing a research for a blogpost. I want to find out what was the
reason behind adding insertion order to hashes in Ruby 1.9.0.

My findings led me to a thread on ruby-list. [1] It’s in Japanese and
unfortunately, I can’t read it. Translating Matz’ messages using Google
Translate makes me believe it contains some useful information.

Ilya G. in his “Ruby 1.9 Internals: Ordered Hash” article [2]
mentions that “Ruby mailing list archive is full of heated discussions
on whether a Ruby hash should be an ordered hash”. I think it might be
one of the threads he had in mind.

I’d be very thankful if somebody could tell me if it’s even the thread
I’m looking for and extract key points from it.


The reason is simplicity (IMHO):

A hash (associative map) has the promise that the lookup of an
individual key costs around O(1) time. To achieve this, there must be an
indexing ‘layer’ that is coupled to the data layer in the map (same way
as in a database, -> data + index = fast lookup).
When looking up an element, the engine searches the index, then returns
the referenced element.
On the other hand none of the hash implementations (in any language)
guarantees the order of elements (because a Hash is really a Set and not
an Array, but Ruby has no Set class by default).
So, instead of creating an ‘indexing’ layer beside the data layer
(effectively doubling the memory requirement), Ruby 1.9 creates a
“linked list” from the hash entries, saving memory, but complicating
adding and deleting of elements. And the result is that if you wrote (b,
a, d, c) then then order of these elements will be preserved.

Not to mention that in Ruby, any object can be a Key, honestly I don’t
even know how Ruby 1.8 ordered them at all (aside from object_ID), like
{ [3,5]=>“a”, :z=>“b”, {:c=>“b”} => “f” }