Why Does Hash Apparently Reorder Its Internal Representation

I was surprised to find the following…

irb(main):001:0> {:a => ‘a’, :b => ‘b’}
=> {:a=>“a”, :b=>“b”}
irb(main):002:0> {:a => ‘a’, :b => ‘b’, :c => ‘c’}
=> {:a=>“a”, :b=>“b”, :c=>“c”}
irb(main):003:0> {:a => ‘a’, :b => ‘b’, :c => ‘c’, :d => ‘d’}
=> {:a=>“a”, :d=>“d”, :b=>“b”, :c=>“c”}
irb(main):004:0> {:a => ‘a’, :b => ‘b’, :c => ‘c’, :d => ‘d’, :e =>
‘e’}
=> {:a=>“a”, :d=>“d”, :b=>“b”, :e=>“e”, :c=>“c”}
irb(main):005:0> {:a => ‘a’, :b => ‘b’, :c => ‘c’, :d => ‘d’, :e =>
‘e’, :f => ‘f’}
=> {:a=>“a”, :d=>“d”, :b=>“b”, :e=>“e”, :c=>“c”, :f=>“f”}

I expected that the sequence of hashes returned would be identical in
order to the presented sequence. Or, that there would be some
discernable sequence such as if a stack, if not as a queue.

I realise that it is a hash and that access is expected to be by key,
but sometimes retaining the presented order may be desired. And
anyway, why reorder it at all?

Furthermore, the returned order is different from a previous sequence
I did! So, it appears that this is somewhat random, however unlikely
that it is.

I then assumed that there might be some kind of optimisation going on,
but how much optimisation of {:a => ‘a’, :b => ‘b’, … } can there
be?

It doesn’t seem very PoLS to have it reordered, although perhaps one
shouldn’t be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: “Make no assumptions about the order of hashes!”

And would eval %{ instance_of_hash[instance_of_fixnum] } really be so
evil? Perhaps that was a little obscure… What’s wrong with ordered
access, using a numeric element reference as with Array, to Hash?
Excepting that the order can’t be relied upon, but assume that it
could. Even simpler might have been to give an example:

h = {:a => ‘a’, :b => ‘b’}
h[0]
=> ‘a’

Similarly one might be able to treat an Array like a Hash as in eval
%{ instance_of_array[‘instance_of_string_representation_of_a_fixnum’]
}, such as with:

a = [ 1, 2 ]
a[‘0’]
=> 1

There’s no great call for the immediately above I would think, but if
I implemented one then I would implement the other also, simply for
the purposes of symmetry. I’m not even sure there is any need for
either, such that I may be trying to achieve the unnecessary?..

Even so, would some unwritten law be being broken if I did this stuff?

On 8/21/06, thoran @ thoran. com [email protected] wrote:

I was surprised to find the following…

Really? Take a comsci course at uni

I realise that it is a hash and that access is expected to be by key,
but sometimes retaining the presented order may be desired. And
anyway, why reorder it at all?

It’s not ‘reordered’, it’s stored in a hash table. Think of how MD5
represents a unique key for almost every possible value, but the hash
has no real connection to the original value. A hash table is stored
using similar keys, though you are allowed duplicate keys for a hash
table, and if you go in to the mathematical theory of it there is a
lot of speed gained this way especially if you’re using large keys,
such as in i18n applications where each string is stored in a hash
table. That’s why it’s called a hash and not an array.

[email protected] wrote:

I was surprised to find the following…

[snip]

This has been discussed many times over the last
six years. Do a search of the archives.

It doesn’t seem very PoLS to have it reordered, although perhaps one
shouldn’t be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: “Make no assumptions about the order of hashes!”

Don’t invoke POLS. It’s Matz’s surprise that matters, not yours or
mine. And his voice is anything but loud.

Even so, would some unwritten law be being broken if I did this stuff?

Personally I favor introducing an ordered hash of some form into Ruby.
Other people don’t. Many want the original Hash kept as it is for speed
(though I am still unconvinced that merely keeping a sequence number
along with each key would impact speed dramatically).

What might be best is a class that inherits from Hash and preserves
order. We’d need a literal notation, however.

Hal

Hi,

In message “Re: Why Does Hash Apparently Reorder Its Internal
Representation And Other Associated Ponderings”
on Mon, 21 Aug 2006 16:37:00 +0900, Hal F.
[email protected] writes:

|Personally I favor introducing an ordered hash of some form into Ruby.
|Other people don’t. Many want the original Hash kept as it is for speed
|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven’t yet read the “A use case
for an ordered hash” thread yet. I am facing a huge mountain of mails
after the vacation.

						matz.

On 8/21/06, thoran @ thoran. com [email protected] wrote:

I was surprised to find the following…

irb(main):001:0> {:a => ‘a’, :b => ‘b’}
=> {:a=>“a”, :b=>“b”}
[…]
I then assumed that there might be some kind of optimisation going on,
but how much optimisation of {:a => ‘a’, :b => ‘b’, … } can there
be?

Hashes are unordered so that Ruby can use the hash function of the
keys. Read up on hashes at wikipedia:

There was a thread about making an “ordered hash” in the last few
weeks. Worth searching for.

;Daniel

Hal F. wrote:

My use case (I started that thread) may not be compelling. Other people,
including Bil K., have said it would be useful to them also.

So far, I’ve managed to duplicate the functionality of two
Java codes having 834 and 1084 lines of codes with 24 and 55
lines of Ruby, respectively. The second one is so long[1]
due to the lack of ordered Hashes (and my lack of Ruby mastery).

FWIW, the Ruby versions are more robust and general than the
Java versions due to the mini DSL I wrote for expressing
tolerances in a natural way.

Thanks again for Ruby!

Regards,

Bil

[1] If you can call 55 LOC long!

Yukihiro M. wrote:

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance.

Ruby is already slow enough. Those who are griping should be
using association lists.

Yukihiro M. wrote:

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven’t yet read the “A use case
for an ordered hash” thread yet. I am facing a huge mountain of mails
after the vacation.

My use case (I started that thread) may not be compelling. Other people,
including Bil K., have said it would be useful to them also.

Bil’s example was related to NASA’s CEV. So an ordered hash would help
put people back on the moon. :wink:

Hal

William J. wrote:

Ruby is already slow enough. Those who are griping should be
using association lists.

What’s an association list?

Hal

Bil K. wrote:

Hal F. wrote:

My use case (I started that thread) may not be compelling. Other people,
including Bil K., have said it would be useful to them also.

So far, I’ve managed to duplicate the functionality of two
Java codes having 834 and 1084 lines of codes with 24 and 55
lines of Ruby, respectively. The second one is so long[1]
due to the lack of ordered Hashes

Did you try association lists?

class Array
def to_assoc
f=nil ; partition{f=!f}.transpose
end
def set k, v
(pair = assoc(k)) ? self[ index(pair) ] = [k,v] : push( [k,v] )
end
def rm k
(pair = assoc( k )) && slice!( index( pair ) )
end
end

a = [:foo,22, :bar,33, :baz,44].to_assoc
a.set( :bar, 99 )
a.set :yes, -1
a.rm :baz
p a

Hi,

In message “Re: Why Does Hash Apparently Reorder Its Internal
Representation And Other Associated Ponderings”
on Mon, 21 Aug 2006 19:26:10 +0900, “Robert D.”
[email protected] writes:

|I am just dreaming:
|
|ohhhhhh = OrderedHash.new { |k1,k2| tell_my_friend_what_order_is(k1, k2) }
|
|class MyOH < OrderedHash
| def key_order(k1, k2) ### Here we could define Mixins for common key
|orders
|
|And would asking for having two like the following be too much?

I think it’s not just “ordered” but “sorted”.

						matz.

On 8/21/06, Yukihiro M. [email protected] wrote:

|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven’t yet read the “A use case
for an ordered hash” thread yet. I am facing a huge mountain of mails
after the vacation.

                                                    matz.

Honestly I would prefer a subclass and Mixins allowing to define the
order
of retrieval.
As a matter of fact that is the only “order definition” that makes sense
to
me

Now as it comes to implementation of course there must be a price to
pay,
either on ordered access or on storage. So I agree with the others who
pointed out they are not willing to pay that price.
Hash is for fast lookup after all.
But having a subclass or a mixin for ordered hash access in the ruby
core
would be great!!!

I am just dreaming:

ohhhhhh = OrderedHash.new { |k1,k2| tell_my_friend_what_order_is(k1, k2)
}

class MyOH < OrderedHash
def key_order(k1, k2) ### Here we could define Mixins for common
key
orders

And would asking for having two like the following be too much?

class FastOrderdHash (key order is stored in an AST e.g.)
class SlimOrderedHash (keys are sorted on ordered access only)

just my micromoney worth

Cheers
Robert


Deux choses sont infinies : l’univers et la bêtise humaine ; en ce qui
concerne l’univers, je n’en ai pas acquis la certitude absolue.

  • Albert Einstein

On 8/21/06, Bil K. [email protected] wrote:

Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.

You can do this automatically (if you aren’t already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

m.

William J. wrote:

Did you try association lists?

Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.

Speed is not an issue (for me). The 5,000 simulations
I am running take days to run. Even if the Ruby I am
use to set them up takes 5 minutes instead of 5 seconds,
I’ll take the beauty of an ordered Hash over association
lists any day.

Regards,

Martin DeMello wrote:

You can do this automatically (if you aren’t already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

There are any number of ways to do this sort of thing.
But they all suffer from not having a convenient notation
for literals.

Hal

Martin DeMello wrote:

On 8/21/06, Bil K. [email protected] wrote:

You can do this automatically (if you aren’t already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

Hmmm… good idea. I’ve largely missed out on the
whole method_missing idiom. Sounds like a good use.

I’ll try to look into it after I return from JPL next
week. However, if you’d like to throw down an example,
I might be able to work it in now…

Thanks,

On 8/21/06, Yukihiro M. [email protected] wrote:

|ohhhhhh = OrderedHash.new { |k1,k2| tell_my_friend_what_order_is(k1, k2)
}
|
|class MyOH < OrderedHash
| def key_order(k1, k2) ### Here we could define Mixins for common
key
|orders
|
|And would asking for having two like the following be too much?

I think it’s not just “ordered” but “sorted”.

Sorry if I was unclear - just dreaming U know.
My naming is bad, usually, often, always?

So I was just wondering if an OrderedHash would just define the order
and
sorting would be done when necessary.
A SortedHash would keep track of the keys in a sorted way and sorted
access
would be fast.
To express better what I had in mind was (updated with your naming),
expressed in a language I talk better than English, Ruby (1)

s = SortedHash …
o = OrderdHash …

s[:fourty_two] = 1764 # internal key order will be updated
o[:fourty_two] = 1764 # nothing to be done

s.each # no presorting needed
o.each or o.each_sorted or o.each_ordered # presorting needed

Robert

(1) That does not mean I cannot talk nonsense in Ruby though :wink:

                                                    matz.


Deux choses sont infinies : l’univers et la bêtise humaine ; en ce qui
concerne l’univers, je n’en ai pas acquis la certitude absolue.

  • Albert Einstein

Martin DeMello wrote:

Quick proof of concept:

Thanks!

Later,

[email protected] wrote in message
news:2d4684b07b9f965e5078fdb46754fac3@shiny…

I was surprised to find the following…

I realise that it is a hash and that access is expected to be by key, but
sometimes retaining the presented order may be desired. And anyway, why
reorder it at all?

I am always surprised whenever this question comes up.  Whenever it

does, it’s obvious that the poster does not know what a hash is.

I think what you are thinking of is a red black tree (or just a 

binary
tree, in general) and not a hash…

It doesn’t seem very PoLS to have it reordered, although perhaps one
shouldn’t be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: “Make no assumptions about the order of hashes!”

People invoke the PoLS way too often considering it's a claim that 

Matz
has never made.
Personally, the Principle of Least Surprise is more about
consistency
than it is about how often you’re literally surprised. Often, you’re
surprised because of your own ignorance (as in this instance), so you
can
hardly blame Ruby for that! Ruby is very consistent and, in my opinion,
follows the PoLS. Whenever it doesn’t, it ususally does so for a good
(pragmatic) reason…

Honestly, complaining that hashes aren't ordered is like complaining

that rand() doesn’t return the same number every time. Pray tell, what
made you think it should be ordered? That was an honest question, by
the
way! You know enough about programming to come here and decree that
this
is very weird yet you didn’t already know what a hash is or how one
works.
Very strange…

On 8/21/06, Bil K. [email protected] wrote:

I’ll try to look into it after I return from JPL next
week. However, if you’d like to throw down an example,
I might be able to work it in now…

Quick proof of concept:

require ‘enumerator’

class OHash
include Enumerable

def initialize
@a = []
@h = {}
end

def []=(k,v)
@a.delete(k) if @h[k]
@h[k] = v
@a << k
end

def delete(k, &blk)
@a.delete(k)
@h.delete(k)
end

def each
p @a, @h
each_key {|k| yield [k, @h[k]]}
end

def each_key
@a.each {|k| yield k}
end

def method_missing(*args)
@h.send(*args)
end
end

def o(*ary)
r = OHash.new
ary.each_slice(2) {|k,v| r[k] = v }
r
end

testing

a = o(“hello”, “world”, :foo, “bar”, 5, 6)
a.each {|k,v| p [k,v]}
puts a[“hello”]
a[“hello”] = 5
a.each {|k,v| p [k,v]}

martin