How to order a hash based on its keys?

luislavena · June 21, 2011, 4:35pm

Hi, I want to order a hash using itds keys:

irb> a = { 2=>“2”, 3=>“3”, 1=>“1” }
irb> a.sort
[[1, “1”], [2, “2”], [3, “3”]]

But I want a resulting hash with keys ordered:

{ 1=>“1”, 2=>“2”, 3=>“3”}

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.

SSISSaki_Baz_C.SS_S · June 21, 2011, 4:46pm

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected]
wrote:

Hi, I want to order a hash using itds keys:

When you hear “order” and “hash” in the same sentence, you should
think “hashes aren’t ordered”.

1.9 keeps hashes initially in insertion order, but if you don’t create
your hash with the keys in the order you want, since Hash isn’t
ordered, I don’t think there’s a way to get what you want without
creating an Array and then a second Hash, with the keys in the order
you want.

SSISSaki_Baz_C.SS_S · June 21, 2011, 4:51pm

On Tue, Jun 21, 2011 at 8:04 PM, Iñaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.

–
Iñaki Baz C.
[email protected]

Atleast according to this StackOverflow question:

Hash[h.sort] seems to be the best way.

Vikhyat K.

SSISSaki_Baz_C.SS_S · June 21, 2011, 5:12pm

(Resending this b/c it looks like it got blocked or the server crashed
the
first time. Hopefully it’s not a duplicate)

On Tue, Jun 21, 2011 at 9:34 AM, Iaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.

–
Iaki Baz C.
[email protected]

You could create your own data structure which keeps the elements sorted
and
supports the hash interface, ie sorted_hash.rb · GitHub

I don’t know if the overhead of keeping them sorted (in this case, in a
BST)
is lower than the overhead of the Array. But if you are frequently
iterating
over it, I would expect it to pay off (ie you would have to create a new
array each time you iterate over a Hash, but you will not have to create
over a SortedHash. So probably depends on the use case.

SSISSaki_Baz_C.SS_S · June 21, 2011, 5:24pm

On Tue, Jun 21, 2011 at 4:34 PM, Iaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended). If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

Kind regards

robert

SSISSaki_Baz_C.SS_S · June 21, 2011, 5:22pm

You could create your own data structure which keeps the elements sorted and
supports the hash interface, ie sorted_hash.rb · GitHub

Or maybe a rb tree (red-black tree) from the rbtree gem. Haven’t used
it myself though.

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:08pm

though true a hash doesn’t maintain order, you can sort the keys and
use the result to do something… (though you couldn’t return an
ordered hash)

hashes_arent_ordered = { 2=>“2”, 3=>“3”, 1=>“1” }
hashes_arent_ordered.keys.sort.each {|key| puts “key #{key}, value
#{hashes_arent_ordered[key]}”}

-Benjamin, Chicago

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:13pm

2011/6/21 Vikhyat K. [email protected]:

Hash[h.sort] seems to be the best way.

It adds a bit overhead (of course):

irb> h = {“a”=>1, “c”=>3, “b”=>2, “d”=>4}

irb> Benchmark.realtime { 100000.times { h.sort } }
0.787590503692627

irb> Benchmark.realtime { 100000.times { Hash[h.sort] } }
1.1099345684051514

But seems good enough

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:15pm

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

I guess you didn’t mind doing then?

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:27pm

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:17pm

2011/6/21 Adam P. [email protected]:

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected] wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

I guess you didn’t mind doing then?

Well, I wanted to avoid doing more stuff in Ruby, as using
Array#map/inject to generate a new Hash and so. But maybe using Hash[]
method directly is faster.

SSISSaki_Baz_C.SS_S · June 21, 2011, 6:34pm

2011/6/21 Iñaki Baz C. [email protected]:

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

Sorry, my dog pressed “sent mail”.

I meant that I don’t understand the point of your comments. In Ruby
1.9 Hashes are ordered. I cannot change the order after creating them,
but if you inspect elements of a Hash you get them in the order in
which they were inserted. This is valid and useful for me (in my
case).

If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

You can check it:

require “benchmark”

@hash = {}
@rbtree = RBTree.new

TIMES = ARGV[0] ? ARGV[0].to_i : 10000
WORD = “z9hG4bK”.freeze

def gen_word(word, n)
case word
when :fixed_begin
WORD + n.to_s
when :dynamic_begin
n.to_s + WORD
end
end

def test(word)
puts
case word
when :fixed_begin
puts “Using word: "z9hG4bK" + n.to_s”
when :dynamic_begin
puts “Using word: n.to_s + "z9hG4bK"”
end

puts
printf "- Hash insertion: "
puts time_hash_insertion = Benchmark.realtime {
TIMES.times do |n|
@hash[gen_word(word, n)] = 12345
end
}

printf "- RBTree insertion: "
puts time_rbtree_insertion = Benchmark.realtime {
TIMES.times do |n|
@rbtree[gen_word(word, n)] = 12345
end
}

puts
printf "- Hash deletion: "
puts time_hash_deletion = Benchmark.realtime {
TIMES.times do |n|
@hash.delete(gen_word(word, n))
end
}

printf "- RBTree deletion: "
puts time_rbtree_deletion = Benchmark.realtime {
TIMES.times do |n|
@rbtree.delete(gen_word(word, n))
end
}

puts
puts “TOTAL:”
puts “- Hash insertion + deletion: #{time_hash_insertion +
time_hash_deletion}”
puts “- RBTree insertion + deletion: #{time_rbtree_insertion +
time_rbtree_deletion}”
end

puts “Entries: #{TIMES}”
puts
test(:fixed_begin)
puts
test(:dynamic_begin)

Results:

Entries: 10000

Using word: “z9hG4bK” + n.to_s

Hash insertion (seconds): 0.020041227340698242
RBTree insertion (seconds): 0.041683197021484375
Hash deletion (seconds): 0.035521745681762695
RBTree deletion (seconds): 0.023290157318115234

TOTAL:

Hash insertion + deletion (seconds): 0.05556297302246094
RBTree insertion + deletion (seconds): 0.06497335433959961

Using word: n.to_s + “z9hG4bK”

Hash insertion (seconds): 0.019947528839111328
RBTree insertion (seconds): 0.0295107364654541
Hash deletion (seconds): 0.029720067977905273
RBTree deletion (seconds): 0.01738262176513672

TOTAL:

Hash insertion + deletion (seconds): 0.0496675968170166
RBTree insertion + deletion (seconds): 0.04689335823059082

SSISSaki_Baz_C.SS_S · June 22, 2011, 8:48am

On Tue, Jun 21, 2011 at 6:34 PM, Iaki Baz C. [email protected]
wrote:

2011/6/21 Iaki Baz C. [email protected]:

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

Sorry, my dog pressed “sent mail”.

LOL That’s what our cat didn’t manage yet - although he is trying hard
by walking over the keyboard if I am not watchful enough.

I meant that I don’t understand the point of your comments. In Ruby
1.9 Hashes are ordered. I cannot change the order after creating them,
but if you inspect elements of a Hash you get them in the order in
which they were inserted. This is valid and useful for me (in my
case).

I thought you want a different ordering (based on key).

If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

benchmark

Now I am confused. What do you need? A fast data structure? An
ordered data structure? What is more important? You will need to do
a trade off here. What are your requirements anyway?

Kind regards

robert

SSISSaki_Baz_C.SS_S · June 22, 2011, 11:46am

2011/6/22 Anurag P. [email protected]:

the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

Since we are talking about performance benchmarks too, Ruby 1.9’s
ordered hashes also exhibit faster reads than writes.

My benchmark was tested using Ruby 1.9

SSISSaki_Baz_C.SS_S · June 22, 2011, 9:27am

the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

Since we are talking about performance benchmarks too, Ruby 1.9’s
ordered hashes also exhibit faster reads than writes.

SSISSaki_Baz_C.SS_S · June 22, 2011, 11:54am

2011/6/22 Robert K. [email protected]:

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

benchmark

Now I am confused. What do you need? A fast data structure? An
ordered data structure? What is more important? You will need to do
a trade off here. What are your requirements anyway?

Ok. I get some data and convert it into a Hash whose key is a priority
value (integer, 0 is the best priority). The value of each entry is of
course the data value associated with such priority. But such hash has
not been ordered, this is, probably entry 1 has priority 2 while entre
2 has priority 1 (or whatever).

Then I just want to get a new hash in which entries are ordered by
priority. I also want to avoid too much Ruby stuff using “hast →
array → Array#map(…) → new has”. In this thread it has been
suggested using:

ordered_hash = Hash[original_hash.sort]

and indeed it works. Yes, it is doing hash->array->hash, but being at
C level I expect it’s fast enough.

Regards.

SSISSaki_Baz_C.SS_S · June 22, 2011, 12:56pm

2011/6/22 Phillip G. [email protected]:

I suspect you want to output this hash ordered by priority, correct?

Not exactly. Well, not at all in fact

Being more explicit, I use em-udns library ang perform a DNS SRV query
for a domain, and get an array of one1 or more
EventMachine::Udns::RR_SRV instances. Each object has a priority value
(and also weight, but I don’t care it now):

GitHub - ibc/em-udns: An async DNS resolver for EventMachine based on the udns C library

Note that priorities don’t need to be correlated, this is, there could
be 4 entries with these priorities:

1 => A
10 => B
1 => C
3 => D

I just want to get an ordered hash in which:

The key is the priority (in the given example there would be 3
entries: 1, 3 and 10).
The value is an array with all the valus with such priority.

So, what I want to get is:

{
1 => [A, C],
3 => [D],
10 => [B]
}

What I do know is:

Get the DNS result array:

srvs = [ object1, object2, object3, object4 ]
Order the array based on Object#priority.

srvs.sort_by{|e| e.priority}
Construct the hash:

ordered_hash = {}
srvs.sort_by{|e| e.priority}.each do |srv|
ordered_hash[srv.priority| ||= []
ordered_hash[srv.priority| << srv.value
end

It just works. Initially I tryed a different approach so my original
question was a bit different.

SSISSaki_Baz_C.SS_S · June 22, 2011, 1:05pm

On Wed, Jun 22, 2011 at 12:54 PM, Iaki Baz C. [email protected]
wrote:

2011/6/22 Phillip G. [email protected]:

I suspect you want to output this hash ordered by priority, correct?

Not exactly. Well, not at all in fact
[snip explanation & design constraints]

Ah, now I see. And while I still don’t think that an ordered hash is
strictly necessary, I see why you’d want one.

–
Phillip G.

A method of solution is perfect if we can forsee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

SSISSaki_Baz_C.SS_S · June 22, 2011, 12:35pm

On Wed, Jun 22, 2011 at 11:50 AM, Iaki Baz C. [email protected]
wrote:

Ok. I get some data and convert it into a Hash whose key is a priority
value (integer, 0 is the best priority). The value of each entry is of
course the data value associated with such priority. But such hash has
not been ordered, this is, probably entry 1 has priority 2 while entre
2 has priority 1 (or whatever).

Then I just want to get a new hash in which entries are ordered by
priority.

I suspect you want to output this hash ordered by priority, correct?

Then why not leave the sorting and such to the emitter of the output?

Like so:

PRIORITIES = [0,1,2,3,4] # 5 priorities are enough for everyone

items_to_do = {0 => ‘Do it right now!’, 4 => ‘Sometime this decade’, 1
=> ‘This
can wait.’}

PRIORITIES.each do |prio|
puts “#{prio}: #{items_to_do[prio]}” unless items_to_do[prio].nil? #
Suppress output of “empty” priorities
end

It’s ordered output, and you don’t have to sort the hash.

Mind, I’m curious about the why, and not saying you are wrong, or that
my way is better.

–
Phillip G.

A method of solution is perfect if we can forsee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

SSISSaki_Baz_C.SS_S · June 22, 2011, 3:48pm

2011/6/22 Robert K. [email protected]:

prio_data[1] = “foo”
prio_data[4] = “bar”

prio_data.each_pair {|prio,v| printf “key=%p val=%p\n”, prio, v}

Interesting approach

Then I just want to get a new hash in which entries are ordered by
priority.

Why do you need a second container object? Is this just for sorting
purposes or do you actually need to maintain the state from a given
point in time?

The second container (an ordered one) will be passed (later) to a
method which randomizes each DNS result based on its SRV weight. So
first I need a contained in which DNS records are ordered by priority,
and later a randomize method (already done) which shorts the array
entries of each priority value based on entries weight (randomly of
course, but taking into account the weight of each SRV record).

I’ve it already working, but I will take a more deepth look to your
above code to integrate it.

Thanks a lot.

How to order a hash based on its keys?

puts “Entries: #{TIMES}” puts test(:fixed_begin) puts test(:dynamic_begin)

puts “Entries: #{TIMES}”
puts
test(:fixed_begin)
puts
test(:dynamic_begin)