How to order a hash based on its keys?

Hi, I want to order a hash using itds keys:

irb> a = { 2=>“2”, 3=>“3”, 1=>“1” }
irb> a.sort
[[1, “1”], [2, “2”], [3, “3”]]

But I want a resulting hash with keys ordered:

{ 1=>“1”, 2=>“2”, 3=>“3”}

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected]
wrote:

Hi, I want to order a hash using itds keys:

When you hear “order” and “hash” in the same sentence, you should
think “hashes aren’t ordered”.

1.9 keeps hashes initially in insertion order, but if you don’t create
your hash with the keys in the order you want, since Hash isn’t
ordered, I don’t think there’s a way to get what you want without
creating an Array and then a second Hash, with the keys in the order
you want.

On Tue, Jun 21, 2011 at 8:04 PM, Iñaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.


Iñaki Baz C.
[email protected]

Atleast according to this StackOverflow question:

Hash[h.sort] seems to be the best way.

Vikhyat K.

(Resending this b/c it looks like it got blocked or the server crashed
the
first time. Hopefully it’s not a duplicate)

On Tue, Jun 21, 2011 at 9:34 AM, Iaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

Thanks.


Iaki Baz C.
[email protected]

You could create your own data structure which keeps the elements sorted
and
supports the hash interface, ie sorted_hash.rb · GitHub

I don’t know if the overhead of keeping them sorted (in this case, in a
BST)
is lower than the overhead of the Array. But if you are frequently
iterating
over it, I would expect it to pay off (ie you would have to create a new
array each time you iterate over a Hash, but you will not have to create
over a SortedHash. So probably depends on the use case.

On Tue, Jun 21, 2011 at 4:34 PM, Iaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended). If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

Kind regards

robert

You could create your own data structure which keeps the elements sorted and
supports the hash interface, ie sorted_hash.rb · GitHub

Or maybe a rb tree (red-black tree) from the rbtree gem. Haven’t used
it myself though.

though true a hash doesn’t maintain order, you can sort the keys and
use the result to do something… (though you couldn’t return an
ordered hash)

hashes_arent_ordered = { 2=>“2”, 3=>“3”, 1=>“1” }
hashes_arent_ordered.keys.sort.each {|key| puts “key #{key}, value
#{hashes_arent_ordered[key]}”}

-Benjamin, Chicago

2011/6/21 Vikhyat K. [email protected]:

Hash[h.sort] seems to be the best way.

It adds a bit overhead (of course):

irb> h = {“a”=>1, “c”=>3, “b”=>2, “d”=>4}

irb> Benchmark.realtime { 100000.times { h.sort } }
0.787590503692627

irb> Benchmark.realtime { 100000.times { Hash[h.sort] } }
1.1099345684051514

But seems good enough :slight_smile:

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected]
wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

I guess you didn’t mind doing then? :slight_smile:

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

2011/6/21 Adam P. [email protected]:

On Tue, Jun 21, 2011 at 3:34 PM, Iñaki Baz C. [email protected] wrote:

Is there any efficiente way? (I don’t want the hast to be converted to
an array and the to a hash again).

I guess you didn’t mind doing then? :slight_smile:

Well, I wanted to avoid doing more stuff in Ruby, as using
Array#map/inject to generate a new Hash and so. But maybe using Hash[]
method directly is faster.

2011/6/21 Iñaki Baz C. [email protected]:

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

Sorry, my dog pressed “sent mail”.

I meant that I don’t understand the point of your comments. In Ruby
1.9 Hashes are ordered. I cannot change the order after creating them,
but if you inspect elements of a Hash you get them in the order in
which they were inserted. This is valid and useful for me (in my
case).

If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

You can check it:


require “benchmark”

@hash = {}
@rbtree = RBTree.new

TIMES = ARGV[0] ? ARGV[0].to_i : 10000
WORD = “z9hG4bK”.freeze

def gen_word(word, n)
case word
when :fixed_begin
WORD + n.to_s
when :dynamic_begin
n.to_s + WORD
end
end

def test(word)
puts
case word
when :fixed_begin
puts “Using word: "z9hG4bK" + n.to_s”
when :dynamic_begin
puts “Using word: n.to_s + "z9hG4bK"”
end

puts
printf "- Hash insertion: "
puts time_hash_insertion = Benchmark.realtime {
TIMES.times do |n|
@hash[gen_word(word, n)] = 12345
end
}

printf "- RBTree insertion: "
puts time_rbtree_insertion = Benchmark.realtime {
TIMES.times do |n|
@rbtree[gen_word(word, n)] = 12345
end
}

puts
printf "- Hash deletion: "
puts time_hash_deletion = Benchmark.realtime {
TIMES.times do |n|
@hash.delete(gen_word(word, n))
end
}

printf "- RBTree deletion: "
puts time_rbtree_deletion = Benchmark.realtime {
TIMES.times do |n|
@rbtree.delete(gen_word(word, n))
end
}

puts
puts “TOTAL:”
puts “- Hash insertion + deletion: #{time_hash_insertion +
time_hash_deletion}”
puts “- RBTree insertion + deletion: #{time_rbtree_insertion +
time_rbtree_deletion}”
end

puts “Entries: #{TIMES}”
puts
test(:fixed_begin)
puts
test(:dynamic_begin)

Results:


Entries: 10000

Using word: “z9hG4bK” + n.to_s

  • Hash insertion (seconds): 0.020041227340698242

  • RBTree insertion (seconds): 0.041683197021484375

  • Hash deletion (seconds): 0.035521745681762695

  • RBTree deletion (seconds): 0.023290157318115234

TOTAL:

  • Hash insertion + deletion (seconds): 0.05556297302246094
  • RBTree insertion + deletion (seconds): 0.06497335433959961

Using word: n.to_s + “z9hG4bK”

  • Hash insertion (seconds): 0.019947528839111328

  • RBTree insertion (seconds): 0.0295107364654541

  • Hash deletion (seconds): 0.029720067977905273

  • RBTree deletion (seconds): 0.01738262176513672

TOTAL:

  • Hash insertion + deletion (seconds): 0.0496675968170166
  • RBTree insertion + deletion (seconds): 0.04689335823059082

On Tue, Jun 21, 2011 at 6:34 PM, Iaki Baz C. [email protected]
wrote:

2011/6/21 Iaki Baz C. [email protected]:

2011/6/21 Robert K. [email protected]:

See Adam’s and Josh’s replies: a Hash is generally unordered.
However, what do you need this for? If it is for debugging purposes
then you might as well override #inspect on a per instance basis or
change it in Hash (not recommended).

Hi, I don’t understand your comment.

Sorry, my dog pressed “sent mail”.

LOL That’s what our cat didn’t manage yet - although he is trying hard
by walking over the keyboard if I am not watchful enough.

I meant that I don’t understand the point of your comments. In Ruby
1.9 Hashes are ordered. I cannot change the order after creating them,
but if you inspect elements of a Hash you get them in the order in
which they were inserted. This is valid and useful for me (in my
case).

I thought you want a different ordering (based on key).

If you need that for other
reasons then maybe a tree might be a better choice. There is for
example
http://raa.ruby-lang.org/project/ruby-rbtree/

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

benchmark

Now I am confused. What do you need? A fast data structure? An
ordered data structure? What is more important? You will need to do
a trade off here. What are your requirements anyway?

Kind regards

robert

2011/6/22 Anurag P. [email protected]:

the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

Since we are talking about performance benchmarks too, Ruby 1.9’s
ordered hashes also exhibit faster reads than writes.

My benchmark was tested using Ruby 1.9 :wink:

the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

Since we are talking about performance benchmarks too, Ruby 1.9’s
ordered hashes also exhibit faster reads than writes.

2011/6/22 Robert K. [email protected]:

I already worker with rbtree. The only difference (or one of them) is
the way in which keys are inserted. rbtree requires all the keys being
the same class. In my tests rbtree is better for deleting elements but
worse than a Hash for inserting them.

benchmark

Now I am confused. What do you need? A fast data structure? An
ordered data structure? What is more important? You will need to do
a trade off here. What are your requirements anyway?

Ok. I get some data and convert it into a Hash whose key is a priority
value (integer, 0 is the best priority). The value of each entry is of
course the data value associated with such priority. But such hash has
not been ordered, this is, probably entry 1 has priority 2 while entre
2 has priority 1 (or whatever).

Then I just want to get a new hash in which entries are ordered by
priority. I also want to avoid too much Ruby stuff using “hast →
array → Array#map(…) → new has”. In this thread it has been
suggested using:

ordered_hash = Hash[original_hash.sort]

and indeed it works. Yes, it is doing hash->array->hash, but being at
C level I expect it’s fast enough.

Regards.

2011/6/22 Phillip G. [email protected]:

I suspect you want to output this hash ordered by priority, correct?

Not exactly. Well, not at all in fact :slight_smile:

Being more explicit, I use em-udns library ang perform a DNS SRV query
for a domain, and get an array of one1 or more
EventMachine::Udns::RR_SRV instances. Each object has a priority value
(and also weight, but I don’t care it now):

GitHub - ibc/em-udns: An async DNS resolver for EventMachine based on the udns C library

Note that priorities don’t need to be correlated, this is, there could
be 4 entries with these priorities:

  • 1 => A
  • 10 => B
  • 1 => C
  • 3 => D

I just want to get an ordered hash in which:

  • The key is the priority (in the given example there would be 3
    entries: 1, 3 and 10).
  • The value is an array with all the valus with such priority.

So, what I want to get is:

{
1 => [A, C],
3 => [D],
10 => [B]
}

What I do know is:

  • Get the DNS result array:

    srvs = [ object1, object2, object3, object4 ]

  • Order the array based on Object#priority.

    srvs.sort_by{|e| e.priority}

  • Construct the hash:

    ordered_hash = {}
    srvs.sort_by{|e| e.priority}.each do |srv|
    ordered_hash[srv.priority| ||= []
    ordered_hash[srv.priority| << srv.value
    end

It just works. Initially I tryed a different approach so my original
question was a bit different.

On Wed, Jun 22, 2011 at 12:54 PM, Iaki Baz C. [email protected]
wrote:

2011/6/22 Phillip G. [email protected]:

I suspect you want to output this hash ordered by priority, correct?

Not exactly. Well, not at all in fact :slight_smile:
[snip explanation & design constraints]

Ah, now I see. And while I still don’t think that an ordered hash is
strictly necessary, I see why you’d want one. :slight_smile:


Phillip G.

A method of solution is perfect if we can forsee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

On Wed, Jun 22, 2011 at 11:50 AM, Iaki Baz C. [email protected]
wrote:

Ok. I get some data and convert it into a Hash whose key is a priority
value (integer, 0 is the best priority). The value of each entry is of
course the data value associated with such priority. But such hash has
not been ordered, this is, probably entry 1 has priority 2 while entre
2 has priority 1 (or whatever).

Then I just want to get a new hash in which entries are ordered by
priority.

I suspect you want to output this hash ordered by priority, correct?

Then why not leave the sorting and such to the emitter of the output?

Like so:

PRIORITIES = [0,1,2,3,4] # 5 priorities are enough for everyone :wink:

items_to_do = {0 => ‘Do it right now!’, 4 => ‘Sometime this decade’, 1
=> ‘This
can wait.’}

PRIORITIES.each do |prio|
puts “#{prio}: #{items_to_do[prio]}” unless items_to_do[prio].nil? #
Suppress output of “empty” priorities
end

It’s ordered output, and you don’t have to sort the hash.

Mind, I’m curious about the why, and not saying you are wrong, or that
my way is better. :slight_smile:


Phillip G.

A method of solution is perfect if we can forsee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz

2011/6/22 Robert K. [email protected]:

prio_data[1] = “foo”
prio_data[4] = “bar”

prio_data.each_pair {|prio,v| printf “key=%p val=%p\n”, prio, v}

Interesting approach :slight_smile:

Then I just want to get a new hash in which entries are ordered by
priority.

Why do you need a second container object? Is this just for sorting
purposes or do you actually need to maintain the state from a given
point in time?

The second container (an ordered one) will be passed (later) to a
method which randomizes each DNS result based on its SRV weight. So
first I need a contained in which DNS records are ordered by priority,
and later a randomize method (already done) which shorts the array
entries of each priority value based on entries weight (randomly of
course, but taking into account the weight of each SRV record).

I’ve it already working, but I will take a more deepth look to your
above code to integrate it.

Thanks a lot.