A use case for an ordered hash

Hal_F · August 15, 2006, 2:19pm

Phillip H. wrote:

Anyway: I’m happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob’s statistics.
Sometimes you just don’t want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person’s benefit would kill the speed and jack up memory usage.

Your point is well taken.

However, it remains to be seen whether it would kill the speed.
Personally I doubt it would be significant, but I’ve seen no
numbers one way or the other.

Memory I would consider a more serious issue. Every pair in the
hash would occupy an extra N bytes (addmitedly a low value of N,
like 4). In the case of 30,000 pairs, that’s an additional (say)
120K of RAM used.

As for “one person’s benefit” – it’s true that the people
wanting this functionality are in the minority. But we are
far more than one person.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

That is definitely a better description of the finctionality.

I would even say that “associative array” alone would be a sufficient
term, since “array” implies order. (But I may be wrong there.)

I’d be happy with a separate class for that functionality IF there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

Hal

Hal_F · August 15, 2006, 2:19pm

Hal F. wrote:

I would even say that “associative array” alone would be a sufficient
term, since “array” implies order. (But I may be wrong there.)

I’d be happy with a separate class for that functionality IF there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

+1 for Hal

Also, while not a full fledged class in it’s own right, you might find
Facets’ association.rb useful. Ex.

c = [ :a >> 1, :b >> 2 ]
c.each { |k,v| puts "#{k} associated with #{v} }

T.

Hal_F · August 15, 2006, 2:19pm

On Mon, Aug 14, 2006 at 06:17:34AM +0900, John C. wrote:

I’m in the habit of automagically writing…

hash.keys.sort

. . . but if you want them in a specific order (other than that
provided by .sort), you’ll have to write your own method, of course.

Hal_F · August 15, 2006, 2:19pm

I’d be happy with a separate class for that functionality IF there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

I would support this as there are uses for it.

Hal_F · August 15, 2006, 2:19pm

Trans wrote:

Also, while not a full fledged class in it’s own right, you might find
Facets’ association.rb useful. Ex.

c = [ :a >> 1, :b >> 2 ]
c.each { |k,v| puts "#{k} associated with #{v} }

I’ve seen that, and it’s kind of cool.

But you can’t use an object as a key if it already defines >> for
something else (e.g. Fixnum).

Hal

Hal_F · August 15, 2006, 2:19pm

On Aug 13, 2006, at 7:19 PM, Hal F. wrote:

something else (e.g. Fixnum).

Hal

This is probably too verbose, even for a stop gap solution but:

class AssocList
def initialize(first, second)
@array = [ [ first.keys.first, first.values.first ],
[ second.keys.first, second.values.first ] ]
end

def <(other)
@array << [ other.keys.first, other.values.first ]
self
end
end

class Hash
def <(other)
AssocList.new(self, other)
end
end

{:a=>1} < {:b=>2} < {:c=>3}
=> #<AssocList:0x328564 @array=[[:a, 1], [:b, 2], [:c, 3]]>

I was going for f(:a => 1 < :b => 2) but the operator precedence
isn’t right

Hal_F · August 15, 2006, 2:19pm

On 8/13/06, John C. [email protected] wrote:

I’m in the habit of automagically writing…

hash.keys.sort

or sometimes

hash.keys.sort_by{|k| hash[k]}.each{|k| value=hash[k]; … }

In the case that Hal is talking about, and in the case where
PDF::Writer uses an ordered hash, this would be inappropriate.
Insertion-ordered associations are generally better. I will probably
be modifying PDF::Writer a bit to make it clear that it’s using an
ordered association, not an ordered “hash”, but it would be useful to
have an ordered association. Maybe we could take advantage of a
literal for this:

ordered_assoc = ( 1 => 2, 2 => 3, ‘a’ => 5 )

Maybe. For Ruby 2.0.

-austin

Hal_F · August 15, 2006, 2:19pm

Austin Z. wrote:

Maybe. For Ruby 2.0.

Either (=>) or [=>] would be fine with me. I’d slightly prefer
the latter.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

I once suggested Map – a nice short name – but there was some
reason this wasn’t acceptable.

I’d even suggest (gasp) letting it inherit from Hash.

Hal

Hal_F · August 15, 2006, 2:19pm

On 8/14/06, Hal F. [email protected] wrote:

Either (=>) or [=>] would be fine with me. I’d slightly prefer
the latter.

Or we could use another %?{ } constructor, which are, after all, being
reserved for future expansion.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

OHash or Assoc or even Dict (though the latter at least implies an O(1)
lookup).

Also, it would probably include RCR 306: include rbtree in the stdlib
along the way

m.

Hal_F · August 15, 2006, 2:19pm

On 8/14/06, Austin Z. [email protected] wrote:

On 8/14/06, Martin DeMello [email protected] wrote:

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Wouldn’t it be the same as Hash lookup? It’s still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

True, I suppose you could do it with a doubly linked list overlaid on
a hash. Was thinking of sorted order.

martin

Hal_F · August 15, 2006, 2:19pm

On 8/14/06, Martin DeMello [email protected] wrote:

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Wouldn’t it be the same as Hash lookup? It’s still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

-austin

Hal_F · August 15, 2006, 2:19pm

On 8/14/06, Austin Z. [email protected] wrote:

On 8/14/06, Martin DeMello [email protected] wrote:

OHash or Assoc or even Dict (though the latter at least implies an O(1)
lookup).

Wouldn’t it be the same as Hash lookup? It’s still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

That too is a point that needs definition. I’d like to have two
behaviors.
I call the first one static, and the second one dynamic (and I describe
the
second one first of course).
Than I would like to be able to write, I do not need literals, but that
is
another story that shall be told another day.

dyn = D.new( :a => 42, :b => 42, :c => 42 ) #using future named params
of
course
dyn[:b] = 42
puts dyn.keys.join(“") = > "acb"
while the static bersion would of course result in => "ab*c”
Even the static version has no memory of course and
static[k]=static.delete(k) would become a very meaningful statement,
suddenly

Cheers
Robert

BTW
Do you think one(*42) can generalize Hash values to 42? I think that
would
be a nice Set(up).

R.

-austin

Austin Z. * [email protected] * http://www.halostatue.ca/
* [email protected] * You are in a maze of twisty little passages, all alike. // halo • statue
* [email protected]

–
Deux choses sont infinies : l’univers et la bÃªtise humaine ; en ce qui
concerne l’univers, je n’en ai pas acquis la certitude absolue.

Albert Einstein

Hal_F · August 15, 2006, 2:19pm

On Aug 14, 2006, at 12:15 PM, Martin DeMello wrote:

True, I suppose you could do it with a doubly linked list overlaid on
a hash. Was thinking of sorted order.

Can’t you just maintain an internal array representing the key insertion
order? Something like:

class OHash < Hash

…

def each
keyseq.each { |k| yield k, self[k] }
end
end

Where keyseq is just an array that is updated whenever keys are
added/deleted?

Gary W.

Hal_F · August 15, 2006, 2:19pm

On 8/14/06, [email protected] [email protected] wrote:

Where keyseq is just an array that is updated whenever keys are
added/deleted?

It’s a little more complex than that, but that’s pretty much what all
existing implementations do. We’re talking about wanting a way of
having an ordered associative list (that defaults to insert ordering,
but can be changed to a sorted ordering, perhaps) in the Ruby core
language with a literal constructor, because there is value for it.

-austin

Hal_F · August 15, 2006, 2:20pm

On Aug 14, 2006, at 1:01 PM, Austin Z. wrote:

We’re talking about wanting a way of
having an ordered associative list (that defaults to insert ordering,
but can be changed to a sorted ordering, perhaps) in the Ruby core
language with a literal constructor, because there is value for it.

Yep, I understand the need for the literal constructor. It is a bit
awkward to do:

OHash.new [ [1,2], [3,4], [4,5] ]

or something similar. Random ideas:

%o{ 1 => 2, 3 => 4 }
%o{ 1,2,3,4 }
{1,2,3,4}! # the ! indicates that order is important!
!{1,2,3,4}
%!{1,2,3,4}

Gary W.

Hal_F · August 15, 2006, 2:20pm

Hal F. wrote:

There have been numerous occasions when I wanted an
ordered hash, but usually I can’t remember to write
them down.

Here’s my latest one[1]:

I had a string with a bunch of tagged “tolerance fields”,
e.g., >>1+/-10% ‘tag’<< (a mini DSL?). Based on some sort
of distribution function, e.g., uniform or Gaussian,
I generated an array of samples for each tagged field.
A regular hash was a perfect fit,

samples[tag] = [ sample1, sample2, … ]

until another requirement came up: to present the tags
in order of appearance.

Later,

Bil

[1] Part of a CEV (Crew Exploration Vehicle) task

Hal_F · August 15, 2006, 2:20pm

On 8/14/06, Bil K. [email protected] wrote:

until another requirement came up: to present the tags
in order of appearance.

I’ve had to deal with this (interestingly, in something that was
intended as a DSL). Just add another field to the structure you
store in the hash and have it increment a global sequence number when
you create it. A little clunky but later you can say:

hash.values.sort {|a,b| a.seq <=> b.seq}.each {|v|

whatever

}

Hal_F · August 15, 2006, 2:20pm

“Martin DeMello” [email protected] writes:

OrderedHash. (I suppose the last might be best of those.)

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Fun history fact:

#524<5|>lilith:~/src$ grep Dict ruby-0.49/dict.c
…
C_Dict = rb_define_class(“Dict”, C_Object);
rb_name_class(C_Dict, rb_intern(“Hash”)); /* alias */
…

Hal_F · August 15, 2006, 2:20pm

Bil K. wrote:

until another requirement came up: to present the tags
in order of appearance.

Later,

Bil
http://fun3d.larc.nasa.gov

[1] Part of a CEV (Crew Exploration Vehicle) task

See, guys? An ordered hash in Ruby would help support
human space exploration!

/me ponders and RCR and a Dyson sphere design

Hal

Hal_F · August 15, 2006, 2:20pm

Austin Z. wrote:

On 8/14/06, Martin DeMello [email protected] wrote:

OHash or Assoc or even Dict (though the latter at least implies an
O(1) lookup).

Wouldn’t it be the same as Hash lookup? It’s still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

OHash lookup should be as fast as Hash lookup.

Insertion would be slightly slower, and iteration (and to_a) would
be slowed down more.

Hal