A use case for an ordered hash

There have been numerous occasions when I wanted an
ordered hash, but usually I can’t remember to write
them down.

Here’s just one.

Once I wanted a “dynamic case statement” of sorts.
I wanted to use procs as values in a hash. Something
like:

actions = { /abcd/  => lambda { do_this },
            /xyz/   => lambda { do_that },
            /abc/   => lambda { other }}

Then I could just iterate through the keys looking
for a match, then use that key to find the associated
proc and call it.

However, I quickly noticed that this is no good. The
order of iteration is unpredictable. So I couldn’t
guarantee that (for example) /abcd/ would be tested
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.

Hal

Hal F. wrote:

actions = { /abcd/ => lambda { do_this },
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.

Hal

That sort of code is what Chuck Moore evolved into Forth. :slight_smile: Seriously,
though, isn’t there a way to do this with a single array? Store the
patterns followed by the lambdas in sequence. Find the first index that
matches the pattern and take the lambda at [index+1]? Or two parallel
arrays? I use two parallel arrays for this sort of thing a lot – it’s
easy for me to read, given my FORTRAN background. :slight_smile:

M. Edward (Ed) Borasky wrote:

That sort of code is what Chuck Moore evolved into Forth. :slight_smile: Seriously,
though, isn’t there a way to do this with a single array? Store the
patterns followed by the lambdas in sequence. Find the first index that
matches the pattern and take the lambda at [index+1]? Or two parallel
arrays? I use two parallel arrays for this sort of thing a lot – it’s
easy for me to read, given my FORTRAN background. :slight_smile:

Yes, I could use a single flat array… I could also program
in FORTRAN… :wink:

Hal

Francis C. wrote:

actions[k].call
break
end
}

A more-greedy regex sorts lexicographically behind a head-matching
less-greedy one.

Clever workaround. Thanks for that.

There are all kinds of workarounds for situations like these.

I don’t need an ordered hash. I don’t even need a case
statement, as I could do the same thing with a bunch of ifs.
And so on.

Hal

Hal F. wrote:

actions = { /abcd/ => lambda { do_this },
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.

To me a Hash feels wrong. Why? Because you don’t make any use of hash
properties for fast lookup. You just iterate in plain order.

Kind regards

robert

Absolutely… an Array is perfect for ordered information.

actions = [
{ /abcd/ => lambda { do_this } },
{ /xyz/ => lambda { do_that } },
{ /abc/ => lambda { other } }
]

Then, all you do is…

actions.select { |action| action[input] }

where input = /xyz/ you will receive the appropriate hash (key and
Proc) in an array. So, to do something with that, you could…

actions.select { |action| action[input] }.first[input].call

which calls the Proc associated with the matching element.

I’m not too thrilled with its convolutedness, and I’m sure there’s a
better way, but I’m pretty sleepy, so this is the best I’m coming up
with right now. :slight_smile: As always, you can write a method to hide the gory
details.

def exec_route input
actions.select { |action| action[input] }.first[input].call
end

and just call:

exec_route /xyz/

and blam! do_that is executed!

Hope this helps.

M.T.

On 8/13/06, Hal F. [email protected] wrote:

actions = { /abcd/  => lambda { do_this },
            /xyz/   => lambda { do_that },
            /abc/   => lambda { other }}

actions.keys.sort {|a,b| b.to_s <=> a.to_s}.each {|k|
if k =~ your_string
actions[k].call
break
end
}

A more-greedy regex sorts lexicographically behind a head-matching
less-greedy one.

On Sun, Aug 13, 2006 at 06:31:07PM +0900, Matt T. wrote:

actions.select { |action| action[input] }

where input = /xyz/ you will receive the appropriate hash (key and
Proc) in an array. So, to do something with that, you could…

actions.select { |action| action[input] }.first[input].call

which calls the Proc associated with the matching element.

That’s not what he wants though (the point is matching against the
Regexp —
Hash#[] doesn’t buy you anything there).

Maybe

require ‘enumerator’
def ASSOC(x); x.enum_for(:each_slice, 2).to_a end

actions = ASSOC [
/abcd/, lambda{ 1 },
/xyz/, lambda{ 2 },
/abc/, lambda{ 3 }
]

Then he’s got to

def run(command, actions)
actions.each{|re, action| return action[command] if re =~ command}
nil
end

run(“abcdef”, actions) # => 1
run(“abcf”, actions) # => 3

But this would be nicer

class Array
def cassoc(x) # “case assoc”
each{|arr| return arr if arr[0] === x}
nil
end
end

re, action = actions.cassoc(“abcdef”)
action and action.call # => 1
re, action = actions.cassoc(“abcx”)
action and action.call # => 3

On Aug 13, 2006, at 3:25 AM, Robert K. wrote:

           /abc/   => lambda { other }}

To me a Hash feels wrong. Why? Because you don’t make any use of
hash properties for fast lookup. You just iterate in plain order.

He is making use of a Hash property. He wants the keys to be
unique. None of the other solutions shown in this thread have
addressed that.

James Edward G. II

On Mon, Aug 14, 2006 at 12:27:07AM +0900, James Edward G. II wrote:

On Aug 13, 2006, at 3:25 AM, Robert K. wrote:

Hal F. wrote:

Once I wanted a “dynamic case statement” of sorts.
I wanted to use procs as values in a hash. Something
like:
actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}
[…]

To me a Hash feels wrong. Why? Because you don’t make any use of
hash properties for fast lookup. You just iterate in plain order.

He is making use of a Hash property. He wants the keys to be
unique. None of the other solutions shown in this thread have
addressed that.

Hal is making the case for literal “ordered hashes”. The keys will be
unique because he’s writing them himself manually. I don’t think he
wants to
make use of this:

actions = { /a/ => 1, # <–
/b/ => 2, #
/a/ => 4 # <–±- why do this in the first place.
}
actions[/a/] # => 4

The way he uses the structure is reminiscent of Array#assoc. The only
thing
a Hash has going for it in this case is the cleaner notation and the
arrow.
But
actions = ASSOC [/abcd/, lambda{ … },
/abc/, lambda{ … } ]
(using parentheses if you want) looks good enough for me, and something
like
the Array#cassoc I showed before would do fine in this case.

Robert K. wrote:

So yeah, I ended up using an array of arrays. But it
just felt wrong.

To me a Hash feels wrong. Why? Because you don’t make any use of hash
properties for fast lookup. You just iterate in plain order.

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Hal

Mauricio F. wrote:

action and action.call # => 1
re, action = actions.cassoc(“abcx”)
action and action.call # => 3

Now, that’s clever. That’s probably the least painful
solution I’ve seen using arrays.

I still wish we had a so-called ordered hash. :wink:

Hal

On Aug 13, 2006, at 5:03 AM, Mauricio F. wrote:

But this would be nicer

class Array
def cassoc(x) # “case assoc”
each{|arr| return arr if arr[0] === x}
nil
end
end

You could use find() there, right?

class Array
def cassoc(match)
find { |arr| arr.first === match }
end
end

James Edward G. II

Hal F. wrote:

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Um… So you’re saying that you wouldn’t bother if lookups were O(n)?
Wow!

Kind regards

robert

Tom W. wrote:

http://raa.ruby-lang.org/project/orderedhash/

I knew we were reaching that point in the fugue
again. :wink:

I wrote something similar to this six years ago.

This is not a solution for me because there is no
convenient notation for literals.

Hal

Robert K. wrote:

Hal F. wrote:

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Um… So you’re saying that you wouldn’t bother if lookups were O(n)?
Wow!

I think you mean “wouldn’t care”?

Anyway: I’m happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

If I were storing 20 million elements, I wouldn’t use a Hash
either. I’d either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

I’m a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I’d have said it was faster than O(n), of course.)

Hal

Hal F. wrote:

I still wish we had a so-called ordered hash. :wink:

Hal

We do:

http://raa.ruby-lang.org/project/orderedhash/

Tom

Hal F. wrote:

I think you mean “wouldn’t care”?
Probably. I’m not a native speaker so I’m certainly not authoritative
on this matter. :slight_smile:

Anyway: I’m happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

That probably also depends on the frequency of lookups. :slight_smile: But yes, my
hashes are typically larger (although I also use small ones).

If I were storing 20 million elements, I wouldn’t use a Hash
either. I’d either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

I’m a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I’d have said it was faster than O(n), of course.)

Interesting.

GoOd (n)ight

robert

:wink:

On 8/14/06, Hal F. [email protected] wrote:

I think you mean “wouldn’t care”?

Anyway: I’m happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob’s statistics.
Sometimes you just don’t want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person’s benefit would kill the speed and jack up memory usage.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

Something like this would probably sit better, and is still shorter
than the PHP equiv.:

a = [[‘key 1’,‘1’],[‘key 2’, ‘2’],[‘key 3’,3]].toAssoc

a[‘key 1’]

=> 1

p a

=> [[‘key 1’,‘1’],[‘key 2’, ‘2’],[‘key 3’,3]]

On Sun, 13 Aug 2006, Hal F. wrote:

There have been numerous occasions when I wanted an
ordered hash, but usually I can’t remember to write
them down.

I’m in the habit of automagically writing…

hash.keys.sort

or sometimes

hash.keys.sort_by{|k| hash[k]}.each{|k| value=hash[k]; … }

It’s just such a common idiom in my code I think I should push it into
my little collection of standard extensions to standard objects.

Why is it so common? Nobody wants a script that does very things in a
very different order depending on very small changes.

  • You just spend too long explaining to people why this happens.
  • Makes sporadic bugs even harder to find (it works for me, but customer
    is whinging)
  • The results tend not to be comparable (can’t diff) or repeatable.

John C. Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : [email protected]
New Zealand

Carter’s Clarification of Murphy’s Law.

“Things only ever go right so that they may go more spectacularly wrong
later.”

From this principle, all of life and physics may be deduced.