Elegant expression of sorting

dubstep · January 19, 2012, 1:59am

Hi all,

I’ve found that for the current project I am working on that I am
frequently doing sorts based on multiple keys. The comparisons on each
key tend to vary, ie. sometimes I want descending rather than ascending,
and other times the comparison itself is complex.

I usually do something like this:

foo.sort! {|a,b|

Ascending, primary.

rv = a.key0 <=> b.key0

Descending, secondary.

rv = b.key1 <=> a.key1 if rv == 0

A tricky and expensive comparison, tertiary.

rv = a.magic(b) if rv == 0
rv
}

Does anyone have any suggestions as to a more elegant way to express
this? I’m after something that is logically equivalent, but easier to
write and clearer to read.

Garth

Garthy_D · January 19, 2012, 7:06am

On Thu, Jan 19, 2012 at 8:58 AM, Garthy D
[email protected] wrote:

Ascending, primary.

clearer to read.

Garth

A general way to perform a multi-key sort is:

assuming sort() is a stable sort algorithm, e.g. merge sort

foo.sort! {|a,b|
# starting from the least important key
a.magic(b)
}
foo.sort! {|a,b|
# sort again using the second-least important key
b.key1 <=> a.key1
}

repeat until you are done with the most important key

I guess you can generalize this using an array of Lambdas.

One potential disadvantage of this for your particular case is that,
a.magic(b) is always performed even if a.key0 != b.key0. As you have
stated, this is quite expensive, and can make the above general scheme
much more expensive than your way.

Garthy_D · January 19, 2012, 8:24am

On Thu, Jan 19, 2012 at 8:58 AM, Garthy D
[email protected] wrote:

easier to write and clearer to read.
Garth

try,

foo.sort{|a,b|
((rv=a.key0<=>b.key0)==0) &&
((rv=b.key1<=>a.key1)==0) &&
(rv=a.magic(b))
rv
end

kind regards -botp

Garthy_D · January 19, 2012, 10:09am

Hi Yong Li,

On 19/01/12 16:35, Yong Li wrote:

Does anyone have any suggestions as to a more elegant way to express this?
a.magic(b)
a.magic(b) is always performed even if a.key0 != b.key0. As you have
stated, this is quite expensive, and can make the above general scheme
much more expensive than your way.

Cool- thanks for that. I won’t be able to use it directly as the last
key sort for most of the things I am doing is typically expensive, but
I’m still quite interested in different possible approaches. Thankyou.

Garth

Garthy_D · January 19, 2012, 10:17am

Hi botp,

On 19/01/12 17:53, botp wrote:

((rv=b.key1<=>a.key1)==0)&&
(rv=a.magic(b))
rv
end

kind regards -botp

Very nice use of short-circuiting. It made me think of Bourne shell
scripting.

Garth

Garthy_D · January 19, 2012, 10:22am

On Thu, Jan 19, 2012 at 10:09 AM, Garthy D
[email protected] wrote:

vary, ie. sometimes I want descending rather than ascending, and other

A tricky and expensive comparison, tertiary.

rv = a.magic(b) if rv == 0
rv
}

Does anyone have any suggestions as to a more elegant way to express
this?
I’m after something that is logically equivalent, but easier to write and
clearer to read.

repeat until you are done with the most important key

I guess you can generalize this using an array of Lambdas.
One potential disadvantage of this for your particular case is that,
a.magic(b) is always performed even if a.key0 != b.key0. As you have
stated, this is quite expensive, and can make the above general scheme
much more expensive than your way.

This is totally inefficient because it goes through the original data
set (which might be large) multiple times. The complexity of #magic
only adds to this.

Cool- thanks for that. I won’t be able to use it directly as the last key
sort for most of the things I am doing is typically expensive, but I’m still
quite interested in different possible approaches. Thankyou.

Please don’t, there are much better approaches (see botp’s for
example). Here are more

foo.sort! {|a,b|

Ascending, primary.

(a.key0 <=> b.key0).nonzer0? ||

Descending, secondary.

(b.key1 <=> a.key1).nonzero? ||

A tricky and expensive comparison, tertiary.

a.magic(b)
}

foo.sort_by {|a|
[

Ascending, primary.

a.key0,

Descending, secondary, only if numeric.

-a.key1,

A tricky and expensive comparison, tertiary.

a.create_magic_key
]
}

All these approaches can also be stored in a lambda and used from there

YourClass::SORT_FOO = lambda {|a| [a.key0, -a.key1, a.create_magic_key]}
YourClass::SORT_BAR = lambda {|a,b| (a.key0 <=> b.key0).nonzero? || …
}

enum.sort_by &YourClass::SORT_FOO
enum.sort &YourClass::SORT_BAR

Kind regards

robert

Garthy_D · January 19, 2012, 11:50am

Hi Robert,

On 19/01/12 19:52, Robert K. wrote:

frequently

Descending, secondary.

repeat until you are done with the most important key

I guess you can generalize this using an array of Lambdas.
One potential disadvantage of this for your particular case is that,
a.magic(b) is always performed even if a.key0 != b.key0. As you have
stated, this is quite expensive, and can make the above general scheme
much more expensive than your way.

This is totally inefficient because it goes through the original data
set (which might be large) multiple times. The complexity of #magic
only adds to this.

Very true.

Cool- thanks for that. I won’t be able to use it directly as the last key
sort for most of the things I am doing is typically expensive, but I’m still
quite interested in different possible approaches. Thankyou.

Please don’t, there are much better approaches (see botp’s for
example). Here are more

Yes, it’s not directly suitable for what I’m doing but the different
approach was interesting- there are cases where I might be able to have
the input come in already sorted against one of the keys without cost.
It got me thinking, anyway.

foo.sort! {|a,b|

Ascending, primary.

(a.key0<=> b.key0).nonzer0? ||

Descending, secondary.

(b.key1<=> a.key1).nonzero? ||

A tricky and expensive comparison, tertiary.

a.magic(b)
}

Very, very neat; and nice find on “nonzero?”. My first thought was
that “nonzero?” surely returns a boolean and this would break it- but it
looks like “nonzero?” with numeric return was designed for this exact
problem.

I was wondering if there was a nice way to drop the “rv” from my
solution- this looks like the best way.

foo.sort_by {|a|
[

Ascending, primary.

a.key0,

Descending, secondary, only if numeric.

-a.key1,
# A tricky and expensive comparison, tertiary.
a.create_magic_key
]
}

I’d stumbled on sort_by initially, I suspect I’ll use it when I can
precompute the comparisons easily (this is sometimes the case, sometimes
not).

All these approaches can also be stored in a lambda and used from there

YourClass::SORT_FOO = lambda {|a| [a.key0, -a.key1, a.create_magic_key]}
YourClass::SORT_BAR = lambda {|a,b| (a.key0<=> b.key0).nonzero? || … }

enum.sort_by&YourClass::SORT_FOO
enum.sort&YourClass::SORT_BAR

Excellent stuff.

I’m not at all familiar with the “enum.sort_by&YourClass::SORT_FOO”
syntax at this point, so I might need to read up on that. I can at least
partly guess from the context though.

Thankyou for your very detailed input on this one, excellent stuff!

Garth

Garthy_D · January 20, 2012, 3:15am

On Thu, Jan 19, 2012 at 5:22 PM, Robert K.
[email protected] wrote:

On Thu, Jan 19, 2012 at 10:09 AM, Garthy D
[email protected] wrote:

On 19/01/12 16:35, Yong Li wrote:

repeat until you are done with the most important key

example). Here are more
Wow, big bow to Robert, I really like reading your posts on this
mailing list. They always teach me something.

I came across this ‘generic’ multi-key sorting algorithm from a
textbook. It looked interesting to me because I was wondering why I
had never seen it in codes. Now, it seems obvious why I had never seen
it - it is inefficient! lucky that I never used this algorithm in real
life. wheww…

Garthy_D · January 20, 2012, 7:07am

Hi Robert,

On 19/01/12 23:26, Robert K. wrote:

I’d stumbled on sort_by initially, I suspect I’ll use it when I can
precompute the comparisons easily (this is sometimes the case, sometimes
not).

Note, with #sort_by you do not precompute comparisons but you compute
a comparison key! In the example it is an Array because Array#<=>
uses fields from the beginning to the end, i.e. sorts according to
fields with decreasing precedence (first pos 0, if that comparison
returns 0 field pos 1 etc.).

I just reread my previous response and my explanation was really poor-
sorry about that. Thankyou for the clarification, it matches my
understanding, and expresses it far better than I did.

irb(main):002:0> a.map {|x| x.to_s}
=> [“0”, “1”, “2”, “3”, “4”]
irb(main):003:0> a.map(&:to_s)
=> [“0”, “1”, “2”, “3”, “4”]

Excellent, thankyou. The additional example solidified my
understanding of it too. Thankyou for taking the additional time to
explain this, much appreciated.

Garth

Garthy_D · January 19, 2012, 1:57pm

On Thu, Jan 19, 2012 at 11:49 AM, Garthy D
[email protected] wrote:

On 19/01/12 19:52, Robert K. wrote:

Very, very neat; and nice find on “nonzero?”. My first thought was that
“nonzero?” surely returns a boolean and this would break it- but it looks
like “nonzero?” with numeric return was designed for this exact problem.

Right.

A tricky and expensive comparison, tertiary.

a.create_magic_key
]
}

I’d stumbled on sort_by initially, I suspect I’ll use it when I can
precompute the comparisons easily (this is sometimes the case, sometimes
not).

Note, with #sort_by you do not precompute comparisons but you compute
a comparison key! In the example it is an Array because Array#<=>
uses fields from the beginning to the end, i.e. sorts according to
fields with decreasing precedence (first pos 0, if that comparison
returns 0 field pos 1 etc.).

I’m not at all familiar with the “enum.sort_by&YourClass::SORT_FOO” syntax
at this point, so I might need to read up on that. I can at least partly
guess from the context though.

It just passes the lambda as a block as if it the block was passed
directly. Note, that technically behind the scenes the syntax foo(&x)
invokes x.to_proc which is also provided by Symbol:

irb(main):001:0> a = 5.times.to_a
=> [0, 1, 2, 3, 4]
irb(main):002:0> a.map {|x| x.to_s}
=> [“0”, “1”, “2”, “3”, “4”]
irb(main):003:0> a.map(&:to_s)
=> [“0”, “1”, “2”, “3”, “4”]

Thankyou for your very detailed input on this one, excellent stuff!

You’re welcome!

Kind regards

robert