Sort_by: multiple fields with reverse sort

Dobai-Pataky_BSSSSl · October 13, 2010, 9:46am

I need to use sort_by to sort a table, since the user could select
columns in any order.

b=[ ["radio", 30, 5], ["radio", 20, 5], ["archie", 20, 5],

[“newton”, 10, 3] ]
b.sort_by{|x| [ x[1], x[0] ]}

Works fine. However, often the user will want a reverse sort on some
field. I do not off-hand know the datatype of the field, but in most
cases it will be a String.

I tried:

b.sort_by{|x| [ !x[1] ]}

This works (one criteria), but this does not:
b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.
Thx, rahul.

sentinel · October 13, 2010, 10:11am

On Wednesday 13 October 2010, Rahul K. wrote:

|
|I tried:
|
| b.sort_by{|x| [ !x[1] ]}
|This works (one criteria), but this does not:
| b.sort_by{|x| [ !x[1], x[0] ]}
|
|Another thread here, suggested using a minus for Numbers but what about
|Strings.
|Thx, rahul.

Are you sure the one criterium version works as you think? !x[1] returns
a
boolean value (true if x[1] is false or nil and false otherwise). So, if
the
contents of the arrays are all strings or numbers, it’ll compare items
by
comparing equal arrays only containing the element false. Most likely,
this
would give you the elements in the same order.

I don’t know how to do what you want using sort_by, but you can do that
using
sort. For example, the following code does a reverse sort on the 1 index
and a
normal sort on the 0 index

b.sort do |a, b|
res = -(a[1] <=> b[1])
res = a[0] <=> b[0] if res == 0
res
end

For more information, look at the documentation for Enumerable#sort and
the
Comparable module (for the <=> operator).

I hope this helps

Stefano

sentinel · October 13, 2010, 11:09am

The one way to do a reverse sort is wrapping a sort target in a class
for a reverse sort.
For instance:

This class is a class for a reverse sort.

An Object of various datatypes can be wrapped in an object of this

class.
class ItemForReverseSort
def initialize( item )
@item = item
end
def item
@item
end
def <=>( target )
( self.item <=> target.item ) * (-1)
end
end

b = [ [“radio”, 30, 5], [“radio”, 20, 5], [“archie”, 20, 5], [“newton”,
10, 3] ]

normal sort

p b.sort_by{|x| [ x[1], x[0] ]}
#=> [[“newton”, 10, 3], [“archie”, 20, 5], [“radio”, 20, 5], [“radio”,
30, 5]]

normal sort for 1st item, and reverse sort for 2nd item

p b.sort_by{|x| [ x[1], ItemForReverseSort.new(x[0]) ]}
#=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5], [“radio”,
30, 5]]

2010/10/13 Rahul K. [email protected]:

sentinel · October 13, 2010, 2:22pm

Y. NOBUOKA wrote in post #949783:

The one way to do a reverse sort is wrapping a sort target in a class
for a reverse sort.
For instance:

Thanks, i will have to consider this.

To the previous reply,

Are you sure the one criterium version works as you think? !x[1] returns
a
boolean value

Frankly, i am not sure. i just tried it on irb.

My issue is that i do not know how many columns a user will sort on, and
what combination. So the normal sort() appears cumbersome, unless i can
dynamically build a string and eval() it. Not sure i know how to do
that.

sort_by allows me to give a list as argument, so that appears suitable.

sentinel · October 13, 2010, 2:53pm

When i sat down to code, i realized that i could not just put the
sort_keys list to sort_by, i would have to unroll the sort_keys array.
So, i might as well use sort. This seems to work in the small case i
have. Could someone comment on this, and suggest cleaner better way to
do. Thanks.

b=[ [“radio”, 30, 5], [“radio”, 20, 5], [“archie”, 20, 5], [“newton”,
10, 3] ]
sort_keys=[1,0]
rev_flag = [true, false]

c=b.sort{|x,y|
res = 0
sort_keys.each_with_index { |e,i|
if rev_flag[i]
res = y[e] <=> x[e]
else
res = x[e] <=> y[e]
end
break if res != 0
}
res
}
c

sentinel · October 13, 2010, 4:00pm

On Oct 13, 2010, at 8:53 AM, Rahul K. wrote:

res
}
c

–
Posted via http://www.ruby-forum.com/.

While it is not strictly equivalent to array_of_strings.sort.reverse,
this little addition to String allows array_of_strings.sort_by{|s|-s}

class String
RAB_REPLACE = (‘a’…‘z’).to_a.reverse.join.freeze
def -@
self.tr(‘a-z’,RAB_REPLACE)
end
end

Then you can do:

b.sort_by {|ary| [ary[1], -ary[0]] }

irb> b.sort_by {|ary| [ary[1], -ary[0]] }
=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5], [“radio”,
30, 5]]

If your strings are always simple words, then this may be enough for
you (or at least give you an idea).

If your array is short (i.e., the difference between sort and sort_by
is acceptable), then you can always do:

irb> b.sort {|e1,e2| (e1[1] <=> e2[1]).nonzero? || e2[0] <=> e2[0] }
=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5], [“radio”,
30, 5]]

( which doesn’t rely on String#-@ )

-Rob

Rob B.
[email protected] http://AgileConsultingLLC.com/
[email protected] http://GaslightSoftware.com/

sentinel · October 14, 2010, 4:22am

On Oct 13, 5:53am, Rahul K. [email protected] wrote:

res}

I definitely feel that sort_by has a compelling advantage over sort
for any large-ish dataset involving keys with nontrivial
transformations from the original object. You obviously want to avoid
doing NlogN transformations when N would suffice.

Does Ruby’s sort_by method guarantee a stable sort?

If it does, then you could do sort_by the first field (reversing as
needed), then sort_by the second field (reversing as needed), then
sort_by the third field (reversing as needed), and so on. Doing M
sort passes on 1 key is no less efficient than doing 1 sort pass on M
keys, except to the extent that the latter maybe short-circuits a few
comparisons. At worst you are talking an M-ish difference that is
mostly dwarfed by NlogN.

Your essential underlying question about representing the “inverse” of
a key is pretty intriguing. I don’t know of any general way to solve
that problem, other than encapsulating it in a class, as has been
suggested.

sentinel · October 14, 2010, 2:47am

On Oct 13, 2010, at 02:08 , Y. NOBUOKA wrote:

def item
@item
end
def <=>( target )
( self.item <=> target.item ) * (-1)
end
end

I like this, but think it is more complex than it needs to be.

module ReverseSort
def <=> target
-super
end
end

class Object
def -@
self.dup.extend ReverseSort
end
end

b = [ [“radio”, 30, 5], [“radio”, 20, 5], [“archie”, 20, 5], [“newton”,
10, 3] ]

p b.sort_by{|x| [ x[1], -x[0] ]}

=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5], [“radio”,

30, 5]]

sentinel · October 14, 2010, 4:56am

On Oct 13, 12:46am, Rahul K. [email protected] wrote:

I tried:

b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.
Thx, rahul.

It would be nice if Ruby supported a sort_by on steroids.

sorted_list = list.multi_field_sort_by(
{ |x| x.department.name },
{ |x| x.position.name },
desc { |x| x.level },
desc { |x| x.salary ? x.salary : x.rate * db_lookup(x,
‘hours_worked’) }
)

I believe you could write a decent multi_field_sort_by in Ruby that
would be efficient for large enough lists to outperform more tedious
approaches, but it would be even better if Ruby natively supported it.

My proposed syntax might be slightly off, but you get the idea. You’d
pass a list of blocks that represent the successive tiebreakers, and
multi_field_sort_by would presumably cache the results from each
transformation, evaluating the blocks only as necessary. The “desc”
thingy would actually produce some kind of wrapper that
multi_field_sort_by could introspect to know that it needs to apply a
particular tiebreaker in reverse order.

sentinel · October 14, 2010, 4:10pm

On Oct 13, 10:50pm, Jeremy B. [email protected] wrote:

desc { |x| x.salary ? x.salary : x.rate * db_lookup(x,
transformation, evaluating the blocks only as necessary. The “desc”

This is a general solution which is relatively inefficient than

end
break res unless res == 0

Sort over the second and then first columns each in normal order.

p a.sort_by(&key)

=> [[“newton”, 10, 3], [“archie”, 20, 5], [“radio”, 20, 5],

[“radio”, 30, 5]]

Sort over the second column in normal order and then the first

column in reverse order.

p a.sort_by(cmp, &key)

=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5],

[“radio”, 30, 5]]

I think it’s a good start, but it still requires you to write a
somewhat customized cmp function. A truly optimized multi-field sort
might also be able to avoid calculating all elements of the key as
well, since they are often only needed as tiebreakers.

sentinel · October 14, 2010, 5:06pm

On Oct 13, 12:46am, Rahul K. [email protected] wrote:

I tried:

b.sort_by{|x| [ !x[1] ]}
This works (one criteria), but this does not:
b.sort_by{|x| [ !x[1], x[0] ]}

Another thread here, suggested using a minus for Numbers but what about
Strings.

Here is a solution that allows you to lazily evaluate keys and specify
that certain keys are to be reversed.

module Enumerable
def sort_by(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
fields = elem[0]
key = elem[1]
if fields.size <= i
fields[i] = key_methods[i][0].call(key)
end
end
if key_methods[i][1] == :desc
a, b = b, a
end
result = (a[0][i] <=> b[0][i])
return result unless result == 0
i += 1
end
return result
end

  self.collect do |item|
    [ [], item ]
  end.sort do |a, b|
    compare(a, b, key_methods)
  end.collect do |kv|
    kv[1]
  end
end

end

a = [
[“radio”, 30, 5],
[“radio”, 40, 5],
[“radio”, 20, 5],
[“archie”, 20, 5],
[“newton”, 10, 3]
]

puts a.sort_by(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

sentinel · October 14, 2010, 7:58am

On 10/13/2010 09:50 PM, Steve H. wrote:

I believe you could write a decent multi_field_sort_by in Ruby that
would be efficient for large enough lists to outperform more tedious
approaches, but it would be even better if Ruby natively supported it.

My proposed syntax might be slightly off, but you get the idea. You’d
pass a list of blocks that represent the successive tiebreakers, and
multi_field_sort_by would presumably cache the results from each
transformation, evaluating the blocks only as necessary. The “desc”
thingy would actually produce some kind of wrapper that
multi_field_sort_by could introspect to know that it needs to apply a
particular tiebreaker in reverse order.

How about something like this:

module Enumerable

sort_by will take a key generator and an optional comparator

and perform a Schwartzian Transform over the data.

This is a general solution which is relatively inefficient than

a purpose-built sorting function if the keys are trivial to

generate.

def sort_by(cmp = lambda { |a, b| a <=> b }, &key)
collect do |item| # Generate keys from the list items.
[key[item], item]
end.sort do |a, b| # Sort the keys.
cmp[a[0], b[0]]
end.collect do |kv| # Return the items in key sort order.
kv[1]
end
end
end

This will cause the sort to operate over the second and then

the first column.

key = lambda { |item| [item[1], item[0]] }

This will cause the sort to operate normally on the second

column and in reverse on the first column.

cmp = lambda do |a, b|
res = a[0] <=> b[0]
break res unless res == 0
b[1] <=> a[1]
end

Sample data from Ryan’s post.

a = [
[“radio”, 30, 5],
[“radio”, 20, 5],
[“archie”, 20, 5],
[“newton”, 10, 3]
]

Sort over the second and then first columns each in normal order.

p a.sort_by(&key)

=> [[“newton”, 10, 3], [“archie”, 20, 5], [“radio”, 20, 5],

[“radio”, 30, 5]]

Sort over the second column in normal order and then the first

column in reverse order.

p a.sort_by(cmp, &key)

=> [[“newton”, 10, 3], [“radio”, 20, 5], [“archie”, 20, 5],

[“radio”, 30, 5]]

sentinel · October 14, 2010, 5:36pm

On Oct 14, 8:00am, Steve H. [email protected] wrote:

Another thread here, suggested using a minus for Numbers but what about
def compare(a,b, key_methods)
a, b = b, a
end.sort do |a, b|
[“radio”, 20, 5],
[“archie”, 20, 5],
[“newton”, 10, 3]
]

puts a.sort_by(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

Oops, I forgot to verify that I was actually caching results. Here is
a fixed version.

module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[i][0].call(key)
puts " #{elem.inspect}"
end
end
if key_methods[i][1] == :desc
a, b = b, a
end
result = (a[0][i] <=> b[0][i])
return result unless result == 0
i += 1
end
return result
end

  self.collect do |item|
    [ [], item ]
  end.sort do |a, b|
    compare(a, b, key_methods)
  end.collect do |kv|
    kv[1]
  end
end

end

a = [
[“radio”, 30, 5],
[“radio”, 40, 5],
[“radio”, 20, 5],
[“archie”, 20, 5],
[“newton”, 10, 3]
]

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

You can see that it only evaluates the second key for tiebreaker
purposes. (Of course, in this example, the key calculation is
trivial, but in the real world you might have a key that is expensive
to evaluate.)

[[“radio”], [“radio”, 30, 5]]
[[“radio”], [“radio”, 20, 5]]
[[“radio”, 30], [“radio”, 30, 5]]
[[“radio”, 20], [“radio”, 20, 5]]
[[“newton”], [“newton”, 10, 3]]
[[“radio”], [“radio”, 40, 5]]
[[“radio”, 40], [“radio”, 40, 5]]
[[“archie”], [“archie”, 20, 5]]
[[“archie”, 20, 5], [“newton”, 10, 3], [“radio”, 40, 5], [“radio”, 30,
5], [“radio”, 20, 5]]

sentinel · October 14, 2010, 10:26pm

On Oct 14, 2010, at 08:35 , Steve H. wrote:

module Enumerable
def sort_by_multi(*key_methods)
def compare(a,b, key_methods)

This doesn’t do what you think (or at least imply that) it does.

sentinel · October 15, 2010, 3:18am

On Oct 14, 1:25pm, Ryan D. [email protected] wrote:

On Oct 14, 2010, at 08:35 , Steve H. wrote:

module Enumerable
def sort_by_multi(*key_methods)
def compare(a,b, key_methods)

This doesn’t do what you think (or at least imply that) it does.

Care to explain?

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

$ ruby foo.rb

[[“archie”, 20, 5], [“newton”, 10, 3], [“radio”, 40, 5], [“radio”,
30, 5], [“radio”, 20, 5]]

$ cat foo.rb
module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[i][0].call(key)
end
end
if key_methods[i][1] == :desc
a, b = b, a
end
result = (a[0][i] <=> b[0][i])
return result unless result == 0
i += 1
end
return result
end

  self.collect do |item|
    [ [], item ]
  end.sort do |a, b|
    compare(a, b, key_methods)
  end.collect do |kv|
    kv[1]
  end
end

end

a = [
[“radio”, 30, 5],
[“radio”, 40, 5],
[“radio”, 20, 5],
[“archie”, 20, 5],
[“newton”, 10, 3]
]

puts a.sort_by_multi(
[Proc.new { |a| a[0] }],
[Proc.new { |a| a[1] }, :desc]
).inspect

sentinel · October 15, 2010, 7:41am

On Oct 14, 1:25pm, Ryan D. [email protected] wrote:

On Oct 14, 2010, at 08:35 , Steve H. wrote:

module Enumerable
def sort_by_multi(*key_methods)
def compare(a,b, key_methods)

This doesn’t do what you think (or at least imply that) it does.

Ok, I found and fixed a bug (*) and reworked the example to
(hopefully) make the implied intent more clear.

rows = [
{:name => “al”, :salary => 40},
{:name => ‘charlie’, :salary => 100},
{:name => “al”, :salary => 50},
{:name => ‘diane’, :salary => 1000, :commission => 40},
{:name => ‘diane’, :salary => 1000, :commission => 30},
{:name => ‘ed’, :salary => 20},
]

def do_sql_like_sort(rows)
def asc(field)
[Proc.new { |row| row[field] }, :asc ]
end
def desc(field)
[Proc.new { |row| row[field] }, :desc ]
end
# The semantics for sort_by_multi are similar to
# SQL.
rows.sort_by_multi(
asc(:name),
desc(:salary),
desc(:commission)
)
end

expected_result = [
{:name=>“al”, :salary=>50},
{:name=>“al”, :salary=>40},
{:name=>“charlie”, :salary=>100},
{:name=>“diane”, :salary=>1000, :commission => 40},
{:name=>“diane”, :salary=>1000, :commission => 30},
{:name=>“ed”, :salary=>20},
]

module Enumerable
def sort_by_multi(*key_methods)
# allow for multiple key_methods and only
# evaluate them when they are truly needed
# for the sort
def compare(a,b, key_methods)
i = 0
while i < key_methods.size do
for elem in [a, b] do
key = elem[1]
if elem[0].size <= i
elem[0] << key_methods[i][0].call(key)
end
end
x, y = (key_methods[i][1] == :desc) ? [b, a] : [a, b]
result = (x[0][i] <=> y[0][i])
return result unless result == 0
i += 1
end
return result
end

  self.collect do |item|
    [ [], item ]
  end.sort do |a, b|
    compare(a, b, key_methods)
  end.collect do |kv|
    kv[1]
  end
end

end

sorted_rows = do_sql_like_sort(rows)
if sorted_rows != expected_result
raise ‘fail’
end

- The previous iteration of this code was swapping a and b instead
  of making copies before swapping. This broke the case where two of
  the fields were to be sorted in descending order.