Complex sort of matrix possible, e.g. like Excel?


#1

Hi,

I’ve got an array of rows (and thus a matrix) created using FasterCSV
to extract data from a CSV file. I’d like to sort the matrix on
column A ascending and, within that, column B descending. I looked at
Matrix, but it doesn’t seem to address that functionality. Is there
a package that does, or do I have to write my own SuperMatrix
inherited from Matrix?

Thanks in Advance,
Richard


#2

Enumerable#sort lets you do this fairly easily with just a pure array
of arrays, e.g., to sort the array-of-arrays “arr” by the first column
ascending and the second descending:

arr.sort {|a,b| [a[0]<=>b[0], b[1]<=>a[1]].find {|x| x!=0} || 0}


#3

2009/3/18 Christopher D. removed_email_address@domain.invalid:

Enumerable#sort lets you do this fairly easily with just a pure array
of arrays, e.g., to sort the array-of-arrays “arr” by the first column
ascending and the second descending:

arr.sort {|a,b| [a[0]<=>b[0], b[1]<=>a[1]].find {|x| x!=0} || 0}

Alternatively, with fewer intermediate Arrays and fewer comparison
operations:

arr.sort do |a,b|
  c = a[0]<=>b[0]
  c == 0 ? b[1]<=>a[1] : c
end

You can as well do

arr.sort_by {|a,b| [a[0]<=>b[0], b[1]<=>a[1]]}

Cheers

robert


#4

On Mar 18, 1:58 am, Christopher D. removed_email_address@domain.invalid wrote:

Hey Christopher,

That’s perfect! I had faith that the community had dealt with this
issue.

Best wishes,
Richard


#5

On Mar 18, 6:16 am, Robert K. removed_email_address@domain.invalid wrote:

Hi Robert,

As usual, you’ve got the perfect answer. Thank you very much.

Best wishes,
Richard


#6

On Mar 18, 2009, at 6:16 AM, Robert K. wrote:

You can as well do

arr.sort_by {|a,b| [a[0]<=>b[0], b[1]<=>a[1]]}

This doesn’t seem quite right to me. Shouldn’t it be:

arr.sort_by { |item| [item[0], item[1]] }

sort_by will use Array#<=> to compare the two-element arrays,
and Array#<=> simply uses <=> on each pair of elements in turn.
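
A minimal sketch of that behaviour, using made-up rows (the data and variable names are illustrative, not from the thread). Note that this key sorts both columns ascending:

```ruby
# Array#<=> compares element by element and stops at the first difference.
tie_on_first = [1, "b"] <=> [1, "a"]   # first elements tie, so "b" <=> "a" decides
differ_first = [0, "z"] <=> [1, "a"]   # first elements differ; second is never consulted
p tie_on_first   # => 1
p differ_first   # => -1

# Hypothetical sample rows.
rows = [[2, "x"], [1, "z"], [1, "y"]]
sorted = rows.sort_by { |item| [item[0], item[1]] }
p sorted         # => [[1, "y"], [1, "z"], [2, "x"]]
```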

Gary W.


#7

2009/3/18 Gary W. removed_email_address@domain.invalid:

On Mar 18, 2009, at 6:16 AM, Robert K. wrote:

You can as well do

arr.sort_by {|a,b| [a[0]<=>b[0], b[1]<=>a[1]]}

This doesn’t seem quite right to me. Shouldn’t it be:

arr.sort_by { |item| [item[0], item[1]] }

Yes, you’re right. Copy & paste error. However, your solution is not
fully correct either because it does not take into consideration that
order of the second column should be reversed. So you’d have to do

arr.sort_by { |item| [item[0], -item[1]] }
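
As a quick illustration with made-up numeric rows (this relies on the second column being numeric, so that unary minus reverses its order):

```ruby
# Hypothetical numeric rows; column 0 ascending, column 1 descending.
rows = [[1, 10], [2, 5], [1, 20]]

# Negating the second key reverses its order within equal first keys.
sorted = rows.sort_by { |item| [item[0], -item[1]] }
p sorted  # => [[1, 20], [1, 10], [2, 5]]
```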

Thanks for the heads up!

robert


#8

On Mar 18, 2009, at 1:58 AM, Christopher D. wrote:

Enumerable#sort lets you do this fairly easily with just a pure array
of arrays, e.g., to sort the array-of-arrays “arr” by the first column
ascending and the second descending:

arr.sort {|a,b| [a[0]<=>b[0], b[1]<=>a[1]].find {|x| x!=0} || 0}

Not to detract too much from the other responses, but this ought to be:

arr.sort {|a,b| (a[0] <=> b[0]).nonzero? || b[1] <=> a[1] }

Take a look at what Numeric#nonzero? does. The docs specifically
mention its use when chaining comparisons this way.

Doing arr.sort_by {|a| [a[0], -a[1]] } only works if the second
element responds to -@ (like any Numeric would, but certainly not
String).
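
A sketch of the difference, using made-up rows with text in both columns. One caveat that is an addition here rather than something from the thread: Ruby 2.3 and later do define String#-@, but it returns a frozen copy of the string rather than anything that sorts in reverse. So on a current Ruby the sort_by form should no longer raise; it just quietly sorts the second column ascending.

```ruby
# Hypothetical rows; both columns are Strings.
rows = [["a", "x"], ["b", "y"], ["a", "z"]]

# Comparison-block form: only needs <=>, which String provides.
by_block = rows.sort { |a, b| (a[0] <=> b[0]).nonzero? || b[1] <=> a[1] }
p by_block  # => [["a", "z"], ["a", "x"], ["b", "y"]]

# Key form with unary minus: assumes Ruby >= 2.3, where -"x" is just a
# frozen "x", so the second column ends up ascending, not descending.
by_key = rows.sort_by { |r| [r[0], -r[1]] }
p by_key
```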

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid


#9

On Mar 18, 12:31 pm, Rob B. removed_email_address@domain.invalid
wrote:

arr.sort {|a,b| (a[0] <=> b[0]).nonzero? || b[1] <=> a[1] }

Hi Rob,

Thanks for your response. I don’t want to be an expert on sorting
matrices. I just want to get my project working. (Don’t we all? :-)

Here’s the essence of what I’ve got working, confirmed with debugging
puts’.

matrix = []
FasterCSV.foreach(selectedCsvFile, :headers => false) do |row|
  matrix << row
end

I want (in Excel terms) the matrix sorted on column B asc. and within
that col. I asc. Both columns are textual. Based on your guidance, I
added the line:

sortedMatrix = matrix.sort {|a,b| [a[1]<=>b[1], a[8] <=> b[8]]}

Ruby gave me a syntax error:
ProcessExports.rb:130:in `sort': undefined method `>' for [-1, 1]:Array (NoMethodError)

I’m hoping the problem is that I’m invoking Array#sort rather than
Enumerable#sort, but nothing my deteriorating brain could devise
worked. Any ideas?

Best wishes,
Richard


#10

On Mar 19, 5:18 pm, Rob B. removed_email_address@domain.invalid
wrote:

Rob,

Thank you very much for hanging in there with me until the fog lifted.
I thought about your advice and managed to mangle it. If I hadn’t
been such a “wise guy”, I could have translated your original advice
as you did and been on my way.

Regarding the asc/desc on the I col., I changed my mind midway and
decided my app requires asc. on I.

But I have a final question: I thought your expression had a flaw
because, in my view, “(a[1]<=>b[1]).nonzero?”, if true, should cause
the block to return true rather than plus or minus.

So I concocted:
sortedMatrix = matrix.sort {|a,b| (v = a[1]<=>b[1]).nonzero? ? v : (a[8] <=> b[8]) }
which works.

But yours works, too. So my understanding of how block evaluation
works when invoked by a calling function is flawed.

Do you have a simple explanation of the flaw in my thinking, or can
you point me to a relevant online tutorial on this? But don’t trouble
yourself on this; it’s only icing on the cake.

Thank you very much for the pains you took to get me going again.

Very best wishes,
Richard


#11

On Mar 19, 2009, at 4:42 PM, RichardOnRails wrote:

Well, my guidance was:
arr.sort {|a,b| (a[0] <=> b[0]).nonzero? || b[1] <=> a[1] }

Which translates to your:
sortedMatrix = matrix.sort {|a,b| (a[1]<=>b[1]).nonzero? || a[8] <=> b[8] }

You might also need:
matrix << row.to_a
or
matrix << row.fields
in your loop, but a FasterCSV::Row probably behaves sufficiently like
an Array to sort properly. Whether it continues to behave later (when
you really need an Array), may resolve the question of whether you
need to call #to_a or #fields on your row.

Real code will always get you a better answer than pseudo-code.

If you meant for either sort on Col.B or Col.I to be descending,
then swap the a and b in the appropriate expression. (Your original
question had the secondary sort descending, but the latest [with
code ;-)] says “col. I asc.”)
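
The "swap a and b" rule can be wrapped up so that each column carries its own direction, roughly the way Excel's sort dialog does. This is only a sketch: sort_rows and its [column, direction] argument format are hypothetical names invented for illustration, and the rows are made up.

```ruby
# Hypothetical helper: keys is a list of [column_index, :asc or :desc] pairs,
# applied in order until one of them yields a nonzero comparison.
def sort_rows(rows, keys)
  rows.sort do |a, b|
    result = 0
    keys.each do |col, dir|
      # Swapping the operands reverses the order for that column.
      result = dir == :desc ? b[col] <=> a[col] : a[col] <=> b[col]
      break unless result.zero?
    end
    result
  end
end

rows = [%w[x b], %w[x a], %w[w c]]
asc_desc = sort_rows(rows, [[0, :asc], [1, :desc]])
asc_asc  = sort_rows(rows, [[0, :asc], [1, :asc]])
p asc_desc  # => [["w", "c"], ["x", "b"], ["x", "a"]]
p asc_asc   # => [["w", "c"], ["x", "a"], ["x", "b"]]
```

For the case discussed here, the call would be something like sort_rows(matrix, [[1, :asc], [8, :asc]]).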

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid


#12

On Mar 19, 8:57 pm, RichardOnRails
removed_email_address@domain.invalid wrote:

OK Rob,

I Googled "Ruby nonzero?" (thus doing essentially what you suggested in
your original response). And I see that when the expression is
processed by nonzero?, the latter returns NOT true or false, but
rather the non-zero value itself or nil (which is almost ‘false’). Cool!

So my extra “v” is documented to be superfluous :-)
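
A short sketch of why the shorter form works (the two sample rows are made up): nonzero? hands back the receiver itself when it is not zero, and nil otherwise, so || either returns the first comparison or falls through to the second.

```ruby
zero_case     = 0.nonzero?      # nil, so || moves on to the next comparison
positive_case = 1.nonzero?      # the receiver itself
negative_case = (-1).nonzero?   # the receiver itself
p zero_case      # => nil
p positive_case  # => 1
p negative_case  # => -1

# Hypothetical rows that tie on column 0 and differ on column 1.
a = ["k", "m"]
b = ["k", "n"]
result = (a[0] <=> b[0]).nonzero? || a[1] <=> b[1]
p result         # => -1
```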

Again, many thanks for your generous and excellent guidance.

Best wishes,
Richard


#13

On Mar 19, 2009, at 9:22 PM, RichardOnRails wrote:

Richard,

Glad to have helped.

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid


#14

On Mar 17, 2009, at 22:02 , RichardOnRails wrote:

I’ve got an array of rows (and thus a matrix) created using FasterCSV
to extract data from a CSV file. I’d like to sort the matrix on
column A ascending and, within that, column B descending. I looked at
Matrix, but it doesn’t seem to address that functionality. Is there
a package that does, or do I have to write my own SuperMatrix
inherited from Matrix?

I’m kinda surprised nobody has said this:

 sorted = matrix.sort_by { |row| [row[0], -row[1]] }

P.S. PLEASE trim to the relevant parts when you reply. In some of your
email on this thread you have two whole copies of nearly the whole
thread.


#15

On Mar 19, 2009, at 20:01 , Rob B. wrote:

Actually, it was suggested, but it assumes that row[1] has a -@
method. The OP confirmed later that the columns are text, not
numbers.

kk. sorry. I made assumptions based on him using Matrix. I couldn’t
see that someone suggested this already through all the noise. :-)

umm…

class String
  def -@
    self.gsub(/./) { |s| (?z - s[0] + ?a).chr }
  end
end

%w(abc def ghi jhk).sort_by { |s| -s }
=> ["jhk", "ghi", "def", "abc"]

horrible, no?

:-D


#16

On Mar 19, 2009, at 10:44 PM, Ryan D. wrote:

I’m kinda surprised nobody has said this:

sorted = matrix.sort_by { |row| [row[0], -row[1]] }

P.S. PLEASE trim to the relevant parts when you reply. In some of
your email on this thread you have two whole copies of nearly the
whole thread.

Actually, it was suggested, but it assumes that row[1] has a -@
method. The OP confirmed later that the columns are text, not numbers.

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid


#17

On Mar 20, 2009, at 3:27 AM, Ryan D. wrote:

class String
  def -@
    self.gsub(/./) { |s| (?z - s[0] + ?a).chr }
  end
end

%w(abc def ghi jhk).sort_by { |s| -s }
=> ["jhk", "ghi", "def", "abc"]

horrible, no?

Only if you have any characters that aren’t lowercase letters.

class String
  def -@
    self.gsub(/./) {|s| (255 - s[0]).chr }
  end
end

Well, still pretty horrible, though.

irb> - "Ryan D."
=> "\255\206\236\221\337\273\236\211\226\214"

If you ever meet someone with that name, don’t shake hands; the
universe could come to an end!

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid


#18

On Mar 20, 2009, at 7:17 AM, Rob B. wrote:

class String
  def -@
    self.gsub(/./) {|s| (255 - s[0]).chr }
  end
end

Oops, it is horrible because it doesn’t reverse sort properly:

irb> %w( a b c d ).sort_by { |s| -s }
=> ["d", "c", "b", "a"]
irb> %w( a b c d ).sort {|a,b| b <=> a }
=> ["d", "c", "b", "a"]

OK when the ordering is based on a letter mismatch, but when one
string is a prefix:

irb> %w( a aa ab bb bbb ).sort {|a,b| b <=> a }
=> ["bbb", "bb", "ab", "aa", "a"]
irb> %w( a aa ab bb bbb ).sort_by { |s| -s }
=> ["bb", "bbb", "a", "ab", "aa"]

Perhaps that’s one reason there’s no String#-@
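
For anyone trying this on a current Ruby: the s[0] arithmetic above is Ruby 1.8 behaviour (s[0] returned an Integer there). The sketch below reproduces the same prefix problem with String#bytes and a standalone helper instead of reopening String; the name complement is hypothetical.

```ruby
# Hypothetical helper: complement every byte, mirroring the (255 - byte) idea.
def complement(str)
  str.bytes.map { |byte| 255 - byte }.pack("C*")
end

words = %w[a aa ab bb bbb]

# True descending order, from swapping the operands.
by_block = words.sort { |x, y| y <=> x }
p by_block  # => ["bbb", "bb", "ab", "aa", "a"]

# Complemented keys: a string still sorts before anything it is a prefix of,
# so "bb" lands ahead of "bbb" and "a" ahead of "ab" and "aa".
by_key = words.sort_by { |s| complement(s) }
p by_key    # => ["bb", "bbb", "a", "ab", "aa"]
```

The character order is reversed, but "shorter prefix first" is not, which is why the two results disagree.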

-Rob

Rob B. http://agileconsultingllc.com
removed_email_address@domain.invalid