Sorting decimal index numbers?


#1

Hi there,
I’m working with a list of items that contain decimal-based index
numbers. I’m sorry I don’t know the term for these kinds of numbers, and
I’m having trouble finding out how to sort them properly. To wit:

1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9

When the index numbers are floats or strings, they sort like this. But I
want “1.10” to be the last number. Any idea how to put these guys in the
right order?

Thanks!
Aaron.


#2

On Mar 27, 9:18 pm, Aaron V. wrote:

When the index numbers are floats or strings, they sort like this. But I
want “1.10” to be the last number. Any idea how to put these guys in the
right order?

$ irb
>> nums = ['1.1', '1.2', '1.10', '1.9']
=> ["1.1", "1.2", "1.10", "1.9"]
>> nums.sort
=> ["1.1", "1.10", "1.2", "1.9"]
>> nums.collect { |n| n.split('.').collect { |i| i.to_i } }.sort.collect { |n| n.join('.') }
=> ["1.1", "1.2", "1.9", "1.10"]


#3

Aaron V. wrote:

1.5
1.6
1.7
1.8
1.9

When the index numbers are floats or strings, they sort like this. But I
want “1.10” to be the last number. Any idea how to put these guys in the
right order?

These are not numbers; they’re strings. Or at least, they need to be; if
they weren’t, 1.1 and 1.10 would be indistinguishable. So let’s assume
they are strings. Then what you’re asking to do is to compare the two
halves of each string as integers. Take 1.1 and 2.1, for instance;
they compare as 1 and 2 would compare as integers. Very well, now take
1.10 and 1.2, for instance. 1 and 1 are the same, so 10 and 2 need to
compare as the integers 10 and 2. So:

arr = %w{
1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
}
arr = arr.sort do |x,y|
  x1, x2 = x.split(".")
  y1, y2 = y.split(".")
  if x1 != y1
    x1.to_i <=> y1.to_i
  else
    x2.to_i <=> y2.to_i
  end
end
p arr

If there are a lot of these, however, it would be much better to do a
keyed sort using sort_by. How to create keys depends on what you know
about the data. If we knew for a fact that the second half was always
between 1 and 99, it would be easy:

arr = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * 100 + x2.to_i
end

If the second half can be any length, though, you’ll have to determine
the maximum length first, and change that “100” multiplier accordingly.
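One way to do that is sketched below. The sample data and the names `width` and `multiplier` are made up for illustration, and the sketch assumes every entry has exactly two dot-separated parts made of decimal digits.

```ruby
# Sketch: derive the multiplier from the widest second half instead of
# hard-coding 100. Assumes exactly two dot-separated digit groups per entry.
arr = %w{1.1 1.10 1.2 1.250 2.3}   # hypothetical sample data

# Length of the longest second half, e.g. "250" has length 3.
width = arr.map { |x| x.split(".")[1].size }.max
multiplier = 10 ** width           # 1000 for this data

sorted = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * multiplier + x2.to_i
end
p sorted
```

This costs one extra pass over the data before the sort, and the key stays a single integer per entry.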

m.


#4

Hi Matt,
Are you the Matt N., of AppleScript fame? It’s a privilege to
have an answer from you!

matt neuburg wrote:

These are not numbers; they’re strings. Or at least, they need to be; if
they weren’t, 1.1 and 1.10 would be indistinguishable.

Yes, I think this was becoming my assumption. The head-scratcher for me
was trying to determine whether this sort of numbering scheme (“the
hierarchical list”, if you will) was treated as a special case, in Ruby
or generally. But in my model right now (this is for a Rails project)
it’s set as a string.

If there are a lot of these, however, it would be much better to do a
keyed sort using sort_by. How to create keys depends on what you know
about the data. If we knew for a fact that the second half was always
between 1 and 99, it would be easy:

arr = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * 100 + x2.to_i
end

And it turns out that a hundred is more than enough in this case. Thanks
to you, and to Yossef above, for a working solution.

Cheers!
Aaron.


#5

Hi,

At Sat, 28 Mar 2009 13:04:43 +0900,
Matthias R. wrote in [ruby-talk:332318]:

If the second half can be any length, though, you’ll have to determine
the maximum length first, and change that “100” multiplier accordingly.

Either that or use arrays as sort keys:

arr = arr.sort_by do |x|
  x.split(".").map {|i| i.to_i}
end

The enc/depend file in 1.9 has more generic code:

alphanumeric_order = proc {|e|
  e.scan(/(\d+)|(\D+)/).map {|n,a| a||[n.size,n.to_i]}.flatten}

But this fails when strings that start with a digit are mixed with
strings that start with a non-digit. In that case, this is safer:

alphanumeric_order = proc {|e|
  e.scan(/(\A\D*|\G\D+)(\d+)/).map{|a,n|[a,n.size,n.to_i]}.flatten}
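A small sketch of how the two procs behave, using made-up sample data. The names `simple_order` and `safer_order` are only labels for this example; both procs in the post are called `alphanumeric_order`.

```ruby
# First version: digit runs become [length, value]; other runs stay strings.
simple_order = proc { |e|
  e.scan(/(\d+)|(\D+)/).map { |n, a| a || [n.size, n.to_i] }.flatten
}

# Safer version: every key repeats the pattern [text, length, value],
# so strings are only ever compared with strings and integers with integers.
safer_order = proc { |e|
  e.scan(/(\A\D*|\G\D+)(\d+)/).map { |a, n| [a, n.size, n.to_i] }.flatten
}

names = %w{file10.txt file2.txt file1.txt}   # hypothetical sample data
p names.sort_by(&simple_order)
p names.sort_by(&safer_order)

# Mixing a digit-first string with a letter-first string makes the first
# version compare a String with an Integer, which sort_by cannot order.
mixed = %w{a10 10a}
begin
  mixed.sort_by(&simple_order)
rescue ArgumentError => e
  puts "simple_order failed: #{e.message}"
end
p mixed.sort_by(&safer_order)
```

Note that the safer pattern only captures text that is followed by digits, so a trailing non-digit run (the ".txt" above, or the "a" in "10a") does not contribute to the key.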


#6

matt neuburg wrote:

If there are a lot of these, however, it would be much better to do a
keyed sort using sort_by. How to create keys depends on what you know
about the data. If we knew for a fact that the second half was always
between 1 and 99, it would be easy:

arr = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * 100 + x2.to_i
end

If the second half can be any length, though, you’ll have to determine
the maximum length first, and change that “100” multiplier accordingly.

Either that or use arrays as sort keys:

arr = arr.sort_by do |x|
  x.split(".").map {|i| i.to_i}
end

-Matthias


#7

On 28.03.2009 05:04, Matthias R. wrote:

If the second half can be any length, though, you’ll have to determine
the maximum length first, and change that “100” multiplier accordingly.

Either that or use arrays as sort keys:

arr = arr.sort_by do |x|
  x.split(".").map {|i| i.to_i}
end

Here’s a variant (uses Symbol#to_proc, so 1.8.7 or later):

arr.sort_by {|x| x.scan(/\d+/).map(&:to_i)}

Cheers

robert


#8

Matthias R. wrote:

If the second half can be any length, though, you’ll have to determine
the maximum length first, and change that “100” multiplier accordingly.

Either that or use arrays as sort keys:

arr = arr.sort_by do |x|
  x.split(".").map {|i| i.to_i}
end

Yes, that’s better; my first answer failed to take into account Array’s
existing implementation of the spaceship operator. m.
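
For reference, a short sketch of what Array's spaceship operator does with integer arrays. The sample data (including entries with one or three levels) is made up for illustration.

```ruby
# Array#<=> compares element by element and stops at the first difference.
p([1, 10] <=> [1, 2])     # 10 > 2, so the result is 1
p([1, 2] <=> [1, 2, 1])   # equal prefix, shorter array comes first: -1

# Because of that, integer arrays work as sort keys for any number of levels.
items = %w{1.10 1.2 1.2.1 2 1.9}   # hypothetical sample data
sorted_items = items.sort_by { |x| x.split(".").map { |i| i.to_i } }
p sorted_items
```

No multiplier is needed, so the width of each part does not have to be known in advance.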


#9

On Mar 27, 11:04 pm, Matthias R. wrote:

arr = arr.sort_by do |x|
  x.split(".").map {|i| i.to_i}
end

Durr, of course that’s better than my solution of transforming the
array elements and then re-creating the original ones after the sort.
For some reason, I completely forgot about sort_by.


#10

Aaron V. wrote:

I’m working with a list of items that contain decimal-based index
numbers. I’m sorry I don’t know the term for these kinds of numbers,
and I’m having trouble finding out how to sort them properly […]

Probably the Dewey Decimal Classification, or the Universal Decimal
Classification.

Chris