Forum: Ruby Sorting decimal index numbers?

Aaron V. (Guest)
on 2009-03-28 04:22
Hi there,
I'm working with a list of items that contain decimal-based index
numbers. I'm sorry I don't know the term for these kinds of numbers, and
I'm having trouble finding out how to sort them properly. To wit:

1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9

When the index numbers are floats or strings, they sort like this. But I
want "1.10" to be the last number. Any idea how to put these guys in the
right order?

Thanks!
Aaron.
Yossef M. (Guest)
on 2009-03-28 04:35
(Received via mailing list)
On Mar 27, 9:18 pm, Aaron V. <removed_email_address@domain.invalid> wrote:
> When the index numbers are float or strings, they sort like this. But I
> want "1.10" to be the last number. Any idea how to put these guys in the
> right order?

    $ irb
    >> nums = ['1.1', '1.2', '1.10', '1.9']
    => ["1.1", "1.2", "1.10", "1.9"]
    >> nums.sort
    => ["1.1", "1.10", "1.2", "1.9"]
    >> nums.collect { |n| n.split('.').collect { |i| i.to_i } }.sort.collect { |n| n.join('.') }
    => ["1.1", "1.2", "1.9", "1.10"]
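One caveat with the split/sort/join round trip, sketched here with made-up sample values: if an index ever carries a leading zero, converting to integers and joining again does not reproduce the original string. This is only an illustration of that edge case, not something the original data is known to contain.

```ruby
nums = ['1.1', '1.02', '1.10']   # hypothetical data; '1.02' has a leading zero

round_trip = nums.collect { |n| n.split('.').collect { |i| i.to_i } }
                 .sort
                 .collect { |n| n.join('.') }

# The order is right, but '1.02' has been rewritten as '1.2'.
p round_trip
```

Sorting the original strings by a derived key, instead of rebuilding them afterwards, avoids this.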
matt neuburg (Guest)
on 2009-03-28 04:50
(Received via mailing list)
Aaron V. <removed_email_address@domain.invalid> wrote:

> 1.5
> 1.6
> 1.7
> 1.8
> 1.9
>
> When the index numbers are float or strings, they sort like this. But I
> want "1.10" to be the last number. Any idea how to put these guys in the
> right order?

These are not numbers; they're strings. Or at least, they need to be; if
they weren't, 1.1 and 1.10 would be indistinguishable. So let's assume
they *are* strings. Then what you're asking for is to compare the two
halves of each string as integers. Take 1.1 and 2.1, for instance;
they compare as 1 and 2 would compare as integers. Very well, now take
1.10 and 1.2, for instance. 1 and 1 are the same, so 10 and 2 need to
compare as the integers 10 and 2. So:

arr = %w{
1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
}
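arr = arr.sort do |x, y|
  # Convert both halves to integers up front, so the equality test
  # below compares numbers rather than strings ("01" vs "1").
  x1, x2 = x.split(".").map { |s| s.to_i }
  y1, y2 = y.split(".").map { |s| s.to_i }
  if x1 != y1
    x1 <=> y1
  else
    x2 <=> y2
  end
end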
p arr
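A quick check of the point about floats, as a minimal sketch (the variable names are only for illustration): as Float values the two indexes are the same number, while as strings they stay distinct but compare character by character.

```ruby
# As Float literals, 1.1 and 1.10 are the same value.
same_as_floats = (1.1 == 1.10)
p same_as_floats          # true

# Going through a Float discards the trailing zero.
back_to_string = "1.10".to_f.to_s
p back_to_string          # "1.1"

# As strings they differ, but "1.10" sorts before "1.2" because
# the comparison is done character by character ("1" < "2").
string_order = ("1.10" <=> "1.2")
p string_order            # -1
```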

If there are a lot of these, however, it would be much better to do a
keyed sort using sort_by. How to create keys depends on what you know
about the data. If we knew for a fact that the second half was always
between 1 and 99, it would be easy:

arr = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * 100 + x2.to_i
end

If the second half can be any length, though, you'll have to determine
the maximum length first, and change that "100" multiplier accordingly.
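A rough sketch of that adjustment, assuming every item has exactly two parts; the sample values and the names `items`, `width` and `multiplier` are made up for illustration:

```ruby
items = %w[1.1 1.250 1.3 2.7 1.10]   # hypothetical data

# Longest second half, in digits (3 here, because of "250").
width = items.map { |x| x.split(".")[1].to_s.length }.max

# A multiplier with that many zeros leaves room for any second half.
multiplier = 10**width

sorted = items.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * multiplier + x2.to_i
end

p sorted   # ["1.1", "1.3", "1.10", "1.250", "2.7"]
```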

m.
Aaron V. (Guest)
on 2009-03-28 05:01
Hi Matt,
Are you _the_ Matt N., of AppleScript fame? It's a privilege to
have an answer from you!

matt neuburg wrote:

> These are not numbers; they're strings. Or at least, they need to be; if
> they weren't, 1.1 and 1.10 would be indistinguishable.

Yes, I think this was becoming my assumption. The head-scratcher for me
was trying to determine if this sort of numbering scheme ("the
hierarchical list", if you will) was a special case in Ruby or
generally. But in my model right now (this is for a Rails project) it's
set as a string.

> If there are a lot of these, however, it would be much better to do a
> keyed sort using sort_by. How to create keys depends on what you know
> about the data. If we knew for a fact that the second half was always
> between 1 and 99, it would be easy:
>
> arr = arr.sort_by do |x|
>   x1, x2 = x.split(".")
>   x1.to_i * 100 + x2.to_i
> end

And it turns out that a hundred is more than enough in this case. Thanks
to you, and to Yossef above, for a working solution.

Cheers!
Aaron.
Matthias R. (Guest)
on 2009-03-28 06:08
matt neuburg wrote:
> If there are a lot of these, however, it would be much better to do a
> keyed sort using sort_by. How to create keys depends on what you know
> about the data. If we knew for a fact that the second half was always
> between 1 and 99, it would be easy:
>
> arr = arr.sort_by do |x|
>   x1, x2 = x.split(".")
>   x1.to_i * 100 + x2.to_i
> end
>
> If the second half can be any length, though, you'll have to determine
> the maximum length first, and change that "100" multiplier accordingly.

Either that or use arrays as sort keys:

  arr = arr.sort_by do |x|
    x.split(".").map {|i| i.to_i}
  end

-Matthias
Nobuyoshi N. (Guest)
on 2009-03-28 07:34
(Received via mailing list)
Hi,

At Sat, 28 Mar 2009 13:04:43 +0900,
Matthias R. wrote in [ruby-talk:332318]:
> > If the second half can be any length, though, you'll have to determine
> > the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

The enc/depend file in 1.9 has more generic code.

  alphanumeric_order = proc {|e| e.scan(/(\d+)|(\D+)/).map {|n,a| a||[n.size,n.to_i]}.flatten}

But this fails if strings that start with a digit are mixed with
strings that start with a non-digit.  In that case, this is safer.

  alphanumeric_order = proc {|e| e.scan(/(\A\D*|\G\D+)(\d+)/).map{|a,n|[a,n.size,n.to_i]}.flatten}
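To see the difference between the two procs, here is a small sketch with made-up names; `generic` and `safer` are just labels for the two versions above. With the first proc a key can start with either a String or an Integer, and comparing those raises an error during the sort. The second proc always emits (text, length, value) triples, so elements of the same type line up.

```ruby
generic = proc { |e| e.scan(/(\d+)|(\D+)/).map { |n, a| a || [n.size, n.to_i] }.flatten }
safer   = proc { |e| e.scan(/(\A\D*|\G\D+)(\d+)/).map { |a, n| [a, n.size, n.to_i] }.flatten }

mixed = %w[file10 10file file2 9file]   # hypothetical names

p generic.call("file10")   # ["file", 2, 10]
p generic.call("10file")   # [2, 10, "file"]

begin
  mixed.sort_by(&generic)  # compares "file" with 2 at some point
rescue ArgumentError => e
  puts "generic key failed: #{e.message}"
end

p safer.call("10file")     # ["", 2, 10]
sorted = mixed.sort_by(&safer)
p sorted                   # ["9file", "10file", "file2", "file10"]
```

Note that the second pattern only captures text that is followed by digits, so a trailing non-digit run does not take part in the key.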
Robert K. (Guest)
on 2009-03-28 12:10
(Received via mailing list)
On 28.03.2009 05:04, Matthias R. wrote:
>>
>> If the second half can be any length, though, you'll have to determine
>> the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Here's a variant (it relies on Symbol#to_proc, so 1.8.7 or later):

arr.sort_by {|x| x.scan(/\d+/).map(&:to_i)}
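For plain "1.10"-style input the split and scan versions build the same key. They differ once the strings carry a prefix or another separator; the inputs below are invented purely to show that, and `by_split` / `by_scan` are illustrative names.

```ruby
by_split = proc { |x| x.split(".").map { |i| i.to_i } }
by_scan  = proc { |x| x.scan(/\d+/).map(&:to_i) }

p by_split.call("1.10")    # [1, 10]
p by_scan.call("1.10")     # [1, 10]

# Hypothetical input with a letter prefix: "v1".to_i is 0.
p by_split.call("v1.10")   # [0, 10]
p by_scan.call("v1.10")    # [1, 10]

# Hypothetical input with a dash: "1-10".to_i stops at the dash.
p by_split.call("1-10")    # [1]
p by_scan.call("1-10")     # [1, 10]
```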

Cheers

  robert
Chris D. (Guest)
on 2009-03-28 14:06
(Received via mailing list)
Aaron V. <removed_email_address@domain.invalid> wrote:
> I'm working with a list of items that contain decimal-based index
> numbers. I'm sorry I don't know the term for these kinds of numbers,
> and I'm having trouble finding out how to sort them properly [...]

Probably the Dewey Decimal Classification, or the Universal Decimal
Classification.

Chris
matt neuburg (Guest)
on 2009-03-28 18:40
(Received via mailing list)
Matthias R. <removed_email_address@domain.invalid> wrote:

> >
> > If the second half can be any length, though, you'll have to determine
> > the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Yes, that's better; my first answer failed to take into account Array's
existing implementation of the spaceship operator. m.
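For reference, a short sketch of how Array's <=> behaves and why that makes arrays convenient sort keys. The deeper section numbers are made up to show that the same key also handles more than two levels.

```ruby
# Arrays compare element by element, left to right.
p([1, 2]  <=> [1, 10])   # -1
p([1, 10] <=> [2, 1])    # -1
p([1, 10] <=> [1, 10])   # 0

# If one array is a prefix of the other, the shorter one sorts first.
p([1] <=> [1, 1])        # -1

sections = %w[1.10 1.2 1 1.2.1 1.10.3 2]   # hypothetical data
sorted = sections.sort_by { |x| x.split(".").map { |i| i.to_i } }
p sorted   # ["1", "1.2", "1.2.1", "1.10", "1.10.3", "2"]
```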
Yossef M. (Guest)
on 2009-03-29 00:24
(Received via mailing list)
On Mar 27, 11:04 pm, Matthias R. <removed_email_address@domain.invalid> wrote:
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Durr, of course that's better than my solution of transforming the
array elements and then re-creating the original ones after the sort.
For some reason, I completely forgot about sort_by.