Forum: Ruby Sorting decimal index numbers?

Aaron Vegh (aaronvegh)
on 2009-03-28 03:22
Hi there,
I'm working with a list of items that contain decimal-based index
numbers. I'm sorry I don't know the term for these kinds of numbers, and
I'm having trouble finding out how to sort them properly. To wit:

1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9

When the index numbers are float or strings, they sort like this. But I
want "1.10" to be the last number. Any idea how to put these guys in the
right order?

Thanks!
Aaron.
Yossef Mendelssohn (Guest)
on 2009-03-28 03:35
(Received via mailing list)
On Mar 27, 9:18 pm, Aaron Vegh <aa...@vegh.ca> wrote:
> When the index numbers are float or strings, they sort like this. But I
> want "1.10" to be the last number. Any idea how to put these guys in the
> right order?

    $ irb
    >> nums = ['1.1', '1.2', '1.10', '1.9']
    => ["1.1", "1.2", "1.10", "1.9"]
    >> nums.sort
    => ["1.1", "1.10", "1.2", "1.9"]
    >> nums.collect { |n| n.split('.').collect { |i| i.to_i } }.sort.collect { |n| n.join('.') }
    => ["1.1", "1.2", "1.9", "1.10"]
matt neuburg (Guest)
on 2009-03-28 03:50
(Received via mailing list)
Aaron Vegh <aaron@vegh.ca> wrote:

> 1.5
> 1.6
> 1.7
> 1.8
> 1.9
>
> When the index numbers are float or strings, they sort like this. But I
> want "1.10" to be the last number. Any idea how to put these guys in the
> right order?

These are not numbers; they're strings. Or at least, they need to be; if
they weren't, 1.1 and 1.10 would be indistinguishable. So let's assume
they *are* strings. Then what you're asking to do is to compare the two
halves of each string as an integer. Take 1.1 and 2.1, for instance;
they compare as 1 and 2 would compare as integers. Very well, now take
1.10 and 1.2, for instance. 1 and 1 are the same, so 10 and 2 need to
compare as the integers 10 and 2. So:

arr = %w{
1.1
1.10
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
}
arr = arr.sort do |x,y|
  # convert both halves to integers first, so "01" and "1" count as equal
  x1, x2 = x.split(".").map { |s| s.to_i }
  y1, y2 = y.split(".").map { |s| s.to_i }
  if x1 != y1
    x1 <=> y1
  else
    x2 <=> y2
  end
end
p arr
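
The point about floats versus strings can be checked directly. This is only a minimal sketch; the variable names are made up for illustration. As floats, the two values are equal, so the trailing zero is gone before any sort sees it. As strings, they stay distinct but compare character by character.

```ruby
# As floats, 1.1 and 1.10 are the same value, so the distinction is lost.
as_floats = [1.1, 1.10]
floats_equal = (as_floats[0] == as_floats[1])

# As strings they stay distinct, but the comparison is character by
# character, so "1.10" sorts before "1.2" ('1' < '2' at the third character).
as_strings = ["1.2", "1.10"]
string_sorted = as_strings.sort

p floats_equal    # true
p string_sorted   # ["1.10", "1.2"]
```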

If there are a lot of these, however, it would be much better to do a
keyed sort using sort_by. How to create keys depends on what you know
about the data. If we knew for a fact that the second half was always
between 1 and 99, it would be easy:

arr = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * 100 + x2.to_i
end

If the second half can be any length, though, you'll have to determine
the maximum length first, and change that "100" multiplier accordingly.
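
A rough sketch of that idea, assuming every item has exactly two dot-separated parts and both are non-negative integers (the names `width` and `multiplier` are just for illustration): find the longest second half and use one power of ten per digit.

```ruby
arr = %w{1.1 1.10 1.2 1.100 2.1 1.9}

# Length in digits of the longest second half (3 here, for "100").
width = arr.map { |x| x.split(".")[1].to_s.size }.max

# One power of ten per digit, so the second half can never spill
# over into the range of the first half.
multiplier = 10 ** width

sorted = arr.sort_by do |x|
  x1, x2 = x.split(".")
  x1.to_i * multiplier + x2.to_i
end

p sorted   # ["1.1", "1.2", "1.9", "1.10", "1.100", "2.1"]
```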

m.
Aaron Vegh (aaronvegh)
on 2009-03-28 04:01
Hi Matt,
Are you _the_ Matt Neuburg, of AppleScript fame? It's a privilege to
have an answer from you!

matt neuburg wrote:

> These are not numbers; they're strings. Or at least, they need to be; if
> they weren't, 1.1 and 1.10 would be indistinguishable.

Yes, I think this was becoming my assumption. The head-scratcher for me
was trying to determine if this sort of numbering scheme ("the
hierarchical list", if you will) was a special case in Ruby or
generally. But in my model right now (this is for a Rails project) it's
set as a string.

> If there are a lot of these, however, it would be much better to do a
> keyed sort using sort_by. How to create keys depends on what you know
> about the data. If we knew for a fact that the second half was always
> between 1 and 99, it would be easy:
>
> arr = arr.sort_by do |x|
>   x1, x2 = x.split(".")
>   x1.to_i * 100 + x2.to_i
> end

And it turns out that a hundred is more than enough in this case. Thanks
to you, and to Yossef above, for a working solution.

Cheers!
Aaron.
Matthias Reitinger (reima)
on 2009-03-28 05:08
matt neuburg wrote:
> If there are a lot of these, however, it would be much better to do a
> keyed sort using sort_by. How to create keys depends on what you know
> about the data. If we knew for a fact that the second half was always
> between 1 and 99, it would be easy:
>
> arr = arr.sort_by do |x|
>   x1, x2 = x.split(".")
>   x1.to_i * 100 + x2.to_i
> end
>
> If the second half can be any length, though, you'll have to determine
> the maximum length first, and change that "100" multiplier accordingly.

Either that or use arrays as sort keys:

  arr = arr.sort_by do |x|
    x.split(".").map {|i| i.to_i}
  end
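
Why this works, as a small sketch (the `sections` sample data is invented for illustration): Array#<=> compares element by element from the left, so no multiplier is needed and the keys may have any number of parts.

```ruby
# Array#<=> compares element by element, left to right.
first_cmp  = ([1, 10] <=> [1, 2])     # 1, because 10 > 2
prefix_cmp = ([1, 2] <=> [1, 2, 1])   # -1, a prefix sorts before the longer array

# Hypothetical data with one, two and three levels.
sections = %w{1.10 1.2 2 1.2.1 1.9}
sorted = sections.sort_by { |x| x.split(".").map { |i| i.to_i } }

p first_cmp    # 1
p prefix_cmp   # -1
p sorted       # ["1.2", "1.2.1", "1.9", "1.10", "2"]
```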

-Matthias
Nobuyoshi Nakada (nobu)
on 2009-03-28 06:34
(Received via mailing list)
Hi,

At Sat, 28 Mar 2009 13:04:43 +0900,
Matthias Reitinger wrote in [ruby-talk:332318]:
> > If the second half can be any length, though, you'll have to determine
> > the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

The enc/depend file in 1.9 has more generic code:

  alphanumeric_order = proc {|e| e.scan(/(\d+)|(\D+)/).map {|n,a| a||[n.size,n.to_i]}.flatten}

But this fails if strings that start with a digit are mixed with
strings that start with a non-digit.  In such a case, this is safer:

  alphanumeric_order = proc {|e| e.scan(/(\A\D*|\G\D+)(\d+)/).map{|a,n|[a,n.size,n.to_i]}.flatten}
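
A small sketch of the difference between the two procs. The names `simple_order` and `safer_order` and the sample data are made up here; the proc bodies are the ones from the post. With mixed input, the first version produces keys where a String lines up against an Integer, and sort_by cannot compare them.

```ruby
# First version: digit runs become [length, value], everything else stays a string.
simple_order = proc { |e|
  e.scan(/(\d+)|(\D+)/).map { |n, a| a || [n.size, n.to_i] }.flatten
}

# Second version: every digit run is preceded by a (possibly empty) string,
# so strings always line up with strings and integers with integers.
safer_order = proc { |e|
  e.scan(/(\A\D*|\G\D+)(\d+)/).map { |a, n| [a, n.size, n.to_i] }.flatten
}

items = %w{1.10 1.2 1.9}
p items.sort_by(&simple_order)   # ["1.2", "1.9", "1.10"]
p items.sort_by(&safer_order)    # ["1.2", "1.9", "1.10"]

# Mixed input: one item starts with a letter, the other with a digit.
mixed = %w{a1 1.10}
begin
  mixed.sort_by(&simple_order)
rescue ArgumentError => err
  puts "simple_order failed: #{err.message}"
end
p mixed.sort_by(&safer_order)    # ["1.10", "a1"]
```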
Robert Klemme (Guest)
on 2009-03-28 11:10
(Received via mailing list)
On 28.03.2009 05:04, Matthias Reitinger wrote:
>>
>> If the second half can be any length, though, you'll have to determine
>> the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Here's a variant (uses Symbol#to_proc, so 1.8.7 or later):

arr.sort_by {|x| x.scan(/\d+/).map(&:to_i)}

Cheers

  robert
Chris Davies (Guest)
on 2009-03-28 13:06
(Received via mailing list)
Aaron Vegh <aaron@vegh.ca> wrote:
> I'm working with a list of items that contain decimal-based index
> numbers. I'm sorry I don't know the term for these kinds of numbers,
> and I'm having trouble finding out how to sort them properly [...]

Probably the Dewey Decimal Classification, or the Universal Decimal
Classification.

Chris
matt neuburg (Guest)
on 2009-03-28 17:40
(Received via mailing list)
Matthias Reitinger <reitinge@in.tum.de> wrote:

> >
> > If the second half can be any length, though, you'll have to determine
> > the maximum length first, and change that "100" multiplier accordingly.
>
> Either that or use arrays as sort keys:
>
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Yes, that's better; my first answer failed to take into account Array's
existing implementation of the spaceship operator. m.
Yossef Mendelssohn (Guest)
on 2009-03-28 23:24
(Received via mailing list)
On Mar 27, 11:04 pm, Matthias Reitinger <reiti...@in.tum.de> wrote:
>   arr = arr.sort_by do |x|
>     x.split(".").map {|i| i.to_i}
>   end

Durr, of course that's better than my solution of transforming the
array elements and then re-creating the original ones after the sort.
For some reason, I completely forgot about sort_by.