Forum: Ruby how to sort this array

Li CN (alex-osu3)
on 2008-10-30 22:15
Hi,

I have an array containing file names. How do I sort them so that I
get the expected results?

Thanks,

Li

#############
files=[
  "c:/ruby/self/2004/20.txt",
  "c:/ruby/self/2004/3.txt",
  "c:/ruby/self/2004/2.txt",
  "c:/ruby/self/2004/10.txt",
  "c:/ruby/self/2004/1.txt"
]

expected results:
[
"c:/ruby/self/2004/1.txt",
"c:/ruby/self/2004/2.txt",
"c:/ruby/self/2004/3.txt",
"c:/ruby/self/2004/10.txt",
"c:/ruby/self/2004/20.txt"
]
Stefan Rusterholz (apeiros)
on 2008-10-30 22:34
Li Chen wrote:
> Hi,
>
> I have an array containing files names. How do I sort them so that I can
> get the expected results?
>
> Thanks,
>
> Li
>
> #############
> files=[
>   "c:/ruby/self/2004/20.txt",
>   "c:/ruby/self/2004/3.txt",
>   "c:/ruby/self/2004/2.txt",
>   "c:/ruby/self/2004/10.txt",
>   "c:/ruby/self/2004/1.txt"
> ]
>
> expected results:
> [
> "c:/ruby/self/2004/1.txt",
> "c:/ruby/self/2004/2.txt",
> "c:/ruby/self/2004/3.txt",
> "c:/ruby/self/2004/10.txt",
> "c:/ruby/self/2004/20.txt"
> ]

A plain sort gives you a different order because String#<=> compares
character by character. In "2" <=> "10" the first characters are
compared first, and since "2" > "1" the comparison stops there, so "2"
sorts after "10".
What you want is called natural sorting. Googling for natsort and ruby
should turn up a few results.
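
To illustrate both points, here is a minimal sketch of one way to do a natural sort by hand. The helper name natural_key is hypothetical (it is not part of Ruby or of any natsort library), and it assumes Ruby 2.4 or later for String#match?:

```ruby
# Plain string comparison goes character by character, so "2" sorts after "10".
p "2" <=> "10"            # => 1
p %w[20 3 2 10 1].sort    # => ["1", "10", "2", "20", "3"]

# natural_key is a hypothetical helper. It splits a string into runs of
# digits and runs of non-digits, and tags each chunk so that digit runs
# compare as integers and everything else compares as text.
def natural_key(str)
  str.scan(/\d+|\D+/).map do |chunk|
    chunk.match?(/\A\d/) ? [0, chunk.to_i] : [1, chunk]
  end
end

files = [
  "c:/ruby/self/2004/20.txt",
  "c:/ruby/self/2004/3.txt",
  "c:/ruby/self/2004/2.txt",
  "c:/ruby/self/2004/10.txt",
  "c:/ruby/self/2004/1.txt"
]

puts files.sort_by { |f| natural_key(f) }
```

The [0, ...] / [1, ...] tags are there so that a number is never compared directly with a string, which would make the array comparison return nil.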

Regards
Stefan
Li CN (alex-osu3)
on 2008-10-30 22:44
here is my code:

C:\Users\Alex>irb
irb(main):001:0>  files=[
irb(main):002:1*    "c:/ruby/self/2004/20.txt",
irb(main):003:1*    "c:/ruby/self/2004/3.txt",
irb(main):004:1*    "c:/ruby/self/2004/2.txt",
irb(main):005:1*    "c:/ruby/self/2004/10.txt",
irb(main):006:1*    "c:/ruby/self/2004/1.txt"
irb(main):007:1> ]
=> ["c:/ruby/self/2004/20.txt", "c:/ruby/self/2004/3.txt",
"c:/ruby/self/2004/2.txt", "c:/ruby/self/2004/10.txt",
"c:/ruby/self/2004/1.txt"]
irb(main):008:0>
irb(main):009:0*
irb(main):010:0* files=files.sort_by do|s|
irb(main):011:1* s.split(/\//)[-1].split(/\./)[0].to_i
irb(main):012:1> end
=> ["c:/ruby/self/2004/1.txt", "c:/ruby/self/2004/2.txt",
"c:/ruby/self/2004/3.txt", "c:/ruby/self/2004/10.txt",
"c:/ruby/self/2004/20.txt"]
irb(main):013:0>

I am not sure if it is the Rubyish way.

Li
Gregory Seidman (Guest)
on 2008-10-30 23:32
(Received via mailing list)
On Fri, Oct 31, 2008 at 06:14:51AM +0900, Li Chen wrote:
> files=[
> "c:/ruby/self/2004/2.txt",
> "c:/ruby/self/2004/3.txt",
> "c:/ruby/self/2004/10.txt",
> "c:/ruby/self/2004/20.txt",
> ]

files = files.sort_by { |f| f[/\/(\d+)\.[^\/]*\Z/, 1].to_i }

There are several things to explain here:

- The #sort_by method calls the block on each element to produce a set
  of surrogate values to use as sorting keys.

- The regular expression captures the series of digits between the last
  slash in a string and the following dot.

- The [] method can be called on String in many ways, including passing
  a Regexp and an index. That form returns the capture of the
  appropriate index from matching the Regexp, or nil if the Regexp does
  not match.

- Calling #to_i on the result means that it will either parse the
  captured sequence of digits as an integer or, if the Regexp fails to
  match, return 0 (because nil.to_i is 0).
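
As a rough illustration of the last three points, the pieces can be tried one at a time. The helper name numeric_key and the "readme" path are made up for this sketch; the regular expression is the one from the one-liner above:

```ruby
# numeric_key is a hypothetical helper wrapping the block body shown above.
def numeric_key(path)
  # String#[] with a Regexp and a capture index returns that capture,
  # or nil when the Regexp does not match. nil.to_i is 0.
  path[/\/(\d+)\.[^\/]*\Z/, 1].to_i
end

p "c:/ruby/self/2004/10.txt"[/\/(\d+)\.[^\/]*\Z/, 1]  # => "10"
p "c:/ruby/self/2004/readme"[/\/(\d+)\.[^\/]*\Z/, 1]  # => nil
p numeric_key("c:/ruby/self/2004/10.txt")              # => 10
p numeric_key("c:/ruby/self/2004/readme")              # => 0
```

One consequence is that every name the Regexp does not match gets the key 0 and so sorts ahead of the numbered files.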

--Greg
Simon Krahnke (Guest)
on 2008-10-30 23:35
(Received via mailing list)
* Li Chen <chen_li3@yahoo.com> (22:14) wrote:

> "c:/ruby/self/2004/1.txt",
> "c:/ruby/self/2004/2.txt",
> "c:/ruby/self/2004/3.txt",
> "c:/ruby/self/2004/10.txt",
> "c:/ruby/self/2004/20.txt",
> ]

files.sort_by { | fn | fn.match(/(\d+)\.txt$/)[1].to_i }

Regards,               simon .... hth
Rob Biedenharn (Guest)
on 2008-10-31 02:24
(Received via mailing list)
On Oct 30, 2008, at 6:34 PM, Simon Krahnke wrote:
>> expected results:
> Regards,               simon .... hth
Since #to_i will stop at a non-digit, you could get simpler:

files.sort_by {|fn| fn[%r{.*/([^/]+)},1].to_i }

I tend to use %r{ } when the Regexp deals with / characters.
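
A small sketch of what "stops at a non-digit" means in practice. The helper name leading_number and the sample names other than 10.txt are invented for illustration:

```ruby
# String#to_i reads leading digits and ignores whatever follows them.
p "10.txt".to_i      # => 10
p "7up.txt".to_i     # => 7
p "notes.txt".to_i   # => 0

# leading_number is a hypothetical helper wrapping the block body above.
def leading_number(path)
  # The greedy .*/ skips to the last slash; the capture is the file name.
  path[%r{.*/([^/]+)}, 1].to_i
end

p "c:/ruby/self/2004/10.txt"[%r{.*/([^/]+)}, 1]  # => "10.txt"
p leading_number("c:/ruby/self/2004/10.txt")     # => 10
```

The trade-off is that names which do not start with a digit all map to 0, so their relative order is not defined by this key.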

-Rob

Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com
botp (Guest)
on 2008-10-31 05:02
(Received via mailing list)
On Fri, Oct 31, 2008 at 5:14 AM, Li Chen <chen_li3@yahoo.com> wrote:
> expected results:
> [ "c:/ruby/self/2004/1.txt",
> "c:/ruby/self/2004/2.txt",
> "c:/ruby/self/2004/3.txt",
> "c:/ruby/self/2004/10.txt",
>  "c:/ruby/self/2004/20.txt",
>]

Apologies in advance for the duplicate email. I sent one an hour ago
but it seems it got blackholed :)

Anyway, if you want numeric sorting on the basename, just do, e.g.:

   files.sort_by{|f| File.basename(f).to_i}
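
If silently treating non-numeric names as 0 is not wanted, a stricter variant is possible. This is only a sketch: strict_number is a hypothetical helper and readme.txt is an invented file name, while File.basename and Integer are standard Ruby:

```ruby
# strict_number is a hypothetical helper, not something from this thread.
def strict_number(path)
  # File.basename with ".*" drops the directory and any extension.
  name = File.basename(path, ".*")
  # Integer(name, 10) raises ArgumentError for names that are not plain
  # base-10 integers, instead of returning 0 as String#to_i would.
  Integer(name, 10)
end

p File.basename("c:/ruby/self/2004/10.txt")        # => "10.txt"
p File.basename("c:/ruby/self/2004/10.txt", ".*")  # => "10"
p strict_number("c:/ruby/self/2004/10.txt")        # => 10

begin
  strict_number("c:/ruby/self/2004/readme.txt")
rescue ArgumentError => e
  puts "not numeric: #{e.message}"
end
```

With this helper, files.sort_by { |f| strict_number(f) } fails loudly on an unexpected file name rather than sorting it to the front.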
Robert Klemme (Guest)
on 2008-10-31 09:13
(Received via mailing list)
2008/10/31 botp <botpena@gmail.com>:
> seems it got blackholed :)
>
> anyway, if you want basename numeric sorting, then just do
>
> eg,
>
>   files.sort_by{|f| File.basename(f).to_i}

Very elegant!  Here's another variant:

irb(main):013:0> puts files.sort_by {|f| f[/\d+(?=\.txt$)/].to_i}
c:/ruby/self/2004/1.txt
c:/ruby/self/2004/2.txt
c:/ruby/self/2004/3.txt
c:/ruby/self/2004/10.txt
c:/ruby/self/2004/20.txt
=> nil

Kind regards

robert
Robert Klemme (Guest)
on 2008-10-31 09:15
(Received via mailing list)
And here's a variant for cases where there are multiple subdirectories:

irb(main):010:0> puts files.sort_by {|f| f.scan(/\d+/).map{|x|x.to_i}}
c:/ruby/self/2004/1.txt
c:/ruby/self/2004/2.txt
c:/ruby/self/2004/3.txt
c:/ruby/self/2004/10.txt
c:/ruby/self/2004/20.txt
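
This works because Array#<=> compares element by element, so each path is ordered by its first number, then its second, and so on. A short sketch, where number_key is a hypothetical helper name and the 2005 path is invented to show the multi-directory case:

```ruby
# number_key is a hypothetical helper wrapping the block body above.
def number_key(path)
  # Every run of digits in the path becomes one integer in the key.
  path.scan(/\d+/).map { |x| x.to_i }
end

p number_key("c:/ruby/self/2004/10.txt")  # => [2004, 10]
p [2004, 10] <=> [2005, 1]                # => -1
p [2004, 2] <=> [2004, 10]                # => -1

paths = [
  "c:/ruby/self/2005/1.txt",   # invented path for illustration
  "c:/ruby/self/2004/10.txt",
  "c:/ruby/self/2004/2.txt"
]
puts paths.sort_by { |f| number_key(f) }
```

Note that any digits anywhere in the path take part in the key, including digits in directory names.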

Kind regards

robert