Top 10 last played mp3's

Hello everyone,

My level of ruby-experience has just slightly passed hello-world, I’m
quite new to this programming language; I hope I don’t bother all of you
too much by posting my low-level question here.

I would like to make a program that iterates recursively through my music
directory, and selects the top 10 most recently played tracks.

I try to use File.atime(path) to check if the current file is accessed
recently enough to get into my top-10, but my problem is that I don’t
know how to (correctly) compare a date to the return value of
File.atime.

I’m sure you can give me some simple hints, or a small push in the right
direction. Thanks in advance.

this is what I have so far:

require 'find'
require 'date'

dir = "/share/music"

Find.find(dir) do |path|
  if File.directory?(path)
    next
  else
    if path.match(".*mp3") != nil
      print File.atime(path), " : "
      print path, "\n"
    end
  end
end

Robin Wagenaar wrote:

My level of ruby-experience has just slightly passed hello-world, I’m
quite new to this programming language; I hope I don’t bother all of you
too much by posting my low-level question here.

I would like to make a program that iterates recursively through my music
directory, and selects the top 10 most recently played tracks.

I try to use File.atime(path) to check if the current file is accessed
recently enough to get into my top-10, but my problem is that I don’t
know how to (correctly) compare a date to the return value of
File.atime.

File.atime returns a Time object, you can compare it to another Time
object, for example:

if File.atime(path) > (Time.now - 60*60)
  # file has been accessed in the last hour
end
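
As a self-contained sketch of that comparison: the helper name accessed_within? and the two temporary files are made up for illustration, and File.utime is used only to fake the access times.

```ruby
require 'tmpdir'

# Hypothetical helper: true if +path+ was accessed within the last +seconds+.
# Subtracting a number from a Time yields another Time, so ">" works directly.
def accessed_within?(path, seconds)
  File.atime(path) > Time.now - seconds
end

Dir.mktmpdir do |dir|
  fresh = File.join(dir, 'fresh.mp3')
  stale = File.join(dir, 'stale.mp3')
  [fresh, stale].each { |f| File.write(f, '') }

  now = Time.now
  File.utime(now, now, fresh)                 # pretend: accessed just now
  File.utime(now - 2 * 60 * 60, now, stale)   # pretend: accessed two hours ago

  puts accessed_within?(fresh, 60 * 60)   # true
  puts accessed_within?(stale, 60 * 60)   # false
end
```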

But you can also solve the whole problem much easier:

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]
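
A runnable sketch of the same idea, assuming a single '**/*.mp3' pattern is enough. recently_accessed is a hypothetical helper, and the temporary directory and File.utime calls only simulate a music library.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: the +n+ most recently accessed files under +root+
# that match the glob +pattern+, newest first.
def recently_accessed(root, n = 10, pattern = '**/*.mp3')
  Dir.glob(File.join(root, pattern))
     .sort_by { |f| File.atime(f) }   # the key block runs once per path
     .reverse
     .first(n)
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'album'))
  now = Time.now
  # Three placeholder tracks, accessed 0, 1 and 2 hours ago.
  %w[a.mp3 album/b.mp3 album/c.mp3].each_with_index do |name, i|
    path = File.join(root, name)
    File.write(path, '')
    File.utime(now - i * 3600, now, path)
  end

  # prints a.mp3, then album/b.mp3
  puts recently_accessed(root, 2).map { |f| f.sub("#{root}/", '') }
end
```

On Ruby 2.2 or later, max_by(n) { |f| File.atime(f) } should return the same n files without the explicit reverse.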

Quoth Andreas S.:

know how to (correctly) compare a date to the return value of

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]

Or, if it’s deeper than two levels:

require 'find'

Find.find("/share/music/").sort_by { |f| f.atime }.reverse[0...10]

I’m glad you (the OP) are interested in learning more about ruby.

Regards,

Robin Wagenaar wrote:

recently enough to get into my top-10, but my problem is that I don’t
require 'find'
print File.atime(path)," : "
print path,"\n"
end
end

end

Your original question has been answered already, but just a small note:
puts and string interpolation are generally a lot simpler to use for
output:

puts "#{File.atime(path)} : #{path}"

That will also add the “\n” or “\r\n” as appropriate, so you do not need
to.

-Justin

Andreas S. wrote:

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]

this one seems to work

a = Dir['E:/Music/*.mp3'].sort_by{|f| File.atime(f)}.reverse[0,10]
a.each {|f| p f, File.atime(f)}

except on Windows Vista, I noticed 2 things:

  1. The filenames with international characters come out as ???.mp3,
    so File.atime(f) fails on them afterwards.

  2. The access time of a .mp3 or .txt file is unchanged even after the
    file is played, or run (such as by ruby test_dir.rb), or looked at by
    Notepad.

Konrad M. wrote:

Quoth Andreas S.:

know how to (correctly) compare a date to the return value of

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]

Or, if it’s deeper than two levels:

‘**’ takes care of that.

On Oct 13, 7:06 pm, Konrad M. wrote:

require 'find'

Find.find("/share/music/").sort_by { |f| f.atime }.reverse[0...10]

Doesn’t Find.find() require a block? And doesn’t it pass a string to
the block, not a File ?

Hmm… it just occurred to me that many of the solutions presented here
have the flaw of potentially calling File.atime() multiple times for
the same file which would require unnecessary calls to the operating
system to get the access time of the file.

Better to compute atime only once and then sort on the result. In the
case below, I then extract the file name back out of the resulting
array, but in actuality, there may be some value in providing that to
the caller in case they could make use of the access time values.

require 'pp'
require 'find'

def n_recent_files n = 1, path = '.', exts = nil
  paths = []
  Find.find(path) do |p|
    if File.file?(p) &&
       ( exts.nil? ||
         ( (ext = File.extname(p)) &&
           !ext.empty? &&
           exts.include?(ext) ) )
      paths << [ p, File.atime(p).to_i ]
    end
  end
  paths.sort! {|a,b| b[1] <=> a[1] }[0,n].map {|x| x[0]}
end

pp n_recent_files(10, '/home/brian/temp', '.rb')
pp n_recent_files(10, '/home/brian/temp', ['.rb', '.txt'])

Quoth Brian A.:

On Oct 13, 7:06 pm, Konrad M. wrote:

require 'find'

Find.find("/share/music/").sort_by { |f| f.atime }.reverse[0...10]

Doesn’t Find.find() require a block? And doesn’t it pass a string to
the block, not a File ?

Sorry, don’t know anything about it, I was just going by the usage put
forth by the guy in front of me.

First of all: thank you all for the amazing number of replies (in one
night!) and your warm welcome! You’ve definitely helped me solve my
problem, and made me view the problem from a different angle.

Top to bottom:

@Andreas S:
Your solution would indeed be quite a lot simpler than mine! Although
it might not be the most efficient, I will also implement your code,
just because it looks so darn easy! Thanks again!

@Spring Flowers:
Thanks for your concern, I don’t know anything about Vista behaviour. My
OS is Ubuntu, and when rhythmbox runs/opens a file, its atime is set
=).

@Justin:
Thanks a lot for your string interpolation and puts idea! That really
makes life a lot easier!

@Brian:
Wow, you really made my day. That was exactly what I was looking for!
Thanks a lot for the effort, code, and useful explanation! I don’t
really know what to say. Thanks!

On Oct 13, 8:35 pm, “Andreas S.” wrote:

Konrad M. wrote:

Quoth Andreas S.:

know how to (correctly) compare a date to the return value of

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]

Or, if it’s deeper than two levels:

‘**’ takes care of that.

Ah, you’re right! I missed that also (even though the ** line is
highlighted in the pickaxe now that I bother to read Dir#glob :) ).

Before reading your explanation, I put the following together. It
might be a little friendlier for handling multiple extensions I
suppose, especially since the glob pattern isn’t a real regex.

require 'pp'
require 'find'

def n_recent_files n = 1, path = '.', exts = nil
  paths = []
  Find.find(path) do |p|
    if File.file?(p) &&
       ( exts.nil? ||
         ( (ext = File.extname(p)) &&
           !ext.empty? &&
           exts.include?(ext) ) )
      paths << p
    end
  end
  paths.sort! {|a,b| File.atime(b) <=> File.atime(a) }[0,n]
end

pp n_recent_files(10, '/home/brian/temp', '.mp3')
pp n_recent_files(10, '/home/brian/temp', ['.mp3', '.ogg'])

Some notes for the OP:

  1. Welcome to Ruby!
  2. Enumerable#sort_by can be slow in some cases, and by using sort (or
    sort! in this case), the comparison can be reversed to avoid calling
    Array#reverse on the entire array later.
  3. It exhibits a benefit of ‘duck typing’ since both String and Array
    define include? the caller can pass in a single string or an array of
    strings for the extension parameter w/ no extra effort in the
    function.
  4. Although the if expression is complicated, it has the advantage of
    only computing the file extension when necessary.
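
One caveat on note 3, sketched below with made-up helper names (ext_allowed? and ext_allowed_strict? are not part of the code above): String#include? is a substring test while Array#include? is an element test, so the two argument styles are not quite equivalent. Wrapping the argument with Array() is one possible way to make them behave the same.

```ruby
# Hypothetical helper mirroring the extension test in n_recent_files.
def ext_allowed?(ext, exts)
  exts.nil? || exts.include?(ext)
end

puts ext_allowed?('.mp3', '.mp3')            # true  (String#include?)
puts ext_allowed?('.mp3', ['.mp3', '.ogg'])  # true  (Array#include?)
puts ext_allowed?('.mp', '.mp3')             # true: substring match, probably unintended
puts ext_allowed?('.mp', ['.mp3'])           # false

# Array() wraps a lone string in an array, so the test is always
# element equality regardless of what the caller passed.
def ext_allowed_strict?(ext, exts)
  exts.nil? || Array(exts).include?(ext)
end

puts ext_allowed_strict?('.mp', '.mp3')      # false
puts ext_allowed_strict?('.mp3', '.mp3')     # true
```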

Brian

Brian A. wrote:

On Oct 13, 8:35 pm, “Andreas S.” wrote:

Konrad M. wrote:

Quoth Andreas S.:

know how to (correctly) compare a date to the return value of

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{|f|
File.atime(f)}.reverse[0,10]

Or, if it’s deeper than two levels:

‘**’ takes care of that.

Ah, you’re right! I missed that also (even though the ** line is
highlighted in the pickaxe now that I bother to read Dir#glob :) ).

Before reading your explanation, I put the following together. It
might be a little friendlier for handling multiple extensions I

glob can do that easier, too:
Dir['/share/music/**/*.{mp3,m4p}'].sort_by{|f| File.atime f}.reverse[0,10]

I’d be careful with all these optimizations you are suggesting. By far
the slowest part is the recursive traversal of the directory, and you
can’t speed that up. Array#reverse is in a completely different league
and not worth optimizing if you have to sacrifice readability. The
File.atime calls are pretty fast, too (100,000 per second on my old
powerbook).

On Oct 13, 7:50 pm, Brian A. wrote:

Hmm… it just occurred to me that many of the solutions presented here
have the flaw of potentially calling File.atime() multiple times for
the same file which would require unnecessary calls to the operating
system to get the access time of the file.

Really? Which ones? You do realize that #sort_by is explicitly
designed to call the comparison method exactly once for each object,
right?

From the ri docs themselves:

“A more efficient technique is to cache the sort keys (modification
times in this case) before the sort. Perl users often call this
approach a Schwartzian Transform, after Randal Schwartz. We construct
a temporary array, where each element is an array containing our sort
key along with the filename. We sort this array, and then extract the
filename from the result.”

My understanding is that with a directory containing 5,000 MP3s, this
solution:

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{ |f|
  File.atime(f)
}.reverse[0,10]

will call File.atime exactly 5,000 times and create exactly 5,000 Time
instances.

On 14 Oct 2007 at 23:55, Phrogz wrote:

Dir['/share/music/**/*.mp3', '/share/music/*.mp3']

The above has a flaw.
Dir['/share/music/**/*.mp3'] already includes the *.mp3 files in
'/share/music/' itself, so when you add '/share/music/*.mp3' to the
Dir.glob call, you add the *.mp3 files from that directory again, with
the result that they appear twice in your array:

aaa/
aaa/x.mp3
aaa/bbb/
aaa/bbb/y.mp3
aaa/bbb/ccc
aaa/bbb/ccc/z.mp3

p Dir['aaa/**/*.mp3'].sort
#=> ["aaa/bbb/ccc/z.mp3", "aaa/bbb/y.mp3", "aaa/x.mp3"]

p Dir['aaa/**/*.mp3','aaa/*.mp3'].sort
#=> ["aaa/bbb/ccc/z.mp3", "aaa/bbb/y.mp3",
#    "aaa/x.mp3", "aaa/x.mp3"]

Dirk T.

Gavin K. wrote:

My understanding is that with a directory containing 5,000 MP3s, this
solution:

Dir['/share/music/**/*.mp3', '/share/music/*.mp3'].sort_by{ |f|
  File.atime(f)
}.reverse[0,10]

will call File.atime exactly 5,000 times and create exactly 5,000 Time
instances.

won’t it call File.atime(f) (c * n log n) times?
n log n is the big O… O(n log n)… and then c is the constant
depending on the sort algorithm.

On 10/14/07, SpringFlowers AutumnMoon wrote:

instances.

won’t it call File.atime(f) (c * n log n) times?
n log n is the big O… O(n log n)… and then c is the constant
depending on the sort algorithm.

No, sort_by builds a parallel array with the value of each element in
the original collection and uses that array for the sort values.
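
That behaviour can be checked with a small counting sketch. count_key_calls is a made-up helper, and a cheap lambda stands in for File.atime so the example does not touch the filesystem.

```ruby
# Hypothetical helper: sorts +items+ with sort_by and with sort, and returns
# how many times the key function ran in each case.
def count_key_calls(items)
  calls = 0
  key = lambda do |x|
    calls += 1
    -x            # stand-in for an expensive key such as File.atime(path)
  end

  items.sort_by { |x| key.call(x) }
  sort_by_calls = calls

  calls = 0
  items.sort { |a, b| key.call(a) <=> key.call(b) }
  [sort_by_calls, calls]
end

sort_by_calls, sort_calls = count_key_calls((1..1000).to_a.shuffle)
puts "sort_by computed the key #{sort_by_calls} times"  # 1000, once per element
puts "sort computed the key #{sort_calls} times"        # twice per comparison, so far more
```

The comparisons themselves still number on the order of n log n in both cases; with sort_by they just operate on the cached keys.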


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On Oct 14, 7:03 am, “Andreas S.” wrote:

Brian A. wrote:

On Oct 13, 8:35 pm, “Andreas S.” wrote:
I’d be careful with all these optimizations you are suggesting. By far
the slowest part is the recursive traversal of the directory, and you
can’t speed that up. Array#reverse is in a completely different league
and not worth optimizing if you have to sacrifice readability. The
File.atime calls are pretty fast, too (100,000 per second on my old
powerbook).

You’re correct, my bad. Thanks for bringing that to my attention.

  1. The speed difference between Dir (good) and Find (terrible) totally
    dominates. Never underestimate the slowness of Ruby code compared to C
    code running in the interpreter :)

  2. Phrogz is correct regarding Enumerable#sort_by. Since it builds an
    array of tuples first, File.atime is only called once per path. I was
    influenced by misapplying the warning in the pickaxe, but in this
    case, sort_by seems warranted since I was basically doing the same
    thing (building an array of tuples with the sort value) manually - but
    in Ruby instead of C! In fact, if I had bothered to turn the page, the
    example they give is strangely relevant!

  3. Array#reverse is just noise in the profile below, so I should be
    more careful about going out of my way to avoid it.

  4. Sorry Robin, your praise was premature :)

One minor point: I think you may be mistaken regarding the slowest part
being the directory traversal (at least in your code). Both the
sorting and the time comparisons take much longer:

brian@imagine:~/sync/code/ruby$ ruby -r profile andreas.rb
     % cumulative     self               self      total
  time    seconds  seconds    calls   ms/call    ms/call  name
 43.42       1.32     1.32        2    660.00    1385.00  Enumerable.sort_by
 28.95       2.20     0.88    26958      0.03       0.03  Time#<=>
 12.50       2.58     0.38        4     95.00     142.50  Array#each
  7.89       2.82     0.24        2    120.00     120.00  Dir#[]
  6.25       3.01     0.19     5220      0.04       0.04  File#atime
  0.99       3.04     0.03        2     15.00      25.00  Kernel.require
  0.00       3.04     0.00       10      0.00       0.00  Module#class_eval
  0.00       3.04     0.00        9      0.00       0.00  Kernel.singleton_method_added
  0.00       3.04     0.00        3      0.00       0.00  Module#included
  0.00       3.04     0.00        1      0.00       0.00  Module#module_function
  0.00       3.04     0.00        8      0.00       0.00  Class#inherited
  0.00       3.04     0.00       77      0.00       0.00  Module#method_added
  0.00       3.04     0.00        3      0.00       0.00  Module#include
  0.00       3.04     0.00        1      0.00    3010.00  Integer#times
  0.00       3.04     0.00        1      0.00       0.00  Module#attr_accessor
  0.00       3.04     0.00        3      0.00       0.00  Module#append_features
  0.00       3.04     0.00        1      0.00       0.00  Module#private
  0.00       3.04     0.00        5      0.00       0.00  Module#attr_reader
  0.00       3.04     0.00        2      0.00       0.00  Array#[]
  0.00       3.04     0.00        2      0.00       0.00  String#==
  0.00       3.04     0.00        2      0.00       0.00  Array#reverse
 -0.00       3.04    -0.00        2     -0.00    1505.00  Object#n_recent_files
  0.00       3.04     0.00        1      0.00    3040.00  #toplevel

You were correct in my case though since I used the Find library:

     % cumulative     self               self      total
  time    seconds  seconds    calls   ms/call    ms/call  name
 34.05      45.03    45.03    57436      0.78       2.08  Kernel.catch
 23.39      75.96    30.93    16019      1.93       2.79  Dir#each
  5.66      83.44     7.48    16019      0.47       0.47  Dir#open
  5.34      90.50     7.06   220364      0.03       0.03  String#==
  5.27      97.47     6.97        1   6970.00  128230.00  Find.find
  3.71     102.38     4.91    57437      0.09       0.13  Kernel.dup
  1.87     104.85     2.47        1   2470.00    3820.00  Array#sort!
  1.79     107.22     2.37    57435      0.04       0.04  File#join
  1.76     109.55     2.33    57435      0.04       0.04  Array#unshift
  1.72     111.83     2.28    57437      0.04       0.04  String#initialize_copy
  1.63     113.99     2.16    57436      0.04       0.04  File#exist?
  1.60     116.10     2.11    57436      0.04       0.04  File#file?
  1.54     118.14     2.04    57435      0.04       0.04  Kernel.untaint
  1.42     120.02     1.88    57431      0.03       0.03  File#lstat
  1.28     121.71     1.69    57431      0.03       0.03  File::Stat#directory?

On Oct 14, 10:52 am, Phrogz wrote:

On Oct 13, 7:50 pm, Brian A. wrote:

Hmm… it just occurred to me that many of the solutions presented here
have the flaw of potentially calling File.atime() multiple times for
the same file which would require unnecessary calls to the operating
system to get the access time of the file.

Really? Which ones?

Uh, those would be mine :( Moot point though (see other post).

You do realize that #sort_by is explicitly
designed to call the comparison method exactly once for each object,
right?

Actually, I had missed that. Thanks for pointing it out. This seems to
be a case where sort_by is certainly warranted.