Finding Duplicate MP3s

I’m a RubyNoobie and I am writing a few random scripts to learn the
language. Would it be alright to post scripts to get them criticized by
pros?

This is supposed to search a folder to find duplicate MP3s and move them
to a destination directory.

*When you search a directory structure with lots of branches then
mp3info gives an error.
*The source and destination folders NEED to have the backslash \ or else
the script does horrible things. So it needs better input checking

I don’t always know the best programmer practices and it would help me
if someone can make suggestions.

Finds duplicate MP3s and move them to stated destination

usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir\

the ending \ slash is REQUIRED

require “mp3info”
require “ftools”

class DuplicateMP3
def initialize()
unless ARGV.length == 2
puts “usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir”
exit
end
@dir_source = ARGV[0]
@dir_destination = ARGV[1]

showallduplicates()

end

def showallduplicates()
directory = @dir_source
duplicates = Hash.new { |h,k| h[k] = [] }
Dir.chdir(@dir_source)

#Check if the file is a file then
#Make a unique ID with the mp3 title and mp3 song length combined
#Enter unique into the hash as a key
#If unique key is already in the hash then move the file to destination
folder

Dir["**/*.mp3"].each do |file|
  if File.file?(file)
    Mp3Info.open(file) do |mp3|
      unique = mp3.tag.title.to_s + mp3.length.to_s
      if (duplicates.has_key?(unique))
        puts "duplicate: #{file}. Moving..."
        movedupes(file)
      else
          duplicates[unique].push(file)
      end
    end
  end
end

end

def movedupes(file)
source = @dir_source + file.to_s
dest = @dir_destination + File.basename(file.to_s)
File.move(source, dest)
puts “file moved”
end

end

find = DuplicateMP3.new()

On Nov 30, 2007 7:16 PM, Khurrum Ma [email protected] wrote:

I don’t always know the best programmer practices and it would help me

Sure.

class DuplicateMP3
def initialize()
unless ARGV.length == 2
puts "usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir"
exit
end
@dir_source = ARGV[0]
@dir_destination = ARGV[1]

showallduplicates()

Meh. I’d pass source and dest to initialize directly. I’d call
showallduplicates() on the DuplicateMP3 object directly. If you really
want, add a class method like

def self.do_it(src, dest)
mover = new(src,dest)
mover.showallduplicates()
end

Maybe. I’m shaky on this actually needing to be a class, but I guess it
works okay…

end

folder

Here’s the biggest thing: look into using Find (in the standard
library.)
It’s almost always superior to Dir, unless you really don’t want to
recurse
at all.

Dir[“**/*.mp3”].each do |file|
if File.file?(file)
Mp3Info.open(file) do |mp3|
unique = mp3.tag.title.to_s + mp3.length.to_s

I’d use [mp3.tag.title.to_s, mp3.length] instead. It’s certainly a
corner
case, but it’d be a shame if you had “song” that was 210 long treated as
a
dupe of “song2”, 10 long.

     if (duplicates.has_key?(unique))
       puts "duplicate: #{file}. Moving..."
       movedupes(file)
     else
         duplicates[unique].push(file)

Meh again. Use a regular hash, and replace this line with
“duplicates[unique] = true” since you don’t ever use the values of the
duplicates hash.

     end
   end
 end

end
end

def movedupes(file)

And the solution to your input problem is:
“source = File::join(@dir_source, file.to_s)”
If you use Find, it’ll be a little easier here, and tricker in the
destination, because Find will give you the full path anyway.

source = @dir_source + file.to_s

Instead of File.basename, you’ll want to use something else. If you’re
using Find, you’ll get reasonably full paths, so you might be able to do
something like
file.to_s.sub(%r{^#{Regexp::escape(@dir_source)}}, “”)
which will find the source path and replace it with an empty string. I
don’t know a better way work with the paths, honestly.
And you’d use File::join again here, as well.

dest = @dir_destination + File.basename(file.to_s)
File.move(source, dest)
puts “file moved”
end

end

find = DuplicateMP3.new()

Ultimately, I think you want to have this look like
if($0 == FILE)
#Put the ARGV.length check here
find = DuplicateMP3.new(ARGV[0], ARGV[1])
find.showallduplicates()
end

Judson L. wrote:

On Dec 3, 2007 4:55 PM, Judson L. [email protected] wrote:

Thank you very much. This is incredibly helpful. I’m sure this will be
useful to other beginners as well.

On Dec 3, 2007 4:55 PM, Judson L. [email protected] wrote: