Deleting of files older than a stipulated date

Apologies for my multiple posts in this forum; I'm rushing a project but
am still quite new to Ruby…

Anyway, I would like to select files to delete based on their
datestamped folders.
For example, I would like to delete files which are older than 5 business
days (i.e. Saturdays and Sundays are not counted). Currently my code
looks like this:
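require 'fileutils'

def delFiles
  # Pair each path with its selection pattern. Block-local names are used so
  # that the global arrays are not overwritten while iterating.
  $del_path.zip($del_selection).each do |path, selection|
    del = File.join path, selection
    puts "Files/Folders Deleted: #{del}"
    FileUtils.rm_r Dir.glob(del)
  end #each
end #delFiles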

I'm actually using arrays to hold my del_path and del_selection
because I need to delete multiple directories or files all at once.
Is there any way I can build on my code so that only files older than
5 business days get deleted (bearing in mind that file names may contain
just the date, e.g. 20080331, or the date with other characters,
e.g. risk20080331)?
Much help is really appreciated. :wink:

On Thu, Apr 17, 2008 at 4:53 AM, Clement Ow
[email protected] wrote:

> sd_a.each do |sd|
> only files older than 5 business days (bearing in mind that files not
> only contain the date eg. 20080331 but can also come with other
> characters eg. risk20080331) get deleted.
> Much help is really appreciated. :wink:

If you mean that the date is part of the name, I have a script that
does something similar. Let me paste you the relevant parts (this is
not a complete program, it's just the part that makes the checks
against the dates).

Here I receive in params the list of folders to process,
and the age of files for zipping and deleting. So, for example,
to say "delete files older than 15 days, zip files older than 7 days"
I pass delete=15, zip=7. In this part I calculate the dates to
check against:

folders = params[:directories].values
have_to_zip = params[:zip].given?
zip = params[:zip].value
if have_to_zip
  zip_date = DateTime.now - zip
end
delete = params[:delete].value
delete_date = DateTime.now - delete

Now I build a regexp to match the file names and extract the date from
them:

regexp = Regexp.compile(/^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/)
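As a quick check of what that pattern captures, here is a small sketch. The file names are made up for illustration, and `extract_date` is a hypothetical helper that is not part of the script itself.

```ruby
require 'date'

# Same pattern as above: a leading YYYY-MM-DD date, anything, then .log or .log.gz
LOG_NAME = /^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/

# Hypothetical helper: returns [date, already_gzipped], or nil when the
# name does not match the pattern.
def extract_date(name)
  match = LOG_NAME.match(name)
  return nil unless match
  [Date.parse(match[1]), !match[2].nil?]
end

# Made-up file names, purely for illustration.
["2008-03-31-server.log", "2008-03-31-server.log.gz", "notes.txt"].each do |name|
  p [name, extract_date(name)]
end
```

The second capture group is what the script later uses (`!match[2]`) to avoid gzipping a file that is already compressed.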

Now I traverse the folders, trying to match the filenames against the
regexp. When I find a match, I check the date in the name against the
delete date and the zip date and act accordingly, storing info for a
report:

fileData = Struct.new(:name, :size)
deleted_files = []
zipped_files = []

folders.each do |folder|
  Find.find(folder + "/") do |file|
    match = regexp.match(File.basename(file));
    if match
      file_date = DateTime.parse(match[1])
      size = File.stat(file).size
      if delete_date > file_date
        deleted_files << fileData.new(file,size)
        File.delete(file)
      elsif have_to_zip && zip_date > file_date && !match[2]
        zipped_files << fileData.new(file,size)
        `gzip -f #{file}`
      end
    end
  end
end

If you want to check the actual modification date of the file,
I think File.stat can help with that. I don't know whether Find.find
has an option to search for files based on date.
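For the modification-time variant, a minimal sketch using `File.mtime` (which gives the same value as `File.stat(path).mtime`) might look like the following. `files_older_than` is a hypothetical helper name, and it only lists candidates rather than deleting them.

```ruby
require 'find'

# Hypothetical helper: returns the regular files under +folder+ whose
# modification time is more than +days+ days before +now+.
def files_older_than(folder, days, now = Time.now)
  cutoff = now - days * 24 * 60 * 60   # Time arithmetic is in seconds
  old = []
  Find.find(folder) do |path|
    next unless File.file?(path)       # skip directories and other entries
    old << path if File.mtime(path) < cutoff
  end
  old
end
```

Something like `files_older_than(some_folder, 15).each { |f| puts f }` would then print the candidates, where `some_folder` stands for whatever directory is being cleaned.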

Hope this helps.

Jesus.

Jesús Gabriel y Galán wrote:

> If you want to check the actual modification date of the file,
> I think File.stat can help with that. I don't know whether Find.find
> has an option to search for files based on date.
>
> Hope this helps.
>
> Jesus.

Are you using a hash to contain the file paths that you want to delete?
It pretty much works about the same as the array that I have created eh?
btw, is it possible to attach your script so that I can use it for
reference? Thanks!

On Thu, Apr 17, 2008 at 11:19 AM, Clement Ow
[email protected] wrote:

> > Jesus.
>
> Are you using a hash to contain the file paths that you want to delete?
> It pretty much works about the same as the array that I have created eh?
> btw, is it possible to attach your script so that I can use it for
> reference? Thanks!

I'm using the "main" gem to process input parameters. The folders
variable is an array, I think.
Here is the full script (it uses an erb template to compose the email
report, which I don't think is interesting):

require 'date'
require 'find'
require 'simplemail'
require 'main'
require 'erb'

# the next two are just for the number_to_human_size method

require 'action_controller'
require 'action_view'
include ActionView::Helpers::NumberHelper

TEMPLATE_FILE = File.join(File.dirname(__FILE__),
                          "deletelogs_template.erb")

main {
  description <<-DESC
    Deletes or gzips jhub log files older than the specified dates,
    searching the specified directories
    recursively. The files should match this regexp
    /^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/
  DESC

  option("zip", "z") {
    argument :required
    description "Zip files older than the specified number of days"
    cast :int
  }
  option("delete", "d") {
    argument :required
    defaults 7
    description "Delete files older than the specified number of days.
If --zip option is specified, only delete the files that are in between
both dates"
    cast :int
  }
  argument("directories") {
    arity -2
  }

  def disk_usage
    `df -h`
  end

  def run
    folders = params[:directories].values
    have_to_zip = params[:zip].given?
    zip = params[:zip].value
    if have_to_zip
      zip_date = DateTime.now - zip
    end
    delete = params[:delete].value
    delete_date = DateTime.now - delete

    usage_before = disk_usage
    regexp = Regexp.compile(/^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/)

    fileData = Struct.new(:name, :size)
    deleted_files = []
    zipped_files = []

    folders.each do |folder|
      Find.find(folder + "/") do |file|
        match = regexp.match(File.basename(file));
        if match
          file_date = DateTime.parse(match[1])
          size = File.stat(file).size
          if delete_date > file_date
            deleted_files << fileData.new(file,size)
            File.delete(file)
          elsif have_to_zip && zip_date > file_date && !match[2]
            zipped_files << fileData.new(file,size)
            `gzip -f #{file}`
          end
        end
      end
    end
    usage_after = disk_usage
    report = ERB.new(File.read(TEMPLATE_FILE), nil, "%<>")
    report = report.result(binding)
    SimpleMail.deliver_simple('email address', 'email address',
                              'Delete old log files', report)
  end
}

SimpleMail is just a class that inherits ActionMailer and has the smtp
info and a simple method to compose the email.
Any comment on how to make this more efficient or better is
appreciated…

Jesus.

Currently this is what my code looks like:

delete=delete + 2
folders = $del_path
delete_date = DateTime.now - delete

regexp = Regexp.compile(/(\d{4}\d{2}\d{2})/)

fileData = Struct.new(:name, :size)
deleted_files = []

folders.each do |folder|
  Find.find(folder + "/") do |file|
    match = regexp.match(File.basename(file));
    if match
      puts file_date = DateTime.parse(match[1])
      size = File.stat(file).size
      if delete_date > file_date
        deleted_files << fileData.new(file,size)
        puts "delete files: #{file} size: #{size} bytes"
        #File.delete(file)
      end
    end
  end
end

Currently, when I specify the folder path as C:/Test, it searches
for files in C:/Test/New as well. Is there any way to stop the command
from traversing subfolders when matching the regexp (i.e. only match
files in C:/Test if the path specified is C:/Test)? Thanks in advance!

Thanks Jesus! I modified the code a bit and it is currently a handy
deletion script:

delete=5

# I add 2 to avoid counting the weekend, as I only want to delete files
# older than 5 business days, i.e. not including Sat and Sun.
delete=delete + 2
folders = $del_path
delete_date = DateTime.now - delete

regexp = Regexp.compile(/(\d{4}\d{2}\d{2})/)

fileData = Struct.new(:name, :size)
deleted_files = []

folders.each do |folder|
  Find.find(folder + "/") do |file|
    match = regexp.match(File.basename(file));
    if match
      puts file_date = DateTime.parse(match[1])
      size = File.stat(file).size
      if delete_date > file_date
        deleted_files << fileData.new(file,size)
        puts "delete files: #{file} size: #{size} bytes"
        #File.delete(file)
      end
    end
  end
end

However, there will be flaws and bugs when I change it to, say, delete = 2…
Any suggestions for improving the code are most welcome :wink:
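Adding a fixed 2 days is only right when the window crosses exactly one weekend. That holds for 5 business days counted from a weekday, but not for smaller or larger values, which is likely where the trouble with delete = 2 comes from. One alternative, sketched here on the assumption that only Saturdays and Sundays are skipped (public holidays are ignored), is to walk back one calendar day at a time and count only weekdays. `business_days_ago` is a hypothetical helper name.

```ruby
require 'date'

# Hypothetical helper: the date that lies +n+ business days before +from+.
# Saturdays and Sundays are skipped; holidays are not considered.
def business_days_ago(n, from = Date.today)
  date = from
  while n > 0
    date -= 1
    n -= 1 unless date.saturday? || date.sunday?
  end
  date
end

delete_date = business_days_ago(5)
# A file is then a deletion candidate when the date in its name is older
# than the cutoff, e.g.:
#   file_date = Date.parse(match[1])
#   ... delete when file_date < delete_date
```

This replaces the `delete + 2` adjustment entirely, so the same code works for any number of business days.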

On Mon, Apr 21, 2008 at 9:10 AM, Clement Ow
[email protected] wrote:

> folders.each do |folder|
>   Find.find(folder + "/") do |file|
>     […]
>   end
> end
>
> Currently, when I specify the folder path as C:/Test, it searches
> for files in C:/Test/New as well. Is there any way to stop the command
> from traversing subfolders when matching the regexp (i.e. only match
> files in C:/Test if the path specified is C:/Test)? Thanks in advance!

You mean you want just to list the first level, without recursively
processing the subfolders? Then you don't need the Find module, you
can use:

Dir.glob(folder + "/*") do |file|

This will list subfolders too, so you can skip them with:

  next if File.directory? file
  […]
end
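Put together, a non-recursive version of the scan might look like the following sketch. It reuses the eight-digit date pattern from the earlier post. `dated_files_in` is a hypothetical helper name, and it only collects candidates instead of deleting them.

```ruby
require 'date'

DATE_IN_NAME = /(\d{4}\d{2}\d{2})/   # e.g. 20080331 or risk20080331

# Hypothetical helper: files directly inside +folder+ (subfolders are not
# entered) whose name contains a date older than +cutoff+.
def dated_files_in(folder, cutoff)
  found = []
  Dir.glob(File.join(folder, "*")) do |file|
    next if File.directory?(file)
    match = DATE_IN_NAME.match(File.basename(file))
    next unless match
    begin
      file_date = Date.strptime(match[1], "%Y%m%d")
    rescue ArgumentError
      next   # eight digits that do not form a valid date
    end
    found << file if file_date < cutoff
  end
  found
end
```

Keeping the deletion itself out of the helper makes it easy to print the list first and only call `File.delete` once the output looks right.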

If you want more fine-grained control over which subfolders to
recurse into, you can use Find.prune. This is an example that achieves
the same as above:

irb(main):046:0> Find.find("/home/jesus/") do |file|
irb(main):047:1*   if ((File.directory? file) && !(file == "/home/jesus/"))
irb(main):048:2>     Find.prune
irb(main):049:2>   end
irb(main):050:1>   puts file
irb(main):051:1> end

Hope this helps,

Jesus.

> Hope this helps,
>
> Jesus.

Thanks for taking time to help me with my script. I really appreciate
it!

Cheers,
Clement :wink: