Generate unique filenames


#1

Hi list,

What is the best/easiest way to generate unique filenames? I couldn’t
find such a method in Ruby :(


#2

On Tue, May 02, 2006 at 01:17:13AM +0900, 13 wrote:

What is the best/easiest way to generate unique filenames ? Couln’t
find such method in Ruby :frowning:

# note: another process can still create the file between this check and first use
filename = rand(1_000_000).to_s

filename = rand(1_000_000).to_s while File.exist?(filename)

puts "#{filename} is unique"

Cheers,
Phil


#3

I’d do something w/ rand() or do a hash of the current system time.
Something like that.
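
A minimal sketch of the "hash of the current system time" idea, for illustration only. The helper name hashed_name and its prefix/ext parameters are made up for this example; the pid and a random number are mixed in because the clock alone can repeat between two fast calls.

```ruby
require 'digest/sha2'

# Hypothetical helper: hashes the clock, the pid and a random number
# and keeps the first 16 hex digits as the unique part of the name.
def hashed_name(prefix = 'file', ext = '.tmp')
  seed = "#{Time.now.to_f}-#{Process.pid}-#{rand}"
  "#{prefix}_#{Digest::SHA256.hexdigest(seed)[0, 16]}#{ext}"
end

puts hashed_name
```

This only makes a collision unlikely; it does not reserve the name on disk.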


#4

On Tue, 2 May 2006, 13 wrote:

Hi list,

What is the best/easiest way to generate unique filenames ? Couln’t
find such method in Ruby :frowning:

i use this

harp:~ > cat a.rb
require 'socket'
require 'time'
require 'tmpdir'

class File
  def self.tmpnam opts = {}, &b
    dir = opts['dir'] || opts[:dir] || Dir.tmpdir
    seed = opts['seed'] || opts[:seed] || $0
    path =
      "%s_%s_%s_%s_%d" % [
        Socket.gethostname,
        seed,
        Process.pid,
        Time.now.iso8601(2),
        rand(101010)
      ]
    dirname, basename = split path
    tn = join(dir, dirname,
              basename.gsub(%r/[^0-9a-zA-Z]/,'_')).gsub(%r/\s+/, '_')
    tn = expand_path tn
    b ? open(tn,'w+', &b) : tn
  end
end

p File.tmpnam
File.tmpnam{|f| f.write 42; p f.path}

harp:~ > ruby a.rb
"/tmp/harp_ngdc_noaa_gov_a_rb_22545_2006_05_01T10_44_44_86_06_00_34185"
"/tmp/harp_ngdc_noaa_gov_a_rb_22545_2006_05_01T10_44_44_86_06_00_80469"

it’s unique even on network filesystems.

-a


#5

2006/5/1, 13 removed_email_address@domain.invalid:

Hi list,

What is the best/easiest way to generate unique filenames ? Couln’t
find such method in Ruby :frowning:

You can use tempfile
http://ruby-doc.org/stdlib/libdoc/tempfile/rdoc/index.html
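
A short sketch of typical Tempfile usage, assuming the standard tempfile library; the basename 'report' is an arbitrary choice for this example.

```ruby
require 'tempfile'

# 'report' is only a basename; Tempfile adds a unique suffix itself.
tmp = Tempfile.new('report')
begin
  tmp.write("hello\n")
  tmp.flush
  puts tmp.path            # a path under Dir.tmpdir by default
  puts File.size(tmp.path) # the data is on disk after the flush
ensure
  tmp.close
  tmp.unlink               # remove the file now rather than at GC or exit
end
```

Closing and unlinking explicitly keeps the cleanup independent of when the object is garbage collected.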

Kind regards

robert


#6

13 wrote:

Tempfile creates an empty file (in /tmp by default) even if I don’t
write in it. It’s not very good IMO. I rather choose some self made
random number based solution. It would be great if there were some
Tempfile class method like Tempfile.name(basename, tmpdir=Dir::tmpdir)
which returns just a name without creating a file.

You’re entitled to your opinion… but…

The way that you’re asking for it to work is badly broken, because in
that case, you may attempt to use the filename and find that it’s no
longer unique.
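
The problem described here is a check-then-use race. One way around it, sketched below under the assumption of a local filesystem where exclusive creation behaves as documented, is to pick the name and create the file in a single step. The helper create_unique and its parameters are hypothetical, not part of any library.

```ruby
require 'tmpdir'
require 'securerandom'

# Hypothetical helper: picks a random name and creates it in one step.
# File::EXCL makes the open fail with Errno::EEXIST if the name is taken,
# so checking for the name and claiming it are not separate operations.
def create_unique(dir = Dir.tmpdir, prefix = 'uniq', attempts = 10)
  attempts.times do
    path = File.join(dir, "#{prefix}_#{SecureRandom.hex(8)}")
    begin
      return File.open(path, File::WRONLY | File::CREAT | File::EXCL, 0o600)
    rescue Errno::EEXIST
      next # somebody else owns that name; try another one
    end
  end
  raise "no free name found after #{attempts} attempts"
end

file = create_unique
puts file.path
file.close
File.delete(file.path)
```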

–Steve


#7

On 5/1/06, 13 removed_email_address@domain.invalid wrote:

Hi,

Tempfile creates an empty file (in /tmp by default) even if I don’t
write in it. It’s not very good IMO. I rather choose some self made
random number based solution. It would be great if there were some
Tempfile class method like Tempfile.name(basename, tmpdir=Dir::tmpdir)
which returns just a name without creating a file.

It’s creating the file immediately to prevent other processes or
threads from creating/accessing the same filename. Depending on what
you’re trying to do, that could be very handy.


#8

On Tue, 2 May 2006, Bill G. wrote:

threads from creating/accessing the same filename. Depending on what
you’re trying to do, that could be very handy.

it seems like that, but it does not. File::EXCL is broken on many filesystems
and, worse, fails silently. one needs to use a tmpnam algorithm with
hostname/pid/time/rand to minimize the chances of dup names or use a form of
locking which works on all filesystems. afaik the only portable atomic fs
operation is ‘link’ and, ergo, the locking mechanism must be built on top of
this. my lockfile class has an impl if you’re interested - just wanted to point
out that the file creation in tempfile.rb is anything but atomic and fails
silently when atomicity fails.
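
A rough sketch of what a ‘link’ based lock can look like, assuming a POSIX filesystem with hard links. This is not the lockfile class mentioned above; link_lock is a hypothetical helper written only for this example. It writes a privately named file, hard-links it to the lock name, and then trusts the link count rather than the return status of the link call.

```ruby
require 'socket'
require 'tmpdir'

# Hypothetical helper: returns true if lock_path was acquired, false otherwise.
def link_lock(lock_path)
  # A private file whose name is unlikely to collide with another process.
  unique = "#{lock_path}.#{Socket.gethostname}.#{Process.pid}.#{rand(1_000_000)}"
  File.open(unique, 'w') { |f| f.puts Process.pid }
  begin
    File.link(unique, lock_path) # raises if the lock name already exists
  rescue SystemCallError
    # The status is not trusted here; the link count below decides.
  end
  File.stat(unique).nlink == 2   # two names for our file: the link succeeded
ensure
  File.delete(unique) if unique && File.exist?(unique)
end

lock = File.join(Dir.tmpdir, "demo_#{Process.pid}.lock")
puts link_lock(lock) # first attempt, nothing holds the lock yet
puts link_lock(lock) # second attempt while the lock file still exists
File.delete(lock)
```

Releasing the lock is simply deleting the lock file; detecting stale locks is left out of this sketch.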

fyi.

-a


#9

Hi,

Tempfile creates an empty file (in /tmp by default) even if I don’t
write to it. That’s not very good IMO. I’d rather choose some self-made,
random-number-based solution. It would be great if there were a
Tempfile class method like Tempfile.name(basename, tmpdir=Dir::tmpdir)
which returns just a name without creating a file.

Thanks to all who helped !


Martins


#10

On May 01, 2006, at 19:33, 13 wrote:

I rather choose some self made random number based solution.

Don’t be shy, use the big guns:

http://www.ietf.org/rfc/rfc4122.txt

http://sporkmonger.com/projects/uuidtools/
http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/Ruby/UuidGenerator
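
For illustration, a sketch of a UUID-based name. It assumes a Ruby whose standard library includes securerandom with SecureRandom.uuid (a random, version 4 UUID), which is not one of the libraries linked above; the "upload_" prefix and ".dat" extension are arbitrary.

```ruby
require 'securerandom'

# Assumption: the standard securerandom library, not the linked UUID libraries.
name = "upload_#{SecureRandom.uuid}.dat"
puts name
```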

Cheers


#11

On May 1, 2006, at 10:33 AM, 13 wrote:

Hi,

[snipping all around]

I rather choose some self made random number based solution.


Martins

Rather than trying to set up a random number based solution, how about
using the fairly common (on *nix) uuidgen program?

Regarding Tempfile, empty files are usually fairly cheap. I think the
guarantee that your file will be available is definitely something to
consider.

Paul K.


#12

On Monday 01 May 2006 12:00 pm, removed_email_address@domain.invalid wrote:

it seems like that, but it does not. File::EXCL is broken on many
filesystems and, worse, fails silently. one needs to use a tmpnam
algorithm with hostname/pid/time/rand to minimize the chances of dup names
or use a form of locking which works on all filesystems. afaik the only
portable atomic fs operation is ‘link’ and, ergo, the locking mechanism
must be built on top of this. my lockfile class has an impl if your

The link implementation in your lockfile lib won’t work right on, at the very
least, Windows XP.

I have a modified version of it in my IOWA dev tree that, if not told what
style of locking to do, tests the matter when a lockfile is first created and
attempts to guess at the appropriate flavor of lockfile.

It is, currently, a less than wholly elegant integration with a somewhat
simplified version of your lockfile.rb, but it works and passes its tests on
WinXP as well as the couple different flavors of Linux I have tested on.

Kirk H.


#13

On Tue, 2 May 2006, Kirk H. wrote:

least Windows XP.

I have a modified version of it in my IOWA dev tree that, if not told what
style of locking to do, tests the matter when a lockfile is first created and
attempts to guess at the appropriate flavor of lockfile.

It is, currently, a less than wholly elegant integration with a somewhat
simplified version of your lockfile.rb, but it works and passes its tests on
WinXP as well as the couple different flavors of Linux I have tested on.

cool. i’ve never even attempted this on any windowsy platform. maybe we
should merge? a good lockfile class is a must for any production system, as
anyone who’s ever come back from a 4 day weekend to find 4000 stacked and
hung crontab processes knows ;)

cheers.

-a


#14

On 5/1/06, removed_email_address@domain.invalid wrote:

cool. i’ve never even attempted this on any windowsy platform. maybe we
should merge? a good lockfile class is a must for any production system, as
anyone who’s even come back from a 4 days weekend to find 4000 stacked and
hung crontab processes knows :wink:

Perhaps it could replace tempfile? Anyway, thanks for the heads up,
didn’t know about the problems with it (obviously).


#15

Steve, you’re right !

If I need to be that bulletproof, I will keep a list of the
filenames I have created and delete them at some point.

Thanks for pointing this problem out to me.


Martins


#16

2006/5/1, Paul K. removed_email_address@domain.invalid:

Rather than trying to setup a random number based solution, how about
using the fairly common (on *nix) uuidgen program?

Regarding Tempfile, empty files are usually fairly cheap. I think the
guarantee that your file will be available is definitely something to
consider.

Other advantages:

  • tempfile ships with every ruby install
  • tempfiles are guaranteed to be removed on process termination
  • if you are going to need it anyway, you may as well create it
    directly; remember, you can control the point in time when the
    tempfile instance is created and thus when the file will be created
  • tempfile is for “free” while you’ll have to maintain every homegrown
    solution

Cheers

robert


#17

On Sat, 6 May 2006, Robert K. wrote:

  • tempfile ships with every ruby install
  • tempfiles are guaranteed to be removed on process termination
  • if you are going to need it anyway, you can as well create it
    directly, remember, you can control the point in time when the
    tempfile instance is created and thus when the file will be created
  • tempfile is for “free” while you’ll have to maintain every homegrown
    solution

mostly i agree with all of this. however, i’ve had serious issues with
tempfile because the removal guarantee is not quite true. in particular, a
process that dies under ‘sig 9’ or ‘exit!’ leaves its tempfile behind:

sig -9:

  harp:~/.mp3/kcrw > ruby -r tempfile -e'  puts Tempfile.new($$.to_s){|f| f.puts $$}.path; Process.kill -9, $$ '
  /tmp/2901929019.0
  Killed

  harp:~/.mp3/kcrw > ls /tmp/2901929019.0
  /tmp/2901929019.0

exit!:

  harp:~ > ruby -r tempfile -e'  puts Tempfile.new($$.to_s){|f| f.puts $$}.path; exit! '
  /tmp/2916129161.0

  harp:~ > ls /tmp/2916129161.0
  /tmp/2916129161.0

so you can fill up /tmp with a sick process. i now use this code as part of a
method which generates a tmpdir that i use to get a clean workspace. tmpdirs
are also removed after program exit, and it has a very important feature: it
knows its naming scheme and can ‘clean up’ after itself. in other words, each
time i generate a tmpdir, old dead ones are looked for and blown away: it
implements its own tmpwatch. so, for most code tempfile is ok, but people
should be aware that it is not ‘guaranteed’ to clean up after itself and its
mkdir based atomic creation is not nfs safe at all - eg. you can end up with
two tempfiles with the same name on nfs.

anyhow, here’s a bit of the code i now use, snipped out of my personal lib
(alib - on rubyforge):

  #
  # generate a temporary filename for the directory dir using seed as a
  # basename
  #
  def tmpnam(*argv)
  #--{{{
    args, opts = argv_split argv
    dirname = argv.shift || getopt(%w(dir base prefix), opts, '.')
    seed = getopt 'seed', opts, prognam
    reap = getopt 'reap', opts, true

    dirname = File.expand_path dirname

    seed = seed.gsub(%r/[^0-9a-zA-Z]/,'_').gsub(%r/\s+/, '')
    host = hostname.gsub(%r/\./, '_')

    if reap
      begin
        baseglob = "%s__*__*__*__%s" % [ host, seed ]
        glob = File.join(dirname, baseglob)
        host_re = %r/^#{ host }$/
        candidates = Dir[glob]
        candidates.each do |candidate|
          basename = File.basename candidate
          parts = basename.split %r/__/, 5
          if parts[0] =~ host_re
            pid = Integer parts[1]
            unless alive? pid
              FileUtils.rm_rf candidate
            end
          end
        end
      rescue => e
        warn(errmsg(e)) rescue nil
      end
    end

    basename =
      "%s__%s__%s__%s__%s" % [
        host,
        Process::pid,
        timestamp('nospace' => true),
        rand,
        seed,
      ]

    File.join(dirname, basename)
  #--}}}
  end
  export 'tmpnam'

this name is then combined with my lockfile lib, which atomically creates
files - even on nfs - to make tempfiles or tempdirs.

the key is that it looks for old tempfiles and ‘reaps’ them iff the name
indicates the file was created on this host (otherwise we cannot signal the
other process to see if it’s still alive). the reason i’m pointing all this
out is that one could easily wrap such behaviour on top of the built-in
tempfile and it’s a valid concern in 24x7 production systems which must not
fill disk.
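
The snippet above relies on helpers from alib that are not shown. As a rough, self-contained sketch of the reaping idea only: alive? below is a hypothetical stand-in for the helper of the same name, reap is a made-up function, and the "<host>__<pid>__..." naming is a simplified version of the scheme above. It assumes a POSIX system where signal 0 can be used to probe a pid.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the alive? helper used in the code above.
def alive?(pid)
  Process.kill(0, pid) # signal 0 delivers nothing, it only probes the pid
  true
rescue Errno::ESRCH
  false                # no such process
rescue Errno::EPERM
  true                 # the process exists but belongs to another user
end

# Removes entries named "<host>__<pid>__..." in dir whose pid is gone.
# Entries carrying another host name are left alone, since their
# processes cannot be probed from here.
def reap(dir, host)
  Dir.glob(File.join(dir, "#{host}__*__*")).each do |path|
    match = File.basename(path).match(/\A#{Regexp.escape(host)}__(\d+)__/)
    next unless match
    FileUtils.rm_rf(path) unless alive?(match[1].to_i)
  end
end

puts alive?(Process.pid) # probing the current process
```

A pid can be reused by an unrelated process, so a check like this can keep a dead entry around longer than necessary, but it should not remove one whose owner is still running.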

regards.

-a