How to Check if a File Is Busy?

I want to use Ruby to periodically move files out of an FTP upload
folder to another location. Before I move any one file, I need to know
that it is not in the middle of being uploaded.

Other than doing something really barbaric like check the file size,
wait some time period and check the size again, I can’t figure out how
Ruby would know if a file not in use by Ruby itself is “busy.”

Is there something in Ruby to do this, or do I use some OS-level
command?

OS X 10.5. Ruby 1.8.

Thanks for any suggestions.

– gw

Do you require this on a local machine, or only via FTP?

I think FTP is extremely limited - not sure if it allows that much.

Marc H. wrote:

> Do you require this on a local machine, or only via FTP?
>
> I think FTP is extremely limited - not sure if it allows that much.

Sorry, clarification… all ruby work is local. It just so happens that
for the file move, the source folder functions as an FTP target, but
I’ll be working with it locally. So, the entire move process is a local
file system, it’s just that the source files may be in the middle of
being uploaded (they’re large, and take several minutes to transfer in).

– gw

IMHO it’s difficult to detect that. You could use the shell utility
“fuser” if it is available on your platform. In a shell script you could do

  while fuser "$file" >/dev/null; do
    sleep 1
  done
  cp "$file" "$somewhere_else"
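Since the question asks for Ruby, here is a rough Ruby equivalent of that loop. It is only a sketch under a few assumptions: a reasonably modern Ruby (keyword arguments and the `out:`/`err:` options of `system` do not exist in the 1.8 mentioned above), `fuser` on the PATH, and `fuser` exiting with status 0 when some process has the file open. The names `in_use?` and `wait_until_free` are made up for illustration.

```ruby
# Assumption: `fuser` is on the PATH and exits with status 0 when at least
# one process has the file open (the behaviour the shell loop relies on).
def in_use?(path)
  # system returns true on exit status 0, false on a non-zero status and
  # nil if the command could not be run. Note that a missing fuser is
  # therefore reported as "not in use".
  system("fuser", path, out: File::NULL, err: File::NULL) == true
end

# Polls until the probe reports the file as free, or gives up after
# `timeout` seconds. Returns true if the file became free, false otherwise.
def wait_until_free(path, interval: 1, timeout: 600, probe: method(:in_use?))
  deadline = Time.now + timeout
  while probe.call(path)
    return false if Time.now >= deadline
    sleep interval
  end
  true
end
```

A caller would then move the file only when `wait_until_free(path)` returns true. The `probe` parameter is there so a different check (lsof, for instance) can be swapped in without touching the loop.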

Maybe you can use an FTP server that lets you trigger post-processing
after an upload has finished. I am not sure about the FTP library, but
if it contains a server part you might even be able to cook one up
yourself.

Kind regards

robert

You can pass the filename to lsof; this is usually pretty quick.
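For what it's worth, a minimal sketch of calling lsof from Ruby might look like the following. It assumes lsof is installed and uses its `-t` option, which prints only process IDs. The helper names `parse_pids` and `pids_holding` are hypothetical, and `Open3.capture2` needs a Ruby newer than 1.8.

```ruby
require "open3"

# Turns the output of `lsof -t` (one PID per line) into an array of integers.
def parse_pids(output)
  output.split.map { |pid| Integer(pid) }
end

# Returns the PIDs of the processes that have `path` open, or [] if none.
# Assumption: `lsof` is installed; Errno::ENOENT is raised if it is not.
def pids_holding(path)
  output, _status = Open3.capture2("lsof", "-t", File.expand_path(path),
                                   err: File::NULL)
  parse_pids(output)
end
```

A file would then be treated as safe to move when `pids_holding(path).empty?` is true.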

On Apr 28, 2009 10:31 AM, “dragonyy” [email protected] wrote:

On April 28, 5:17 PM, Robert K. [email protected] wrote:
> IMHO it’s difficult to dete…
I once had the same problem when transferring files over SSH, and I
used the external tool ‘lsof’ to report which processes had the target
files open. Maybe you can do it this way, though lsof is sometimes
really slow to return.

regards
dragon

On Wed, Apr 29, 2009 at 01:32:24AM +0900, Joel VanderWerf wrote:

> … open file handle, it will keep writing to the same file. Of course, if the
> writer opens a new file handle each time data comes in, then this won’t
> work. And maybe you don’t want new writes after the file is moved, because
> of your application logic. So probably Robert K.'s suggestion to use
> fuser is apt.

A few other options:

  1. If you control the FTP upload, simply upload to a temporary name.
     Upon completion of the upload, submit an FTP rename/move command to
     rename the temporary name to the final name.

     On the directory monitoring side, only look for files with the
     final name pattern. Since a mv is generally an atomic operation (so
     long as the move is on the same file system), this is probably the
     best solution.

  2. Use the transfer log of your FTP server and monitor it for when a
     file upload is complete. This assumes that the appropriate log
     message is written after the uploaded file is complete, and that
     you are logging.

  3. See if your FTP server has a ‘post upload hook’ or some method of
     executing a predefined script upon upload completion. A quick
     google on this one turned up that Pure-FTPd has a
     ‘--with-uploadscript’ compilation option that allows something
     like this.
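The monitoring side of option 1 could be sketched in Ruby roughly as below. The `.part` suffix, the `TEMP_SUFFIX` constant and the `collect_finished` name are assumptions made for illustration, since the actual naming pattern is whatever the uploader is told to use. It also assumes a Ruby recent enough to have `Dir.children`.

```ruby
require "fileutils"

# Hypothetical naming convention: the uploader writes "name.part" and
# renames it to "name" once the transfer is complete.
TEMP_SUFFIX = ".part"

# Moves every finished upload from `inbox` to `dest` and returns the
# basenames that were moved. Hidden files and files that still carry
# TEMP_SUFFIX are left alone.
def collect_finished(inbox, dest)
  finished = Dir.children(inbox).sort.select do |name|
    File.file?(File.join(inbox, name)) &&
      !name.start_with?(".") &&
      !name.end_with?(TEMP_SUFFIX)
  end
  finished.each do |name|
    FileUtils.mv(File.join(inbox, name), File.join(dest, name))
  end
  finished
end
```

Because the uploader only renames a file after the last byte is written, anything without the temporary suffix can be moved without asking the OS who has it open.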

enjoy,

-jeremy

Greg W. wrote:

> I want to use Ruby to periodically move files out of an FTP upload
> folder to another location. Before I move any one file, I need to know
> that it is not in the middle of being uploaded.
>
> Other than doing something really barbaric like check the file size,
> wait some time period and check the size again, I can’t figure out how
> Ruby would know if a file not in use by Ruby itself is “busy.”

As long as the place you are moving the file to is on the same
partition, why can’t you just move it and let the upload proceed? If the
writer has an open file handle, it will keep writing to the same file.
Of course, if the writer opens a new file handle each time data comes
in, then this won’t work. And maybe you don’t want new writes after the
file is moved, because of your application logic. So probably Robert
Klemme’s suggestion to use fuser is apt.
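The open-handle behaviour described above is easy to try out. The sketch below assumes a POSIX file system (such as the OS X box in the question) and plays both roles itself, acting as the writer and as the mover. The method name `write_across_rename` and the file names are made up for the demonstration.

```ruby
require "tmpdir"

# Writes to a file, renames it mid-write, keeps writing through the same
# handle, and returns what ends up under the new name.
def write_across_rename
  Dir.mktmpdir do |dir|
    old_path = File.join(dir, "upload.bin")
    new_path = File.join(dir, "moved.bin")

    File.open(old_path, "w") do |writer|
      writer.write("first half, ")
      writer.flush
      File.rename(old_path, new_path)   # same directory, so same file system
      writer.write("second half")       # still lands in the same file
    end

    File.read(new_path)
  end
end
```

On such a system the method returns "first half, second half": the rename only changes the directory entry, while the writer's handle keeps pointing at the same file.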

Hey everyone, thanks for the input.

My FTP server does indeed support post processing. Should have thought
of that. However, I’d rather not couple this particular process to the
FTP server if possible.

I read up on fuser and lsof, and will experiment with those options
first.

I have some options now, so thanks.

– gw

You can “easily” write a small C snippet that checks whether something
happens to a file, wrap it in a Ruby class, and use it in your Ruby
program.

Here is a sketch…

——

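#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#define TMOUT_SEC  2
#define TMOUT_NSEC 0
#define MAX_EVENTS 16

enum action { ADD, DELETE };

/* Changes waiting to be handed to the kernel on the next kevent() call. */
static struct kevent ch[MAX_EVENTS];
static int nchanges = 0;

static void readable_fd(int fd)
{
    printf("File_fd to read %d\n", fd);
}

static void writable_fd(int fd)
{
    printf("File_fd to write %d\n", fd);
}

/* Queue the addition or removal of a filter on fd. */
static void update_fd(int fd, enum action action, int filter)
{
    if (nchanges >= MAX_EVENTS)
        return;
    EV_SET(&ch[nchanges], fd, filter,
           action == ADD ? EV_ADD : EV_DELETE, 0, 0, NULL);
    nchanges++;
}

//**********************************************************************
// This handles events. It blocks on kqueue for at most TMOUT_SEC seconds.
//**********************************************************************
static void handle_events(int kq)
{
    struct kevent ev[MAX_EVENTS];
    struct timespec timeout = { TMOUT_SEC, TMOUT_NSEC };
    int i;
    int n;

    n = kevent(kq, ch, nchanges, ev, MAX_EVENTS, &timeout);
    nchanges = 0;   /* the queued changes have been submitted */

    if (n < 0) {
        perror("kevent");   /* what kind of error */
        exit(EXIT_FAILURE);
    }
    if (n == 0) {
        printf("Timed out, no events\n");
        return;
    }

    for (i = 0; i < n; i++) {
        if (ev[i].flags & EV_ERROR)
            exit(EXIT_FAILURE);   /* error */

        if (ev[i].filter == EVFILT_READ)
            readable_fd((int)ev[i].ident);
        if (ev[i].filter == EVFILT_WRITE)
            writable_fd((int)ev[i].ident);
    }
}

int main(int argc, char *argv[])
{
    int kq;
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return EXIT_FAILURE;
    }
    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }
    kq = kqueue();
    if (kq < 0) {
        perror("kqueue");
        return EXIT_FAILURE;
    }
    printf("Start of program\n");
    /* EVFILT_VNODE with NOTE_WRITE | NOTE_RENAME is another filter to
       consider when watching a regular file for writes or renames. */
    update_fd(fd, ADD, EVFILT_READ);
    handle_events(kq);
    return 0;
}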

Then you need to wrap it in a Ruby class.

See the Ruby book for how to write a C extension…

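#include "ruby.h"
#include <stdlib.h>

static ID id_push;

/* MyTest#initialize: the kqueue would be created here. */
static VALUE t_init(VALUE self)
{
    return self;
}

/* MyTest#notify(obj): the file named by obj would be watched here. */
static VALUE t_notify(VALUE self, VALUE obj)
{
    return Qnil;
}

VALUE cTest;

void Init_my_test(void)
{
    /* Ruby class names are constants, so they must start with a capital. */
    cTest = rb_define_class("MyTest", rb_cObject);
    rb_define_method(cTest, "initialize", t_init, 0);
    rb_define_method(cTest, "notify", t_notify, 1);
    id_push = rb_intern("push");
}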

Then compile and use

——

require 'my_test'

t = MyTest.new

# look for close or write or name change…

while …
  t.notify("filename")
end