File#flock with LOCK_SH causes Errno::EBADF on Solaris

Hi all,

Ruby 1.8.6
Solaris 10

flock_test.rb

fh = File.open('flock_test.txt', 'w')
fh.puts 'hello'
fh.flock(File::LOCK_SH)
fh.flock(File::LOCK_UN)
fh.close

On Linux no error is raised. On Solaris 10 I get:

flock_test.rb:3:in `flock': Bad file number - flock_test.txt (Errno::EBADF)
        from flock_test.rb:3

What’s the deal?

Thanks,

Dan

“D” == Daniel B. [email protected] writes:

Try it with

D> fh = File.open('flock_test.txt', 'w')

fh = File.open('flock_test.txt', 'w+')
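
For reference, here is a minimal sketch of the whole test with that change applied. The helper name write_with_shared_lock and the scratch file name are made up for illustration. The only substantive difference from the original script is the 'w+' mode, which opens the file read-write instead of write-only.

```ruby
require 'tmpdir'

# Hypothetical helper: writes +text+ to +path+, then takes and releases a
# shared lock on the same handle.
def write_with_shared_lock(path, text)
  # 'w+' truncates like 'w' but also opens the descriptor for reading,
  # which is what a shared lock appears to need on Solaris.
  File.open(path, 'w+') do |fh|
    fh.puts text
    result = fh.flock(File::LOCK_SH)  # File#flock returns 0 on success
    fh.flock(File::LOCK_UN)
    result
  end
end

# Scratch file name chosen for this sketch only.
demo_path = File.join(Dir.tmpdir, 'flock_test_wplus.txt')
write_with_shared_lock(demo_path, 'hello')
```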

Guy Decoux

On Feb 18, 9:48 am, ts [email protected] wrote:

“D” == Daniel B. [email protected] writes:

Try it with

D> fh = File.open('flock_test.txt', 'w')

fh = File.open('flock_test.txt', 'w+')

Yes, that worked. Do Solaris and Linux have different rules regarding
shared file locks? Or is that a bug?

Thanks,

Dan

“D” == Daniel B. [email protected] writes:

D> Yes, that worked. Do Solaris and Linux have different rules regarding
D> shared file locks? Or is that a bug?

bug in Solaris, you want to say? :-)

 EBADF The fildes argument is not a valid open file descriptor;
       or the cmd argument is F_SETLK, F_SETLK64, F_SETLKW, or
       F_SETLKW64, the type of lock, l_type, is a shared lock
       (F_RDLCK), and fildes is not a valid file descriptor open
       for reading; or the type of lock l_type is an exclusive
       lock (F_WRLCK) and fildes is not a valid file descriptor
       open for writing.
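
Going by that excerpt, one portable habit is to pair the lock type with the open mode: shared locks on handles opened for reading, exclusive locks on handles opened for writing. A small sketch follows. The helper names read_locked and append_locked are made up, and it assumes File#flock may end up in fcntl-style locking on the platform.

```ruby
# Hypothetical helper: shared lock on a handle opened read-only.
def read_locked(path)
  File.open(path, 'r') do |fh|
    fh.flock(File::LOCK_SH)
    fh.read
  end # closing the handle releases the lock
end

# Hypothetical helper: exclusive lock on a handle opened for writing.
def append_locked(path, text)
  File.open(path, 'a') do |fh|
    fh.flock(File::LOCK_EX)
    fh.write(text)
  end # buffered data is flushed on close, then the lock goes away
end
```

With this pairing neither of the EBADF conditions quoted above should apply, whichever way the platform implements flock.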

Guy Decoux

On Feb 18, 2008, at 2:09 PM, Douglas Wells wrote:

In any case, right now Ruby seems to just use the OS’s flock
function, and I don’t think that it would be reasonable for the
Ruby implementation to attempt to get around this as it would be
far too messy – and slow. I’d just live with the problem.

http://codeforpeople.com/lib/ruby/posixlock/posixlock-0.0.1/README

you really can’t use flock on *nix systems anyhow, unless you are very
careful - flock is sometimes implemented in terms of posixlock,
sometimes not. when it is not, the locks are incompatible and break on
many filesystems.

fyi

a @ http://drawohara.com/

In article
[email protected],
Daniel B. [email protected] writes:

On Feb 18, 9:48 am, ts [email protected] wrote:

“D” == Daniel B. [email protected] writes:

Try it with

D> fh = File.open('flock_test.txt', 'w')

fh = File.open('flock_test.txt', 'w+')

(I no longer have access to a Solaris machine, so I’m assuming in
my response here that Solaris does require read access for this.)

Yes, that worked. Do Solaris and Linux have different rules regarding
shared file locks? Or is that a bug?

It would appear that Solaris and Linux do, in fact, have different
rules. It’s a bug if you want it to be a bug!

The problem here is that there is no controlling authority. The
flock procedure is not part of the POSIX/UNIX standard. It was,
I believe, created as part of BSD 4.2, and like much non-standardized
software, its documentation is rather loose and the function is
mostly defined by its implementation.

The procedure is defined as part of the Linux standard, but the
documentation there is really just an abstraction of the man page.
Neither the Linux standard nor the BSD documentation specifies
what access (if any) is needed to take a lock (via flock) on a
file. And, of course, Solaris is not bound by the Linux standard.

The “obvious” way to implement flock on a POSIX system is to use
the locking facilities of fcntl(2). But in that case the POSIX
standard requires that the file descriptor be open for reading in
order to take a shared lock. (It is likely that Ruby translated the
‘w’ mode to write-only access, which does not include read access.)
It sounds like Solaris may have taken that path. (There are other,
more subtle differences between the behavior of BSD flock and POSIX
fcntl, but those are not relevant here.)
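
To check the guess about how Ruby maps 'w', the descriptor's access mode can be read back with fcntl(F_GETFL). Here is a sketch using Ruby's standard fcntl extension. The access_mode helper and the scratch file name are invented for this example, and it assumes a Unix-like platform where IO#fcntl is available.

```ruby
require 'fcntl'
require 'tmpdir'

# Hypothetical helper: reports how the descriptor behind +io+ was opened.
def access_mode(io)
  flags = io.fcntl(Fcntl::F_GETFL)
  # O_ACCMODE masks out everything except the read/write access bits.
  case flags & Fcntl::O_ACCMODE
  when Fcntl::O_RDONLY then :rdonly
  when Fcntl::O_WRONLY then :wronly
  when Fcntl::O_RDWR   then :rdwr
  end
end

# Scratch file name chosen for this sketch only.
mode_path = File.join(Dir.tmpdir, 'flock_mode_demo.txt')
mode_w      = File.open(mode_path, 'w')  { |fh| access_mode(fh) }
mode_w_plus = File.open(mode_path, 'w+') { |fh| access_mode(fh) }
puts "w: #{mode_w}, w+: #{mode_w_plus}"
```

If the guess is right, 'w' should report :wronly and 'w+' should report :rdwr, which would explain why only the second satisfies an fcntl-based shared lock.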

Given that the Solaris implementation differs from the original,
presumably defining, BSD implementation, you might report the
problem to Sun to get its view. On the other hand, since flock
has existed in the Solaris system for 15+ years now, I suspect
that the Solaris implementors know about its behavior – and lack
of correspondence to the BSD version. (One could argue that there
is a denial-of-access security issue associated with the BSD/Linux
behavior.)

In any case, right now Ruby seems to just use the OS’s flock
function, and I don’t think that it would be reasonable for the
Ruby implementation to attempt to get around this as it would be
far too messy – and slow. I’d just live with the problem.

 - dmw

In article [email protected], ara howard
[email protected] writes:

On Feb 18, 2008, at 2:09 PM, Douglas Wells wrote:

In any case, right now Ruby seems to just use the OS’s flock
function, and I don’t think that it would be reasonable for the
Ruby implementation to attempt to get around this as it would be
far too messy – and slow. I’d just live with the problem.

http://codeforpeople.com/lib/ruby/posixlock/posixlock-0.0.1/README

Interesting – and very confusing. I (purposely) didn’t look at
the source, but that documentation says that it’s based on fcntl,
yet one interface is named lockf. Unfortunately, in POSIX the
fcntl lock facility and lockf are distinct and behave differently.
So, if the module really has implemented lockf in terms of fcntl,
that’s a big violation of the principle of least surprise.

you really can’t use flock on *nix systems anyhow, unless you are very
careful - flock is sometimes implemented in terms of posixlock,
sometimes not. when it is not, the locks are incompatible and break on
many filesystems.

Can you elaborate on that, please? Do you mean just in Ruby, or
in POSIX systems in general? I don’t understand the latter claim.
POSIX doesn’t allow me to intermix the locking facilities in any
case, but the OS level flock ought to work as long as one only
uses flock in the other programs. And, I’ve never even heard of
anything other than simple implementation bugs in 20+ years of
using it.
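
A rough way to check the flock-only case on a given platform is to hold an exclusive lock in one process and probe it from another with LOCK_NB. The sketch below uses made-up helper names (lock_available_to_child?, probe_flock) and an arbitrary scratch file. It relies on fork, so it assumes a Unix-like system, and it only exercises File#flock against File#flock, not against fcntl or lockf.

```ruby
require 'tmpdir'

# Hypothetical helper: true if a separate process can take an exclusive
# flock on +path+ right now.
def lock_available_to_child?(path)
  pid = fork do
    File.open(path, 'r+') do |fh|
      # With LOCK_NB, File#flock returns false instead of blocking.
      got = fh.flock(File::LOCK_EX | File::LOCK_NB)
      exit!(got ? 0 : 1)
    end
  end
  _, status = Process.wait2(pid)
  status.success?
end

# Hypothetical helper: probes once while the parent holds the lock and once
# after it has been released; returns both answers.
def probe_flock(path)
  File.open(path, 'w+') do |holder|
    holder.flock(File::LOCK_EX)
    while_held = lock_available_to_child?(path)
    holder.flock(File::LOCK_UN)
    after_unlock = lock_available_to_child?(path)
    [while_held, after_unlock]
  end
end

# Scratch file name chosen for this sketch only.
probe_path = File.join(Dir.tmpdir, 'flock_probe_demo.txt')
while_held, after_unlock = probe_flock(probe_path)
puts "available while held: #{while_held}, after unlock: #{after_unlock}"
```

If flock behaves as I would expect, the child is refused while the parent holds the lock and succeeds once it is released. Any other result would suggest a problem with the platform or the filesystem.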

On the other hand, if you mean in Ruby, I’ll note that the Ruby
documentation is quite sparse and uses many terms without defining
them (e.g., modification time, truncate, read permission) and often
explicitly refers to the OS platform documentation (e.g., “On Unix
systems, see chmod(2) for details.”). So, I just assumed that
File#flock meant use the OS flock. If I can’t count on that, that
means that I can’t count on interacting with non-Ruby programs via
the use of any kind of file locking.

The FreeBSD documentation says that all three locking forms interact
(i.e., taking a lock with one type would temporarily block a process
taking another). At least one instance of Linux documentation,
however, notes that there is “no interaction between the types of
lock placed by flock(2) and fcntl(2).” I don’t have time to test
this right now, but I read that as meaning that two separate
processes could concurrently each successfully take a lock on the
same file. If I can’t easily determine which locking method will
be used by Ruby, this situation would make file
locking extremely dodgy and totally non-portable in Ruby.

fyi

Thanks for the info, but now I’m extremely concerned about the
viability of any type of file locking in Ruby. Can you tell me
which versions of Ruby don’t use the OS platform flock in implementing
File#flock?