On Mon, 6 Oct 2008, Yukihiro M. wrote:
|When I Process.fork’ed I saw this…
|evil_fork.rb:39: Mon Oct 06 14:51:19 +1300 2008 Inconsistent Mutex unlocked
|
|ie. I could be accessing $state when it is in an inconsistent state
|and the Mutex doesn’t protect me.
I am not sure what you meant here. It worked as I expected. You
didn’t wrap state(m) by synchronize, so that they are not mutually
exclusive. What did you expect out of the script?
state(m) merely reports the value of $state and whether the mutex was
locked or not.
For the time $state is “Inconsistent”, the mutex should be in a locked
state, which it is when viewed by any other thread in the same
process.
However, if you fork a process, the mutex in the child process is in
the unlocked state whilst the resource is still in the inconsistent
state.
The usual pattern for a lock/unlock pair is to be wrapped round some
access to a shared resource.
In this case the shared resource is $state.
Let us make that more explicit. Suppose we are transferring money from
one account to another…
require 'thread'

Thread.abort_on_exception = true
STDOUT.sync = true

$account_a = 100
$account_b = 100
$total = $account_a + $account_b
$mutex = Mutex.new

def log(msg,level=1)
  puts "\n#{caller(0)[level]}:#{Time.now} #{msg}"
end

def invariant_check
  if $total == ($account_a + $account_b)
    log( "We are in a consistent state", 2)
  else
    log( "We are in an inconsistent state", 2)
  end
end

def transfer( sum)
  log "At the start of transaction the invariant holds $account_a + $account_b == 200"
  invariant_check
  $mutex.synchronize do
    log "Got lock"
    $account_a = $account_a - sum
    log " For the next 10 seconds we have lost money from our system. We are inconsistent."
    sleep 10
    $account_b = $account_b + sum
    log "Ah! Their it is again. We're consistent again."
  end
  log "Invariant holds at end"
  invariant_check
end


t1 = Thread.new do
  log "Sleep 4 to ensure we wait for other"
  sleep 4
  log "Try get lock, can't since t2 has it. #{$mutex.locked?}"
  $mutex.synchronize do
    log "Only unblocks after 12 seconds into the program"
    invariant_check
    log "Release lock"
  end
  log "t1 exits"
end

sleep 1

t2 = Thread.new do
  log "t2 grabs lock immediately and holds for 10"
  transfer(50)
  log "t2 exits"
end

sleep 1

pid = Process.fork do
  log "Forked process wakes and sleeps 5"
  sleep 5
  log "By now t2 has the lock, but will try get it anyway"
  log( "Looky the lock is free") if !$mutex.locked?
  $mutex.synchronize do
    log "What! it Unblocks immediately!"
    log "Announces we're inconsistent!"
    invariant_check
    log "Relinquish lock"
  end
  log "exit process"
end

log "Wait for process"
p Process.waitpid2 pid

log "Wait for t1"
t1.join

log "Wait for t2"
t2.join
Then the output is…
ruby -w fork.rb
fork.rb:40:Mon Oct 06 17:23:25 +1300 2008 Sleep 4 to ensure we wait for other
fork.rb:54:Mon Oct 06 17:23:26 +1300 2008 t2 grabs lock immediately and holds for 10
fork.rb:24:in `transfer':Mon Oct 06 17:23:26 +1300 2008 At the start of transaction the invariant holds $account_a + $account_b == 200
fork.rb:25:in `transfer':Mon Oct 06 17:23:26 +1300 2008 We are in a consistent state
fork.rb:27:in `transfer':Mon Oct 06 17:23:26 +1300 2008 Got lock
fork.rb:29:in `transfer':Mon Oct 06 17:23:26 +1300 2008 For the next 10 seconds we have lost money from our system. We are inconsistent.
fork.rb:62:Mon Oct 06 17:23:27 +1300 2008 Forked process wakes and sleeps 5
fork.rb:75:Mon Oct 06 17:23:27 +1300 2008 Wait for process
fork.rb:42:Mon Oct 06 17:23:29 +1300 2008 Try get lock, can't since t2 has it. true
fork.rb:64:Mon Oct 06 17:23:32 +1300 2008 By now t2 has the lock, but will try get it anyway
fork.rb:65:Mon Oct 06 17:23:32 +1300 2008 Looky the lock is free
fork.rb:67:Mon Oct 06 17:23:32 +1300 2008 What! it Unblocks immediately!
fork.rb:68:Mon Oct 06 17:23:32 +1300 2008 Announces we're inconsistent!
fork.rb:69:Mon Oct 06 17:23:32 +1300 2008 We are in an inconsistent state
fork.rb:70:Mon Oct 06 17:23:32 +1300 2008 Relinquish lock
fork.rb:72:Mon Oct 06 17:23:32 +1300 2008 exit process
[15355, #<Process::Status: pid=15355,exited(0)>]
fork.rb:78:Mon Oct 06 17:23:32 +1300 2008 Wait for t1
fork.rb:32:in `transfer':Mon Oct 06 17:23:36 +1300 2008 Ah! Their it is again. We're consistent again.
fork.rb:44:Mon Oct 06 17:23:36 +1300 2008 Only unblocks after 12 seconds into the program
fork.rb:34:in `transfer':Mon Oct 06 17:23:36 +1300 2008 Invariant holds at end
fork.rb:45:Mon Oct 06 17:23:36 +1300 2008 We are in a consistent state
fork.rb:35:in `transfer':Mon Oct 06 17:23:36 +1300 2008 We are in a consistent state
fork.rb:46:Mon Oct 06 17:23:36 +1300 2008 Release lock
fork.rb:56:Mon Oct 06 17:23:36 +1300 2008 t2 exits
fork.rb:48:Mon Oct 06 17:23:36 +1300 2008 t1 exits
fork.rb:81:Mon Oct 06 17:23:36 +1300 2008 Wait for t2
======================================================================
Where the crucial lines are…
fork.rb:65:Mon Oct 06 17:23:32 +1300 2008 Looky the lock is free
fork.rb:67:Mon Oct 06 17:23:32 +1300 2008 What! it Unblocks immediately!
fork.rb:68:Mon Oct 06 17:23:32 +1300 2008 Announces we're inconsistent!
fork.rb:69:Mon Oct 06 17:23:32 +1300 2008 We are in an inconsistent state
fork.rb:70:Mon Oct 06 17:23:32 +1300 2008 Relinquish lock
The solution provided by POSIX is pthread_atfork. From its man page:
NAME
       pthread_atfork - register handlers to be called at fork(2) time

SYNOPSIS
       #include <pthread.h>

       int pthread_atfork(void (*prepare)(void), void (*parent)(void),
                          void (*child)(void));

DESCRIPTION
       "pthread_atfork" registers handler functions to be called just
       before and just after a new process is created with
       "fork"(2). The 'prepare' handler will be called from the parent
       process, just before the new process is created. The 'parent'
       handler will be called from the parent process, just before
       "fork"(2) returns. The 'child' handler will be called from the
       child process, just before "fork"(2) returns.

       One or several of the three handlers 'prepare', 'parent' and
       'child' can be given as "NULL", meaning that no handler needs
       to be called at the corresponding point.

       "pthread_atfork" can be called several times to install several
       sets of handlers. At "fork"(2) time, the 'prepare' handlers are
       called in LIFO order (last added with "pthread_atfork", first
       called before "fork"), while the 'parent' and 'child' handlers
       are called in FIFO order (first added, first called).

       To understand the purpose of "pthread_atfork", recall that
       "fork"(2) duplicates the whole memory space, including mutexes
       in their current locking state, but only the calling thread:
       other threads are not running in the child process. The
       mutexes are not usable after the "fork" and must be initialized
       with 'pthread_mutex_init' in the child process. This is a
       limitation of the current implementation and might or might
       not be present in future versions.
In my example, that means the Mutex may be held in the parent process
for the lifetime of the child, while the child's copy is left unlocked.
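For illustration, here is a minimal sketch of how the prepare/parent/child
handler idea could be imitated by hand in Ruby. The helper name
fork_with_lock is hypothetical and not part of Ruby. The sketch assumes a
platform where Process.fork is available, a Ruby recent enough to have
Mutex#owned? (2.0 or later), and that the forking thread still owns the lock
in the child. It takes the lock before forking, so the fork cannot happen
while another thread is in the middle of an update, and then releases the
lock on both sides.

```ruby
require 'thread'

# Hypothetical helper (not part of Ruby's API): fork only while holding the mutex.
def fork_with_lock(mutex)
  mutex.lock                        # "prepare": wait until no thread is mid-update
  begin
    Process.fork do
      mutex.unlock                  # "child": release the child's copy of the lock
      yield
    end
  ensure
    mutex.unlock if mutex.owned?    # "parent": release the lock after forking
  end
end

account_a = 100
account_b = 100
mutex = Mutex.new
holding = Queue.new
reader, writer = IO.pipe

worker = Thread.new do
  mutex.synchronize do
    account_a -= 50
    holding << true                 # tell the main thread the lock is taken
    sleep 0.5                       # the accounts are inconsistent during this time
    account_b += 50
  end
end
holding.pop                         # wait until the worker really holds the lock

pid = fork_with_lock(mutex) do
  reader.close
  writer.puts(account_a + account_b)  # report the total the child sees
  writer.close
end
writer.close
child_total = reader.read.to_i
Process.wait(pid)
worker.join
puts "child saw total #{child_total}"
```

Because the fork waits for the transfer to finish, the child should report a
total of 200 rather than 150. The cost is that every fork has to go through
the helper and know about every mutex that guards shared state, which is the
bookkeeping pthread_atfork handlers are meant to centralise.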
John C. Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : [email protected]
New Zealand