1.9.x Thread problem

Hi All,

Here’s a tricky one for you:

require 'thread'

THREADS = 10
WASTE = 10000

pid = fork() do
  queue = Queue.new
  THREADS.times do |i|
    queue << Thread.new do
      # wasting actual cpu time, rather than sleeping
      Thread.current[:index] = i
      n = rand(WASTE)
      Thread.current[:n] = n
      d = 1.0 + 1/n.to_f
      e = 1.0
      n.times { e = e * d }
    end
  end

  THREADS.times do
    th = queue.pop
    puts "Joining thread #{th[:index]}, n=#{th[:n]}"
    th.join()
  end

end

Process.wait(pid)

If I run this using ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux] I
get the output I would expect:
Joining thread 0, n=8767
Joining thread 1, n=2401
Joining thread 2, n=514
Joining thread 3, n=5227
Joining thread 4, n=658
Joining thread 5, n=8618
Joining thread 6, n=2356
Joining thread 7, n=7431
Joining thread 8, n=8426
Joining thread 9, n=588

If I run this using either ruby 1.9.1p0 (2009-01-30 revision 21907)
[i686-linux] or ruby 1.9.2dev (2009-03-05 trunk 22769) [i686-linux] I
get results I would not expect:
Joining thread , n=
Joining thread 1, n=1131
Joining thread 2, n=8233
Joining thread 3, n=8696
Joining thread 4, n=7684
Joining thread 5, n=286
Joining thread 6, n=6618
Joining thread 7, n=2814
Joining thread 8, n=5862
Joining thread 9, n=6307

My best guess is that something similar to the fix for this bug
http://redmine.ruby-lang.org/issues/show/657 is happening, but elsewhere
in the code. Does anyone have any ideas?

=======================================================================
This email, including any attachments, is only for the intended
addressee. It is subject to copyright, is confidential and may be
the subject of legal or other privilege, none of which is waived or
lost by reason of this transmission.
If the receiver is not the intended addressee, please accept our
apologies, notify us by return, delete all copies and perform no
other act on the email.
Unfortunately, we cannot warrant that the email has not been
altered or corrupted during transmission.

Hi,

At Mon, 9 Mar 2009 09:25:06 +0900,
Michael M. wrote in [ruby-talk:330697]:

> If I run this using either ruby 1.9.1p0 (2009-01-30 revision 21907)
> [i686-linux] or ruby 1.9.2dev (2009-03-05 trunk 22769) [i686-linux] I
> get results I would not expect:
> Joining thread , n=

It hasn’t been defined which thread, the created child thread
or the creating parent thread, runs first at thread creation.
In 1.8, the child runs first by chance.
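
To make the point concrete, here is a minimal sketch (not from the original posts) that forces the "parent runs first" ordering deterministically. The helper name `thread_local_before_and_after` and the `gate` queue are made up for illustration; the child is held back on purpose so that the parent reads the thread-local before the child has assigned it.

```ruby
require 'thread'

# Hypothetical helper: returns [value seen before the child ran its
# assignment, value seen after the child finished].
def thread_local_before_and_after
  gate = Queue.new

  child = Thread.new do
    gate.pop                      # block until the parent releases us
    Thread.current[:index] = 0    # only now is the thread-local set
  end

  before = child[:index]          # the child cannot have assigned it yet
  gate << :go                     # let the child proceed
  child.join
  [before, child[:index]]
end

before, after = thread_local_before_and_after
puts "before: #{before.inspect}, after: #{after.inspect}"
```

Without the gate, either ordering is possible, which is why the first line of output may or may not show the values.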

> My best guess is that something similar to the fix for this bug
> http://redmine.ruby-lang.org/issues/show/657 is happening, but elsewhere
> in the code. Does anyone have any ideas?

No, that issue is unrelated.

Nobuyoshi N. wrote:

> It hasn’t been defined which thread, the created child thread
> or the creating parent thread, runs first at thread creation.
> In 1.8, the child runs first by chance.

Can you then explain why it runs consistently like this or a way in
which I can guarantee I am running in the first created child thread? I
would have expected to be inside of the child thread because the code
setting up the :index is inside the block passed to Thread.new(). And I
only put the created child Thread instances in the Queue, so I should
only get child Thread instances when I call Queue#pop.

On Sun, 08 Mar 2009 21:05:58 -0500, Michael M. wrote:

> only put the created child Thread instances in the Queue, so I should
> only get child Thread instances when I call Queue#pop.

I’m really not sure what you’re asking. All you get from Queue#pop is
child threads. The thing is that the thread scheduler decides (and it is
quite out of your control) which thread will have a chance to execute
next. So apparently Ruby 1.8 always chose to let the child threads have a
chance before you reached the second THREADS.times loop. Ruby 1.9, on the
other hand, isn’t giving the child threads a chance to run any code until
you hit the th.join() call. Thus, when you try to access th[:index] and
th[:n], the child threads haven’t had a chance to update them yet. Once
you join thread 0, the wait takes long enough that the other threads also
get a chance to execute and set values for th[:index] and th[:n], and
that’s why those values are available.

But there are no guarantees about who gets to execute when, and you
shouldn’t count on any. This is not a bug.

Unfortunately, I don’t have any ideas about a good way to ensure that the
threads execute far enough to have set th[:index] and th[:n] before you
print them.
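
One possible approach, offered only as a sketch and not as something proposed in the thread: let each child push itself onto the queue after it has set its thread-locals, instead of having the parent push the Thread object at creation time. The helper name `join_when_ready` and the short sleep standing in for real work are assumptions for illustration. Note that the threads are then popped in the order they became ready, not in creation order.

```ruby
require 'thread'

# Hypothetical helper: starts `count` threads and joins each one only
# after it has announced that its thread-locals are set.
def join_when_ready(count)
  ready = Queue.new

  count.times do |i|
    Thread.new do
      th = Thread.current
      th[:index] = i
      ready << th          # announce readiness only after th[:index] is set
      sleep(0.01)          # stand-in for real work
    end
  end

  Array.new(count) do
    th = ready.pop         # blocks until some thread has announced itself
    index = th[:index]     # guaranteed to be set at this point
    th.join
    index
  end
end

puts join_when_ready(10).sort.inspect
```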

I have read and re-read this about 10 times now and this comment makes
it appear that the contract for Thread.new {block}, where the block is
run exclusively in the newly created thread, is violated.

> Can you then explain why it runs consistently like this or a way in
> which I can guarantee I am running in the first created child thread?
> I would have expected to be inside of the child thread because the
> code setting up the :index is inside the block passed to
> Thread.new(). And I only put the created child Thread instances in
> the Queue, so I should only get child Thread instances when I call
> Queue#pop.

require 'thread'
THREADS = 10

parent_id = Thread.current.object_id

queue = Queue.new
THREADS.times do |i|
  queue << Thread.new do
    th = Thread.current
    th[:index] = i
    raise "Running inside parent" if th.object_id == parent_id
    sleep(2)
  end
end

THREADS.times do
  th = queue.pop
  raise "Running inside parent" if th.object_id == parent_id
  puts "Joining thread #{th[:index]}"
  th.join()
end

I have simplified and changed things. I am quite certain that the
parent thread is not being joined or indeed added to the queue as no
exceptions are raised (unless this is a bad test) and I still get the
same (strange) output. Any other ideas?

Here is a bare solution using a mutex and a counter. The next thing
would be to use a condition variable to put the main thread to sleep
while the child threads initialize, rather than spinning wheels /
burning rubber while waiting.

Also look into monitor.rb, sync.rb, and producer/consumer strategies.

require 'thread'
THREADS = 10

parent_id = Thread.current.object_id

mutex = Mutex.new
num_threads_ready = 0

queue = Queue.new
THREADS.times do |i|
  queue << Thread.new do
    th = Thread.current
    th[:index] = i
    mutex.synchronize { num_threads_ready += 1 } # <-----
    raise "Running inside parent" if th.object_id == parent_id
    sleep(2)
  end
end

# wait until threads have initialized
until mutex.synchronize { num_threads_ready == THREADS }
  Thread.pass
end

THREADS.times do
  th = queue.pop
  raise "Running inside parent" if th.object_id == parent_id
  puts "Joining thread #{th[:index]}"
  th.join()
end
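
For reference, the condition-variable variant mentioned above might look roughly like the following. This is a sketch under assumptions, not the poster's code: the helper name `start_and_wait`, the `all_ready` variable, and the short sleep are illustrative. The parent sleeps on the condition variable instead of spinning with Thread.pass, and the `until` loop guards against waking up before the counter has reached the target.

```ruby
require 'thread'

# Hypothetical helper: starts `count` threads, sleeps until all of them
# have set their thread-locals, then joins them in creation order.
def start_and_wait(count)
  mutex = Mutex.new
  all_ready = ConditionVariable.new
  ready = 0

  threads = Array.new(count) do |i|
    Thread.new do
      Thread.current[:index] = i
      mutex.synchronize do
        ready += 1
        all_ready.signal if ready == count   # the last thread wakes the parent
      end
      sleep(0.01)                            # stand-in for real work
    end
  end

  # Re-check the counter after every wakeup; wait releases the mutex
  # while sleeping and re-acquires it before returning.
  mutex.synchronize do
    all_ready.wait(mutex) until ready == count
  end

  threads.map do |th|
    index = th[:index]
    th.join
    index
  end
end

puts start_and_wait(10).inspect
```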

> chance to execute and set values for th[:index] and th[:n], and that’s
> why those values are available.
>
> But there’s no guarantees about who gets to execute when, and you
> shouldn’t count on any. This is not a bug.
>
> Unfortunately, I don’t have any ideas about a good way to ensure that the
> threads execute far enough to have set th[:index] and th[:n] before you
> print them.

Ahhh, now I understand and yes, you’re quite right. Sigh. If I join
the thread before printing the index, then everything works as
expected. Back to square one with my other problem… But I haven’t
come up with something simple that displays the right behaviour, so I’ll
give that a bit more thought before I post that question. Thanks for
your help.
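
A short sketch of the join-first ordering, for anyone finding this thread later. The helper name `collect_results` and the doubled return value are made up for illustration. Thread#value joins the thread and returns the block's result, so reading the thread-local afterwards is safe.

```ruby
# Hypothetical helper: returns [index, block result] pairs, reading the
# thread-local only after the thread has finished.
def collect_results(count)
  threads = Array.new(count) do |i|
    Thread.new do
      Thread.current[:index] = i
      i * 2                    # the block's return value
    end
  end

  threads.map do |th|
    doubled = th.value         # value waits for the thread to finish
    [th[:index], doubled]      # the thread-local is set by now
  end
end

puts collect_results(3).inspect
```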
