Forum: Ruby-core [Backport #2488] thread usage can result in bad HANDLE

Posted by Roger Pack (Guest)
on 2009-12-16 19:37
(Received via mailing list)
Backport #2488: thread usage can result in bad HANDLE
http://redmine.ruby-lang.org/issues/show/2488

Author: Roger Pack
Status: Open, Priority: Normal

require 'thwait'
loop {
 a = []
 10.times { a << Thread.new {}}
 ThreadsWait.all_waits(a)
 print '.'
}


C:\dev\digitalarchive_trunk>ruby -v
ruby 1.9.1p376 (2009-12-07 revision 26041) [i386-mingw32]

C:\dev\digitalarchive_trunk>ruby stress_th.rb
.............[BUG] The handle is invalid.

ruby 1.9.1p376 (2009-12-07 revision 26041) [i386-mingw32]

-- control frame ----------
---------------------------
-- Ruby level backtrace 
information-----------------------------------------

[NOTE]
You may encounter a bug of Ruby interpreter. Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html


This application has requested the Runtime to terminate it in an unusual 
way.
Please contact the application's support team for more information.

Works fine with 1.9.2

Thanks.
-r
Posted by Usaku NAKAMURA (Guest)
on 2009-12-30 18:33
(Received via mailing list)
Issue #2488 has been updated by Usaku NAKAMURA.

Assigned to set to Usaku NAKAMURA
Target version set to 1.9.x

can reproduce with trunk.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by Roger Pack (Guest)
on 2010-04-22 18:24
(Received via mailing list)
Issue #2488 has been updated by Roger Pack.


appears "better" in trunk but not totally fixed, with this version:

ruby 1.9.2dev (2010-04-09 trunk 27265) [i386-mingw32]
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by Roger Pack (Guest)
on 2010-04-22 19:19
(Received via mailing list)
Issue #2488 has been updated by Roger Pack.


from 3183 this code crashes trunk reliably still:

loop {
 all = []
 700.times{|i|
  all << Thread.new{}
   puts i
 }
 all.each(&:join)
}

----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by Luis Lavena (Guest)
on 2010-04-23 07:07
(Received via mailing list)
Issue #2488 has been updated by Luis Lavena.


Roger,

Running r27453 over there, cannot make your example crash (based on bug 
report #3183).

OS: Windows 7, x64, 4GB RAM

----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by _ wanabe (Guest)
on 2010-05-01 14:25
(Received via mailing list)
Issue #2488 has been updated by _ wanabe.

File lock_before_reset.patch added

Roger,
Does this patch fix the issue?

Developers (especially Koichi),
Is this solution right?
(Using GVL, timing of lock, condition, etc.)
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by Roger Pack (Guest)
on 2010-05-04 23:35
(Received via mailing list)
Issue #2488 has been updated by Roger Pack.


That patch seems to fix it for me.

@Luis: I am able to reproduce #3183 (sometimes it takes a few minutes, 
and requires the puts in there).  Ex backtrace:

[BUG] w32_reset_event: The handle is invalid.

ruby 1.9.2dev (2010-05-05 trunk 27620) [i386-mingw32]

-- control frame ----------
---------------------------

[NOTE]
You may have encountered a bug in the Ruby interpreter or extension 
libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html


This application has requested the Runtime to terminate it in an unusual 
way.
Please contact the application's support team for more information.

(Windows 7 32 bit, 4GB RAM).  I assume it's just a timing issue.

-rp
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by U.Nakamura (Guest)
on 2010-05-05 07:44
(Received via mailing list)
Hello,

In message "[ruby-core:30005] [Bug #2488] thread usage can result in bad 
HANDLE"
    on May.05,2010 06:35:12, <redmine@ruby-lang.org> wrote:
| (Windows 7 32 bit, 4GB RAM).  I assume it's just a timing issue.

I think so, too.
wanabe-san, please check in the patch and close #2488 and #3183.


Regards
Posted by _ wanabe (Guest)
on 2010-05-05 13:57
(Received via mailing list)
Issue #2488 has been updated by _ wanabe.

Status changed from Assigned to Closed
% Done changed from 0 to 100

This issue was solved with changeset r27630.
Roger, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

----------------------------------------
http://redmine.ruby-lang.org/issues/show/2488
Posted by Caleb Clausen (Guest)
on 2010-05-05 19:25
(Received via mailing list)
On 5/1/10, _ wanabe <redmine@ruby-lang.org> wrote:
> Issue #2488 has been updated by _ wanabe.
>
> File lock_before_reset.patch added

I have to admit that I've been wondering something about this patch
for days, but I haven't had the nerve to speak up, since I don't
understand win32 programming that well...

Why is it necessary to re-check that intr is set to what it was set to
just two lines before? I assume it might be changed by the action of
another thread, but if that's the case, wouldn't it be better to just
check that field once while inside the global_vm_lock? (ie move the
lock/unlock to be around the whole if statement.)

For that matter, shouldn't ruby be using the api provided by windows
(WaitForSingleObject, apparently) to check the state of this 'event
object' (really a semaphore)? If you do it that way, then maybe the
global_vm_lock doesn't need to be touched.
Posted by wanabe (Guest)
on 2010-05-09 12:03
(Received via mailing list)
Hello,

2010/5/6, Caleb Clausen <vikkous@gmail.com>:
> I don't understand win32 programming that well...

Me too.

> Why is it necessary to re-check that intr is set to what it was set to
> just two lines before? I assume it might be changed by the action of
> another thread, but if that's the case, wouldn't it be better to just
> check that field once while inside the global_vm_lock? (ie move the
> lock/unlock to be around the whole if statement.)

Yes, you are right.
I think first that it is good to avoid native_mutex_lock() as much as 
possible.
Now, I realize the condition (th && !intr) is fulfilled in a only few 
moments.

But if we delete check of outside, new check of mutex's existence will
be needed.
I'm afraid that it doesn't make sense from the standpoint of cost down.

> For that matter, shouldn't ruby be using the api provided by windows
> (WaitForSingleObject, apparently) to check the state of this 'event
> object' (really a semaphore)? If you do it that way, then maybe the
> global_vm_lock doesn't need to be touched.

I guess it is not very good idea.
Because th->native_thread_data.interrupt_event can be destroyed
by native_thread_destroy() between check and lock, but GVL can't.

And if ruby use  'event object', we should insert locking to somewhere
such as thread_cleanup_func(), but lock with GVL is in 
thread_start_func_2()
already. There is not a new cost.

But if you know the use case the lock can be harm, I'm glad to change 
it.
Posted by Caleb Clausen (Guest)
on 2010-05-09 19:17
(Received via mailing list)
On 5/9/10, wanabe <s.wanabe@gmail.com> wrote:
> moments.
>
> But if we delete check of outside, new check of mutex's existence will
> be needed.
> I'm afraid that it doesn't make sense from the standpoint of cost down.

The mutex in this case is global_vm_lock, which always should exist...
shouldn't it? I think you perhaps meant 'new check of
interrupt_event's existence'?

> such as thread_cleanup_func(), but lock with GVL is in thread_start_func_2()
> already. There is not a new cost.

My concern was that interrupt_event is already an event object... But
I misunderstood the intent of accessing the interrupt_event field in
w32_wait_events. I thought it was seeing if any event had been
signalled, but now from your comments, it's clear that it's just
seeing if the event object still exists. Thank you for helping me
understand this. My knowledge of this code is partial; please forgive
my ignorance.

> But if you know the use case the lock can be harm, I'm glad to change it.

Your changes to w32_wait_event might conflict with my attempt to make
thread priorities work again. I need to change all uses of
global_vm_lock to something else. Which is why I was trying to think
of an excuse to get rid of the extra global_vm_lock usage you had
created. Your comments have helped me understand what you did better,
now I see I was going in the wrong direction. I still have to figure
out what to do in w32_wait_event, but you've given me some other
options to think about.
Please log in before posting. Registration is free and takes only a minute.
Existing account (Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
No account? Register here.