Bug #2739: ruby 1.8.7 built with pthreads hangs under some circumstances
http://redmine.ruby-lang.org/issues/show/2739
Author: Joel Ebel
Status: Open, Priority: Normal
ruby -v: ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
Ruby 1.8.7 built with pthreads is hanging for me. I can't produce a reproducible testcase, and the problem is intermittent for me as it is, but I have traced it back to the particular patch where it began. The hang happens on an exec, where the ruby process clones itself, and the clone hangs. If I build ruby without pthreads it works fine. Specifically, this is happening on a run of puppet when it is loading facts.
Going back through versions of the 1.8.7 branch, it appears the problem began happening in patchlevel 183 (svn revision 24104). If I try the 1.8 branch, problems begin happening with svn revision 23268 and become more like the current behavior with revision 23305, both of which were merged into 1.8.7 in patchlevel 183 (r 24104).
If I try newer versions of the 1.8 branch, I find that the syntax has changed; however, it's possible that the specific problem I'm experiencing is fixed in r 24400 and/or 24402 (24400 doesn't build for me, so I can't be sure which revision is responsible for the improved behavior).
I will continue trying to create a reproducible test case for this bug, but I hoped that narrowing down where the regression begins would be a helpful place to start.
on 2010-02-11 22:54
on 2010-03-01 23:32
Issue #2739 has been updated by Lucas Nussbaum.
After more investigation (see
https://bugs.launchpad.net/ubuntu/+source/ruby1.8/+bug/520715 for the
details), here are some conclusions.
Using this test case:
<------------------
#!/usr/bin/ruby1.8
%x{/usr/bin/touch /tmp/7777}
puts "executed without timeout ok"
puts "executing with timeout"
require 'timeout'
status = Timeout::timeout(5) {
%x{/usr/bin/touch /tmp/7777}
}
puts "executed with timeout ok"
--------------------------->
The above test case:
- runs fine on Debian unstable (using GLIBC 2.10)
- hangs on Debian unstable using the GLIBC packages from Debian
experimental, version 2.11.0
- hangs on Ubuntu Lucid (which uses GLIBC 2.11.0)
Both Debian unstable and Ubuntu lucid use Ruby 1.8.7 (2010-01-10
patchlevel 249)
By "hangs", I mean:
$ while ruby1.8 te.rb ; do true; done
executed without timeout ok
executing with timeout
executed with timeout ok
executed without timeout ok
executing with timeout
/usr/lib/ruby/1.8/timeout.rb:60: execution expired (Timeout::Error)
from te.rb:11
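The shell loop above stops at the first failing run but does not say how many runs it took. A minimal Ruby sketch of the same idea is below. The method name `runs_until_failure` and its parameters are hypothetical and not part of the original report. It assumes that a failing run exits with a non-zero status, as te.rb does when the Timeout::Error is not rescued.

```ruby
# Hypothetical harness (not from the original report): rerun a command until
# it fails, and report how many runs that took.
#
# command  - array of program name and arguments, e.g. ['ruby1.8', 'te.rb']
# max_runs - upper bound on the number of attempts
#
# Returns the 1-based index of the first failing run, or nil if every run
# within max_runs succeeded.
def runs_until_failure(command, max_runs)
  1.upto(max_runs) do |i|
    # Kernel#system returns true only for exit status 0; a non-zero status
    # (false) or a command that could not be started (nil) counts as failure.
    return i unless system(*command)
  end
  nil
end
```

For example, `runs_until_failure(['ruby1.8', 'te.rb'], 1000)` would give the iteration at which the timeout first occurred, or nil if it never did within 1000 runs.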
It is not clear whether this is a GLIBC or a Ruby issue. However, it
would be fantastic if a Ruby developer with insight into the Ruby
threading code could take a look.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2739
on 2010-03-06 09:28
Issue #2739 has been updated by Alex Legler.
If it's any help: I can confirm this issue on Gentoo as well. After 10-200 iterations, the timeout occurs.
glibc 2.11, ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
On a machine with glibc 2.9, on the other hand, I can run the reproducer for minutes without any failure.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2739
on 2010-03-06 12:12
Issue #2739 has been updated by Motohiro KOSAKI.
Hi,
I'm very glad of your help. If you can reproduce this issue easily, could you please get the stack trace info and give it to us? I think we can use the pstack command:
% pstack [pid-of-ruby]
Thanks.
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2739
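Because the hang involves a forked child, pstack may need to be pointed at more than one PID. The sketch below is one way to find the direct children of a hung ruby process on Linux by scanning /proc. The method name `child_pids` is hypothetical and not from this thread, and the approach assumes a Linux /proc filesystem.

```ruby
# Hypothetical helper (not from this thread): list the direct children of a
# process by scanning /proc. Linux-only; assumes /proc is mounted.
def child_pids(parent_pid)
  Dir.glob('/proc/[0-9]*/stat').map { |path|
    begin
      stat = File.read(path)
    rescue SystemCallError
      next # the process exited while we were scanning
    end
    # Line format: "pid (comm) state ppid ...". The comm field may itself
    # contain spaces or parentheses, so split after the last ')'.
    close = stat.rindex(')')
    next unless close
    fields = stat[(close + 1)..-1].split
    pid = stat.to_i          # leading integer is the PID
    ppid = fields[1].to_i    # fields[0] is the state, fields[1] the parent PID
    pid if ppid == parent_pid
  }.compact.sort
end
```

Each PID returned by `child_pids(pid_of_ruby)` could then be passed to pstack in turn.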
on 2010-03-06 18:26
Issue #2739 has been updated by Lucas Nussbaum.
Ruby is compiled with pthreads enabled on Ubuntu (and Debian), so there are several PIDs of interest here.
Backtraces for the parent PID:
#0 0x00007f508e929c73 in select () from /lib/libc.so.6
#1 0x00007f508f6f2893 in rb_thread_schedule () from /usr/lib/libruby1.8.so.1.8
#2 0x00007f508f709a3c in ?? () from /usr/lib/libruby1.8.so.1.8
#3 0x00007f508f70e6d3 in ?? () from /usr/lib/libruby1.8.so.1.8
#4 0x00007f508f6ef6c1 in ?? () from /usr/lib/libruby1.8.so.1.8
#5 0x00007f508f6ef8b3 in ?? () from /usr/lib/libruby1.8.so.1.8
#6 0x00007f508f6f0578 in ?? () from /usr/lib/libruby1.8.so.1.8
#7 0x00007f508f6f0825 in rb_funcall () from /usr/lib/libruby1.8.so.1.8
#8 0x00007f508f6ebd7d in ?? () from /usr/lib/libruby1.8.so.1.8
#9 0x00007f508f6ed9f7 in ?? () from /usr/lib/libruby1.8.so.1.8
#10 0x00007f508f6e9dea in ?? () from /usr/lib/libruby1.8.so.1.8
#11 0x00007f508f6ecb81 in ?? () from /usr/lib/libruby1.8.so.1.8
#12 0x00007f508f6ecc5b in ?? () from /usr/lib/libruby1.8.so.1.8
#13 0x00007f508f6ef573 in ?? () from /usr/lib/libruby1.8.so.1.8
#14 0x00007f508f6ef8b3 in ?? () from /usr/lib/libruby1.8.so.1.8
#15 0x00007f508f6ec721 in ?? () from /usr/lib/libruby1.8.so.1.8
#16 0x00007f508f6ed066 in ?? () from /usr/lib/libruby1.8.so.1.8
#17 0x00007f508f6ea2f6 in ?? () from /usr/lib/libruby1.8.so.1.8
#18 0x00007f508f6fc85b in ?? () from /usr/lib/libruby1.8.so.1.8
#19 0x00007f508f6fc8a5 in ruby_exec () from /usr/lib/libruby1.8.so.1.8
#20 0x00007f508f6fc8d5 in ruby_run () from /usr/lib/libruby1.8.so.1.8
#21 0x0000000000400911 in main ()
Backtrace for the child PID:
#0 0x00007f508f4a2474 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007f508f4a00c1 in pthread_cond_signal@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x00007f508f6e5e8e in rb_thread_stop_timer () from /usr/lib/libruby1.8.so.1.8
#3 0x00007f508e8f5416 in fork () from /lib/libc.so.6
#4 0x00007f508f70cd20 in ?? () from /usr/lib/libruby1.8.so.1.8
#5 0x00007f508f70e691 in ?? () from /usr/lib/libruby1.8.so.1.8
#6 0x00007f508f6ef6c1 in ?? () from /usr/lib/libruby1.8.so.1.8
#7 0x00007f508f6ef8b3 in ?? () from /usr/lib/libruby1.8.so.1.8
#8 0x00007f508f6f0578 in ?? () from /usr/lib/libruby1.8.so.1.8
#9 0x00007f508f6f0825 in rb_funcall () from /usr/lib/libruby1.8.so.1.8
#10 0x00007f508f6ebd7d in ?? () from /usr/lib/libruby1.8.so.1.8
#11 0x00007f508f6ed9f7 in ?? () from /usr/lib/libruby1.8.so.1.8
#12 0x00007f508f6e9dea in ?? () from /usr/lib/libruby1.8.so.1.8
#13 0x00007f508f6ecb81 in ?? () from /usr/lib/libruby1.8.so.1.8
#14 0x00007f508f6ecc5b in ?? () from /usr/lib/libruby1.8.so.1.8
#15 0x00007f508f6ef573 in ?? () from /usr/lib/libruby1.8.so.1.8
#16 0x00007f508f6ef8b3 in ?? () from /usr/lib/libruby1.8.so.1.8
#17 0x00007f508f6ec721 in ?? () from /usr/lib/libruby1.8.so.1.8
#18 0x00007f508f6ed066 in ?? () from /usr/lib/libruby1.8.so.1.8
#19 0x00007f508f6ea2f6 in ?? () from /usr/lib/libruby1.8.so.1.8
#20 0x00007f508f6fc85b in ?? () from /usr/lib/libruby1.8.so.1.8
#21 0x00007f508f6fc8a5 in ruby_exec () from /usr/lib/libruby1.8.so.1.8
#22 0x00007f508f6fc8d5 in ruby_run () from /usr/lib/libruby1.8.so.1.8
#23 0x0000000000400911 in main ()
I could easily provide you with an Ubuntu lucid chroot (as a tarball) so you can reproduce the issue. I'd just need to use the CPU architecture that you use (i386, amd64?)
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2739