All threads deadlocked in JIT

After about a day or so of running, all of the threads within my
trinidad sinatra app get stuck waiting on a lock in what looks like to
be the JIT compiler. I am using:

jruby 1.5.3

java version “1.6.0_18”
OpenJDK Runtime Environment (IcedTea6 1.8.1) (6b18-1.8.1-0ubuntu1)
OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)

Ubuntu 10.04 LTS

All of the threads are stuck at:

java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00007f7cc444d660> (a
org.jruby.libraries.ThreadLibrary$ConditionVariable)
at java.lang.Object.wait(Object.java:502)
at org.jruby.RubyThread.wait_timeout(RubyThread.java:1032)
at
org.jruby.libraries.ThreadLibrary$ConditionVariable.wait_ruby(ThreadLibrary.java:252)
- locked <0x00007f7cc444d660> (a
org.jruby.libraries.ThreadLibrary$ConditionVariable)
at
org.jruby.libraries.ThreadLibrary$ConditionVariable$i_method_0_1$RUBYINVOKER$wait_ruby.call(org/jruby/libraries/ThreadLibrary$ConditionVariable$i_method_0_1$RUBYINVO
KER$wait_ruby.gen:65535) at
org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:630)
at
org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:186)
at
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:309)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:148)
at
ruby.jit.mon_wait_for_cond_487CFC3D3C4B1036F82C96FDEDC17DEDB3B49D48.ensure_1$RUBY$__ensure___0(monitor.rb:266)
at
ruby.jit.mon_wait_for_cond_487CFC3D3C4B1036F82C96FDEDC17DEDB3B49D48.block_0$RUBY$block(monitor.rb:261)
at
ruby.jit.mon_wait_for_cond_487CFC3D3C4B1036F82C96FDEDC17DEDB3B49D48BlockCallback$block_0$RUBY$__block__xx1.call(Unknown
Source)
at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:115)
at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:98)
at org.jruby.runtime.Block.yield(Block.java:194)
at
org.jruby.libraries.ThreadLibrary$Mutex.synchronize(ThreadLibrary.java:195)
at
org.jruby.libraries.ThreadLibrary$Mutex$i_method_0_0$RUBYINVOKER$synchronize.call(org/jruby/libraries/ThreadLibrary$Mutex$i_method_0_0$RUBYINVOKER$synchronize.gen:65
535)
at
org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:115)
at
org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:132)
at
ruby.jit.mon_wait_for_cond_487CFC3D3C4B1036F82C96FDEDC17DEDB3B49D48.file(monitor.rb:256)
at
ruby.jit.mon_wait_for_cond_487CFC3D3C4B1036F82C96FDEDC17DEDB3B49D48.file(monitor.rb)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:187)
at
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:329)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:188)
at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0.block_0$RUBY$block(monitor.rb:95)
at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0BlockCallback$block_0$RUBY$__block__xx1.call(Unknown
Source)
at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:136)
at org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:102)
at org.jruby.runtime.Block.yieldNonArray(Block.java:205)
at org.jruby.RubyObject.yieldUnder(RubyObject.java:507)
at org.jruby.RubyObject.specificEval(RubyObject.java:331)
at org.jruby.RubyObject.instance_eval(RubyObject.java:1285)
at
org.jruby.RubyObject$i_method_multi$RUBYFRAMEDINVOKER$instance_eval.call(org/jruby/RubyObject$i_method_multi$RUBYFRAMEDINVOKER$instance_eval.gen:65535)
at
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:299)
at
org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:117)
at
org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:132)
at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0.file(monitor.rb:95)
at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0.file(monitor.rb)
at
org.jruby.ast.executable.AbstractScript.file(AbstractScript.java:35)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at
ruby.jit.get_tagged_response_80FD41CCC14D1349C5D5092C9836780CB372BED6.file(imap_27688.rb:74)

Any ideas?

thx.

-karl

Hi,

On Thu, Oct 28, 2010 at 19:30, Karl B. [email protected] wrote:

All of the threads are stuck at:

at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0.file(monitor.rb:95)
at
ruby.jit.wait_98EE503714D938C1647978AA5C90B4BCB8092DD0.file(monitor.rb)
at org.jruby.ast.executable.AbstractScript.file(AbstractScript.java:35)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at
ruby.jit.get_tagged_response_80FD41CCC14D1349C5D5092C9836780CB372BED6.file(imap_27688.rb:74)

Is it all backtraces you get? It seems to be waiting
ConditionalVariable in monitor which is used in get_tagged_response
method in imap.rb (line 994). It might be happened when an IMAP client
waits the server responding I think.

I don’t know IMAP programming well but it might help you that checking
IMAP resource management part and tracing how IMAP connection are
handled by logging.

Regards,
// NaHi

Hi. These are indeed all of the backtraces I get.

I was initially thinking that this might be related to JIT as i see a
bunch of JIT methods in the trace and it only starts happening after the
server has been up for a while. I guess this is not the case?

Thanks for your help!

-karl

Hi,

On Thu, Oct 28, 2010 at 22:08, Karl B. [email protected] wrote:

Hi. These are indeed all of the backtraces I get.

Hmm.

I was initially thinking that this might be related to JIT as i see a bunch of
JIT methods in the trace and it only starts happening after the server has been up
for a while. I guess this is not the case?

Bunch of JIT things in backtraces are normal. (Though it might be
changed in 1.6)
Yes, I think it’s not related to JIT.

Regards,
// NaHi

Hi Eagle. The line in question is:

def get_tagged_response(tag)
until @tagged_responses.key?(tag)
raise @exception if @exception
@response_arrival.wait
end
return pick_up_tagged_response(tag)
end

The reason why it is not coming from the imap library is i have applied
the following patch:

http://redmine.ruby-lang.org/repositories/diff/ruby-18?rev=27688

As a result of this issue:

http://redmine.ruby-lang.org/issues/show/3191

That being said, i don’t know if applying that patch is in any way
related to the issue i am running into now.

What is the best way to debug this?

Thanks for your help!

-karl

To echo Hiroshi, it looks like your methods have already been compiled,
and you’re waiting on something else[1]. The line in question appears to
be (from a bit of random googling, I’m not familiar with the code at
all) getting a line from an incoming socket, which would indicate that
your app is sitting patiently waiting for input (a normal state for a
Sinatra app). Are the threads just waiting, or does your app become
unresponsive?

[1] Worth noting, I think, is that you’re waiting, rather than
deadlocked - There’s no lock contention, you just have a bunch of
threads awaiting an event that would indicate that there is work to do.
It’s true that they may never emerge from that state if they are never
notified, but it’s a very different situation from being deadlocked.

Update to this, it looks like half of the threads are actually hanging
in the trace below. Could this have to do with java nio not timing out?

“Thread-9802” daemon prio=10 tid=0x0000000003e71000 nid=0x5e78 runnable
[0x00007f3f6e9c6000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
at
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:83)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00007f3ff95fbe48> (a sun.nio.ch.Util$1)
- locked <0x00007f3ff95fbe30> (a
java.util.Collections$UnmodifiableSet)
- locked <0x00007f3ff95fba70> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
at
org.jruby.ext.openssl.SSLSocket.waitSelect(SSLSocket.java:240)
at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:448)
at
org.jruby.ext.openssl.SSLSocket$i_method_0_1$RUBYINVOKER$sysread.call(org/jruby/ext/openssl/SSLSocket$i_method_0_1$RUBYINVOKER$sysread.gen:65535)
at
org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:630)
at
org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:186)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:146)
at
ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.rescue_1$RUBY$__rescue___0(buffering.rb:35)
at
ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.file(buffering.rb:34)
at
ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.file(buffering.rb)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at
ruby.jit.gets_78EF17A1CC25A7841D76040F4F007ECC23A22E41.file(buffering.rb:106)
at
ruby.jit.gets_78EF17A1CC25A7841D76040F4F007ECC23A22E41.file(buffering.rb)
at
org.jruby.ast.executable.AbstractScript.file(AbstractScript.java:39)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:153)
at
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:309)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:148)
at
ruby.jit.get_response_9C1FCDC0EA0182083B570911F4E3FECA09F08611.file(imap.rb:994)
at
ruby.jit.get_response_9C1FCDC0EA0182083B570911F4E3FECA09F08611.file(imap.rb)
at
org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:289)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:108)
at org.jruby.ast.VCallNode.interpret(VCallNode.java:85)
at org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.RescueNode.executeBody(RescueNode.java:199)
at
org.jruby.ast.RescueNode.interpretWithJavaExceptions(RescueNode.java:118)
at org.jruby.ast.RescueNode.interpret(RescueNode.java:110)
at org.jruby.ast.BeginNode.interpret(BeginNode.java:83)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at org.jruby.ast.WhileNode.interpret(WhileNode.java:131)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at
org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:139)
at
org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:159)
at
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at org.jruby.ast.VCallNode.interpret(VCallNode.java:85)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at
org.jruby.runtime.InterpretedBlock.evalBlockBody(InterpretedBlock.java:373)
at
org.jruby.runtime.InterpretedBlock.yield(InterpretedBlock.java:327)
at org.jruby.runtime.BlockBody.call(BlockBody.java:78)
at org.jruby.runtime.Block.call(Block.java:89)
at org.jruby.RubyProc.call(RubyProc.java:224)
at org.jruby.RubyProc.call(RubyProc.java:207)
at
org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:94)
at java.lang.Thread.run(Thread.java:636)

Thanks!

Hi,

Oh, SSL. And I remember JRUBY-5015 & JRUBY-5019.

On Fri, Oct 29, 2010 at 06:44, Karl B. [email protected] wrote:

Update to this, it looks like half of the threads are actually hanging in
the trace below. Could this have to do with java nio not timing out?

The stack trace looks that the client is waiting for the socket
because the server does not send any data.

Let’s pull things together.

  • Your web application running on Trinidad connects to imap server
    over imaps, right?
  • Your application runs normally for a day or so, and then what’s
    happened? IMAP connection starts failing?
  • Can you show us a small IMAP client script excerpted from your
    application?

Regards,
// NaHi