Re: JRuby-Rack JMS support - RubyThread leak

caefer · November 17, 2009, 4:12pm

Hi,
I spent four hours on Saturday to debug the very same leak… For now I
patched JRuby, so it ditches terminated threads from the
threadContextMap every once in a while. The leak seems to be fixed. Yes,
I also noticed that ThreadContext refers back to RubyThread, which may
prevent the stuff from being GCed. Is this a pretty nasty JRuby bug?

Here’s how the threads get registered by JRuby-Rack/JMS:
org.jruby.internal.runtime.ThreadService.registerNewThread(ThreadService.java:197)
org.jruby.RubyThread.adoptThread(RubyThread.java:226)
org.jruby.RubyThread.adopt(RubyThread.java:218)
org.jruby.internal.runtime.ThreadService.adoptCurrentThread(ThreadService.java:134)
org.jruby.internal.runtime.ThreadService.getCurrentContext(ThreadService.java:116)
org.jruby.Ruby.getCurrentContext(Ruby.java:2185)
org.jruby.gen.InterfaceImpl834654122.getApplication(org/jruby/gen/InterfaceImpl834654122.gen:12)
org.jruby.rack.jms.DefaultQueueManager$RubyObjectMessageListener.onMessage(DefaultQueueManager.java:127)
org.apache.activemq.ActiveMQMessageConsumer.dispatch(ActiveMQMessageConsumer.java:1021)
org.apache.activemq.ActiveMQSessionExecutor.dispatch(ActiveMQSessionExecutor.java:122)
org.apache.activemq.ActiveMQSessionExecutor.iterate(ActiveMQSessionExecutor.java:192)
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:637)

Christian

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 17, 2009, 4:57pm

On Tue, Nov 17, 2009 at 3:11 PM, Christian S. [email protected]
wrote:

Hi,
I spent four hours on Saturday to debug the very same leak… For now I
patched JRuby, so it ditches terminated threads from the threadContextMap
every once in a while. The leak seems to be fixed. Yes, I also noticed that
ThreadContext refers back to RubyThread, which may prevent the stuff from
being GCed. Is this a pretty nasty JRuby bug?

Ah, I’m glad someone else is seeing this issue!

I was going by this section in the WeakHashMap docs:

Implementation note: The value objects in a WeakHashMap are held by ordinary
strong references. Thus care should be taken to ensure that value objects do not
strongly refer to their own keys, either directly or indirectly, since that will prevent
the keys from being discarded.

which seems to be exactly what we have here. That said, it’s probably
quite an uncommon case - we have lots of threads showing up from
elsewhere, briefly making use of JRuby and then disappearing again of
their own accord.

I’m running with this change at the moment:

Make ThreadContext's thread attribute be a WeakReference, so that · chrisa/jruby@30c26f5 · GitHub

to just wrap the thread reference in a WeakReference, which seems to
have the same effect as forcibly removing things from the
threadContextMap.

Chris.

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 17, 2009, 5:08pm

Chris A. wrote:

I was going by this section in the WeakHashMap docs:

Implementation note: The value objects in a WeakHashMap are held by ordinary
strong references. Thus care should be taken to ensure that value objects do not
strongly refer to their own keys, either directly or indirectly, since that will prevent
the keys from being discarded.

which seems to be exactly what we have here. That said, it’s probably
quite an uncommon case - we have lots of threads showing up from
elsewhere, briefly making use of JRuby and then disappearing again of
their own accord.

org.jruby.internal.runtime.RubyRunnable actually does call the
unregister method, but in our case (JMS / JRuby-Rack) the call stack is
different (see my last mail). I don’t know if this is an uncommon case
though. It’s probably required that the WeakHashMap does its magic here

which is cannot because of that back reference.

I think somebody from JRuby core has to take a look.

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 17, 2009, 6:24pm

On Tue, Nov 17, 2009 at 4:07 PM, Christian S. [email protected]
wrote:

because of that back reference.

I think somebody from JRuby core has to take a look.

Yep - I’ll summarise what we’ve found in a new thread, since this
doesn’t seem to be an issue specific to JRuby-Rack.

Chris.

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 17, 2009, 8:03pm

On Tue, Nov 17, 2009 at 9:56 AM, Chris A. [email protected] wrote:

I was going by this section in the WeakHashMap docs:

I’m running with this change at the moment:

Make ThreadContext's thread attribute be a WeakReference, so that · chrisa/jruby@30c26f5 · GitHub

to just wrap the thread reference in a WeakReference, which seems to
have the same effect as forcibly removing things from the
threadContextMap.

This looks pretty good to me. Would you mind posting this as a patch
made with git format-patch to JIRA (as I mentioned in a reply to the
other thread). Looks like something we should run with in trunk for a
while to shake out any possible bugs.

/Nick

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 18, 2009, 11:50am

On Wed, Nov 18, 2009 at 10:20 AM, Christian S.
[email protected] wrote:

threadContextMap.

Don’t you have to check for null every time you access the WeakReference?

It appears so:

[#|2009-11-18T10:44:09.582+0000|SEVERE|sun-appserver2.1|javax.enterprise.system.container.web|_ThreadID=18;_ThreadName=httpSSLWorkerThread-52396-0;_RequestID=451676b1-0aef-4f35-9b9a-4b3d67045f60;|WebModule[]Application
Error
java.lang.NullPointerException
at
org.jruby.runtime.ThreadContext.pollThreadEvents(ThreadContext.java:498)
at org.jruby.ast.CaseNode.interpret(CaseNode.java:123)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at
org.jruby.runtime.InterpretedBlock.evalBlockBody(InterpretedBlock.java:317)
…

I’m not sure what the right thing to do is though, if the thread was
null. Perhaps forcibly reaping dead threads from a background thread
is the more appropriate fix here.

Chris.

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

caefer · November 18, 2009, 11:21am

Nick S. wrote:

I was going by this section in the WeakHashMap docs:
I’m running with this change at the moment:
while to shake out any possible bugs.

/Nick

Don’t you have to check for null every time you access the
WeakReference?

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email