JRuby-Rack JMS support - RubyThread leak

Hi,

We’ve been using the JRuby-Rack JMS support in production for a while
now, since JRuby 1.1.6. It’s a Rails app running in Glassfish which
additionally receives messages from ActiveMQ. It works well, but we’ve
always had a memory leak issue with it. I’d been hoping that the
problem would be fixed by e.g. JRUBY-3217, but it’s still a problem on
1.4.0, with JRuby-Rack 0.9.5 or the 0.9.6 snapshot. We’ve managed by
restarting every few weeks, but I’ve now got a minimal test case which
I believe reproduces the problem.

The heap dump from a Glassfish instance which has been running a while
has a large number of RubyThread, Frame and ThreadContext objects.
Looking at the Glassfish logs, it’s clear that the app server isn’t
just using one ActiveMQ Session thread, but is periodically starting a
new one, so over time we’re calling into the same JRuby runtime from
a large number of different Java threads.

The test case is here:
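
In outline, it does something like the following (a minimal sketch from
memory; the class name and details are illustrative, not the actual
attachment):

import org.jruby.Ruby;

// Call into one shared JRuby runtime from each of 25000 short-lived
// Java threads, then park so a heap histogram can be taken with jmap.
public class AdoptedThreadLeakTest {
    public static void main(String[] args) throws Exception {
        final Ruby runtime = Ruby.newInstance();
        for (int i = 0; i < 25000; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    // A trivial call; as a side effect the Java thread is
                    // "adopted" into the runtime's ThreadService.
                    runtime.evalScriptlet(
                        "require 'java'; puts java.lang.Thread.current_thread.name");
                }
            });
            t.start();
            t.join(); // let each thread die before starting the next
        }
        System.out.println("done");
        Thread.sleep(Long.MAX_VALUE);
    }
}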

What I’m trying to do there is make a number of trivial calls into
JRuby from each of 25000 threads, just dumping out the current thread
name. After all the threads have run to completion (verified in
jconsole), looking at the jmap histogram shows a number of classes
with counts very close to 25000 or multiples thereof:

chris-mbp-en1:~ chris$ jmap -histo:live 17183 | head -20

 num     #instances         #bytes  class name
----------------------------------------------
   1:        250021       20001680  org.jruby.runtime.Frame
   2:         25143        4865832  [Ljava.util.WeakHashMap$Entry;
   3:         29514        4319000
   4:         25001        4200168  org.jruby.RubyThread
   5:         50581        3641832  java.util.WeakHashMap$Entry
   6:         29514        3550112
   7:         25001        3400136  org.jruby.runtime.ThreadContext
   8:          3537        3073384
   9:          3537        2986912
  10:         25001        2600184  [Lorg.jruby.runtime.Frame;
  11:         25001        2600184  [Lorg.jruby.runtime.DynamicScope;
  12:         25001        2600184  [Lorg.jruby.RubyModule;
  13:         25001        2600104  [Lorg.jruby.RubyKernel$CatchTarget;
  14:          3196        2439120  [I
  15:         44920        2277280
  16:         25132        2010560  org.jruby.parser.LocalStaticScope
  17:          3378        1882272

I think I’m seeing the behaviour noted in JRUBY-3742, that adopted
threads aren’t automatically unregistered. Adding these lines to a
finally block in the thread’s run method seems to do enough that
things get cleaned up properly:

// Look up the RubyThread that was adopted for the current Java thread,
// then unregister it from the runtime's ThreadService:
RubyThread rt = (RubyThread) runtime.getThreadService()
        .getRubyThreadMap().get(Thread.currentThread());
runtime.getThreadService().unregisterThread(rt);

This does a bit more than is discussed in the JIRA, so I wonder if
it’d be reasonable to have a utility method somewhere that you can
call to say “I’m done with JRuby from this thread (for now)”?
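
Something like this, say (entirely hypothetical; the class and method
names are invented):

import org.jruby.Ruby;
import org.jruby.RubyThread;
import org.jruby.internal.runtime.ThreadService;

// Hypothetical helper, not part of JRuby today; names are invented.
public final class JRubyThreads {
    private JRubyThreads() {}

    /** "I'm done with JRuby from this thread (for now)." */
    public static void dissociate(Ruby runtime) {
        ThreadService service = runtime.getThreadService();
        RubyThread rt = (RubyThread)
                service.getRubyThreadMap().get(Thread.currentThread());
        if (rt != null) {
            service.unregisterThread(rt);
        }
    }
}

The message-dispatch code could then just call it from its finally block.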

I’m trying this out in JRuby-Rack, and it seems to keep the leakage
under control. I’ll report back after it’s been running a while, but
it looks like something like this is necessary for the
message-handling path, and possibly for the web-request path too.

Cheers,
Chris.


On Mon, Nov 16, 2009 at 8:33 AM, Chris A. [email protected] wrote:

Is there any reason to keep these threads associated if they’re no
longer alive? If we reaped/removed dead threads from the thread
service every so often (say maybe when adopting a new thread), would
that be sufficient? Or maybe an additional background thread that gets
managed by the thread service whose sole purpose is to repeatedly join
adopted threads in a loop and reap them after they die?
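
Rough shape of that last idea (everything below is hypothetical; none of
these names exist in ThreadService today):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The thread service would hand each newly adopted Java thread to the
// reaper, which joins it and cleans up after it dies.
class AdoptedThreadReaper implements Runnable {
    private final BlockingQueue<Thread> adopted = new LinkedBlockingQueue<Thread>();

    /** Called by the thread service whenever it adopts a Java thread. */
    void watch(Thread t) {
        adopted.add(t);
    }

    public void run() {
        while (true) {
            try {
                Thread t = adopted.take();
                t.join(); // blocks until this adopted thread dies
                reap(t);  // then unregister its RubyThread and ThreadContext
            } catch (InterruptedException e) {
                return;   // shut down with the runtime
            }
        }
    }

    private void reap(Thread t) {
        // hypothetical: threadService.unregisterThread(...) for this thread
    }
}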

/Nick


On Mon, Nov 16, 2009 at 4:39 PM, Nick S. [email protected] wrote:

> Is there any reason to keep these threads associated if they’re no
> longer alive? If we reaped/removed dead threads from the thread
> service every so often (say maybe when adopting a new thread), would
> that be sufficient? Or maybe an additional background thread that gets
> managed by the thread service whose sole purpose is to repeatedly join
> adopted threads in a loop and reap them after they die?

Something like that would be much better than what I showed - I’m
repeatedly adopting and then “unadopting” threads, where it’d be more
efficient to leave them adopted and only clean up when they die.

I had a go at adding a background thread to reap dead threads, and
I’ve found something interesting. In ThreadService, there are these two
WeakHashMaps:

    this.rubyThreadMap = Collections.synchronizedMap(
            new WeakHashMap<Object, RubyThread>());
    this.threadContextMap = Collections.synchronizedMap(
            new WeakHashMap<RubyThread, ThreadContext>());

I first tried having my background thread reap from rubyThreadMap, but
the Threads there were already dead, gone, and removed from the Map. The
threadContextMap was still populated, though: each ThreadContext holds a
strong reference to its RubyThread (its own key in the Map), and that’s
keeping the WeakHashMap from releasing either of them. I wonder if the
intended behaviour is that when the Thread is reaped, both its
RubyThread and its ThreadContext are supposed to be reaped along with
it?
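
The pitfall in isolation looks like this (a standalone demo, nothing
JRuby-specific in it):

import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// A WeakHashMap entry whose value strongly references its own key can
// never be collected, because the key never becomes weakly reachable.
public class WeakMapPitfall {
    static class Key {}
    static class Value {
        final Key owner; // strong back-reference, like ThreadContext -> RubyThread
        Value(Key k) { owner = k; }
    }

    public static void main(String[] args) throws Exception {
        Map<Key, Value> map =
                Collections.synchronizedMap(new WeakHashMap<Key, Value>());
        Key k = new Key();
        map.put(k, new Value(k));
        k = null;      // drop the only external reference to the key
        System.gc();
        Thread.sleep(100);
        System.out.println(map.size()); // still 1: the value pins its key
    }
}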

Making the “thread” attribute of ThreadContext a WeakReference seems
to be a fix here, in that it allows both the RubyThread and the
ThreadContext to be reaped when the Thread dies. I’m trying this out
now.
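
Concretely, the change is roughly this (a sketch of the idea, not a
tested patch):

// Inside org.jruby.runtime.ThreadContext, hold the RubyThread via
// java.lang.ref.WeakReference so the context no longer pins its own
// key in threadContextMap.
private WeakReference<RubyThread> thread;

public RubyThread getThread() {
    return thread == null ? null : thread.get();
}

public void setThread(RubyThread aThread) {
    thread = aThread == null ? null : new WeakReference<RubyThread>(aThread);
}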

Chris.


On Mon, Nov 23, 2009 at 11:17 AM, Christian S. [email protected] wrote:

> Hi Chris,
> just wondering, did you actually open a ticket?

I actually haven’t yet - my WeakReference patch isn’t suitable, for
just the reason you suggested. I’ll get it opened today with or
without a fix.

The fix I’m running with now, to run a background reaper thread, does
seem to stop the leak I see.

Chris.


On Mon, Nov 23, 2009 at 12:11 PM, Christian S. [email protected] wrote:

> Frankly I think that it’s beyond our scope to come up with a patch in this
> case. The usage of the WeakReferenceMap is simply broken IMO and the
> original author of that code should have a look. Do you want to file a bug?
> Otherwise I can do it as well if you prefer!

Done: JRUBY-4264

Thanks anyway!

Chris.

