On Tue, Dec 13, 2011 at 8:03 PM, Chuck R. [email protected]
wrote:
Three million new classes created. I’m sure it’s singleton somewhere,
but that’s often a big perf sink…not necessarily because it impacts
straight-line code a lot, but because a class object is a nontrivial
thing to be creating three million of.
Is there a way for JRuby to output more detail on those classes? For example,
can it tell me how many of each class type have been allocated? If so, I could use
that information to figure out the critical path and try to reduce the amount of
class creation there.
…
Also, I’m not certain what you mean by “I’m sure it’s singleton somewhere.” I
know what a singleton is, but I don’t understand your use of it in this context.
Elaborate, please.
Actually, it may not be a singleton object/class, now that I think
about it. Someone, somewhere is doing a Class.new { } (or similar) in
critical path code.
I’ll take a shot at reducing my object allocation. More information on how many
of each type of object is being allocated would be a wonderful statistic to have.
Also, if there were a way to trace that allocation back to a particular file and
line number (for those cases where the same object is allocated in multiple
places), that would be a killer profiling feature. <hint, hint>
Indeed it would. Luckily, the JVM already provides it.
It will be very slow, but Sun/Oracle/OpenJDK has an object allocation
profiler built in. Examples:
jruby -J-Xrunhprof … # run with object allocation profiling,
tracking allocation call stacks 5 deep
jruby -J-Xrunhprof:depth=10 … # … call stacks 10 deep
Running java -Xrunhprof:help will provide full help. As I say, it will
be slow but you don’t need to run a long benchmark; you’re looking
for aggregate numbers of objects for some number of iterations, which
won’t change drastically over a long run. (There’s also a full
instrumented performance profiler in “runhprof”, but it’s dead slow,
because it instruments every method in the system, including every JDK
class.)
The object allocation profiles will be dumped to a gigantic log file
called java.hprof.txt. The primary interesting bit of information is
at the end…a list of object types allocated plus numeric references
to the stack trace where they’re allocated. Another example:
Command:
$ jruby -J-Xrunhprof:depth=20 -e “1_000.times { ‘foo’ + ‘bar’ }”
A few elements of interest:
12 0.39% 14.11% 32000 1000 32000 1000 336225
org.jruby.RubyString
13 0.39% 14.51% 32000 1000 32000 1000 336226
org.jruby.RubyString
14 0.39% 14.90% 31968 999 31968 999 336239
org.jruby.RubyString
(Most of the top 20 allocated elements are byte[], which doesn’t tell
you much)
Note that almost 3000 RubyString objects have been allocated. 1000 of
those are ‘foo’, 1000 of them are ‘bar’, and 999 of them are the
combined string. It’s not exactly accurate because the script
terminates before the final allocation gets recorded.
The last number before the class name is the allocation trace. Looking
at 336225 (search in the same file):
TRACE 336225:
org.jruby.RubyBasicObject.(RubyBasicObject.java:219)
org.jruby.RubyObject.(RubyObject.java:94)
org.jruby.RubyString.(RubyString.java:396)
org.jruby.RubyString.(RubyString.java:424)
org.jruby.RubyString.newStringShared(RubyString.java:522)
org.jruby.ast.executable.RuntimeCache.getString(RuntimeCache.java:105)
org.jruby.ast.executable.AbstractScript.getString0(AbstractScript.java:165)
ruby.dash_e.block_0$RUBY$file(-e:1)
ruby$dash_e$block_0$RUBY$file.call(ruby$dash_e$block_0$RUBY$file:65535)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:112)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:95)
org.jruby.runtime.Block.yield(Block.java:130)
org.jruby.RubyFixnum.times(RubyFixnum.java:261)
org.jruby.RubyFixnum$INVOKER$i$0$0$times.call(RubyFixnum$INVOKER$i$0$0$times.gen:65535)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:302)
org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:144)
org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:153)
ruby.dash_e.file(-e:1)
ruby.dash_e.load(-e:Unknown line)
org.jruby.Ruby.runScript(Ruby.java:731)
This takes us all the way back to runScript in org.jruby.Ruby. The
-e:1 lines are our inline script. RubyFixnum.times is the “times”
call. And the top seven lines are the parts of JRuby called to
allocate a new literal String. Looking at the 336239 trace…
TRACE 336239:
org.jruby.RubyBasicObject.(RubyBasicObject.java:219)
org.jruby.RubyObject.(RubyObject.java:94)
org.jruby.RubyString.(RubyString.java:396)
org.jruby.RubyString.newString(RubyString.java:487)
org.jruby.RubyString.op_plus(RubyString.java:1039)
org.jruby.RubyString$INVOKER$i$1$0$op_plus.call(RubyString$INVOKER$i$1$0$op_plus.gen:65535)
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:167)
ruby.dash_e.block_0$RUBY$file(-e:1)
ruby$dash_e$block_0$RUBY$file.call(ruby$dash_e$block_0$RUBY$file:65535)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:112)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:95)
org.jruby.runtime.Block.yield(Block.java:130)
org.jruby.RubyFixnum.times(RubyFixnum.java:261)
org.jruby.RubyFixnum$INVOKER$i$0$0$times.call(RubyFixnum$INVOKER$i$0$0$times.gen:65535)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:302)
org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:144)
org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:153)
ruby.dash_e.file(-e:1)
ruby.dash_e.load(-e:Unknown line)
org.jruby.Ruby.runScript(Ruby.java:731)
This is the String#plus call, org.jruby.RubyString.op_plus.
Incidentally, this is also where “class reification” comes in handy:
$ jruby -J-Xrunhprof:depth=20 -Xreify.classes=true -e “class Foo; end;
1_000.times { Foo.new }”
…
17 0.29% 15.50% 23616 984 23616 984 337103 rubyobj.Foo
TRACE 337103:
org.jruby.RubyBasicObject.(RubyBasicObject.java:219)
org.jruby.RubyObject.(RubyObject.java:94)
rubyobj.Foo.(:Unknown line)
sun.reflect.GeneratedConstructorAccessor4.newInstance(:Unknown line)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
java.lang.reflect.Constructor.newInstance(Constructor.java:513)
java.lang.Class.newInstance0(Class.java:355)
java.lang.Class.newInstance(Class.java:308)
org.jruby.RubyClass$3.allocate(RubyClass.java:142)
org.jruby.RubyClass.allocate(RubyClass.java:223)
org.jruby.RubyClass.newInstance(RubyClass.java:809)
org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen:65535)
org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroOrNBlock.call(JavaMethod.java:256)
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:133)
ruby.dash_e.block_0$RUBY$file(-e:1)
ruby$dash_e$block_0$RUBY$file.call(ruby$dash_e$block_0$RUBY$file:65535)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:112)
org.jruby.runtime.CompiledBlock.yield(CompiledBlock.java:95)
org.jruby.runtime.Block.yield(Block.java:130)
org.jruby.RubyFixnum.times(RubyFixnum.java:261)
runhprof is a very simple way to get allocation info, and I’ve used it
many times to trim allocation overhead.
If you find this information useful, perhaps you can reformat it for a
JRuby wiki page 