Questions and comments about (java) extensions

Hi all,

In the process of porting the hitimes C extension to java I came up with a few
questions along the way. I initially read Ola’s blog post[1] and then took a
look at the built in extensions (WeakRef, Readline, Socket, etc) and saw they
used annotations instead of the MRI style from Ola’s blogpost. I decided to
head down the annotations path.

In no particular order:

  1. BasicLibraryService vs. Library

    I started out looking at the built in extensions, and they all implement
    Library, not BasicLibraryService. I assume now that this is because they are
    ‘built in extensions’ and hence are ‘require’d via a different path. This
    seems to be validated when you look at Ruby#initBuiltins[2].

    Of course it wasn’t until this point that I saw the javadoc at the top
    of BasicLibraryService:

    “This interface should be implemented by writers of Java extensions to
    JRuby.”[3]

    Just to clarify, if the extension ships in the same jar as jruby, then it
    implements Library and is added to initBuiltins(). If it is in an external
    jar, it implements BasicLibraryService.

    Is this correct?

  2. Accessing the Runtime or ThreadContext inside the extension and method
     first arguments.

    I needed to access the ruby runtime to be able to return ‘nil’, ‘true’ and
    ‘false’ from various methods, and I needed to access the ThreadContext to
    implement yielding.

    Wayne pointed out[4] that when you need to access the Ruby runtime it is
    better to pass ThreadContext as the first argument to the method that needs
    to access the runtime than it is to access this.getRuntime().

    Is this the appropriate thing to do?

    Also, should the first argument of the extension methods ALWAYS be
    ThreadContext or IRubyObject? Even if it is not used in the method
    implementation?

  3. Annotations

    First off, I really like this approach. It makes things really easy to
    understand. You get the declaration of the ruby interface of the
    method/class right next to the java implementation of that method/class.

    3.1) @JRubyMethod annotation

      * Aliases, what to do? Are these equivalent?

          @JRubyMethod( name = "duration", alias = { "length", "to_f", "to_seconds" } )
          @JRubyMethod( name = { "duration", "length", "to_f", "to_seconds" } )

      * For module methods (@JRubyMethod( name = "now", module = true )) what
        should the java signature look like? Is there a distinction between
        these two?

          public static IRubyObject now( IRubyObject self ) { ... }
          public static IRubyObject now( ThreadContext context ) { ... }

      * For methods that yield (@JRubyMethod( name = "measure", module = true, frame = true ))
        should they always set ‘frame = true’ in the annotation and what
        should the java signatures look like:

          public static IRubyObject measure( IRubyObject self, Block block ) { ... }
          public static IRubyObject measure( ThreadContext context, Block block ) { ... }

    3.2) @JRubyClass annotation

      It appears that there are constraints on the actual java class names
      that implement ruby classes. This works and from inside ruby,
      Hitimes::Stats is found.

          @JRubyClass( name = "Hitimes::Stats" )
          public class HitimesStats extends RubyObject { ... }

      When implemented in this manner, Hitimes::Stats is NOT found inside
      ruby.

          @JRubyClass( name = "Hitimes::Stats" )
          public class Stats extends RubyObject { ... }

      I did not try with inner classes. Maybe this works, I may give it a try
      later:

          @JRubyModule( name = "Hitimes" )
          public class Hitimes {
              @JRubyClass( name = "Hitimes::Stats" )
              public static class Stats { ... }
          }

  4. Jar file location in the gem

    Initially I had the jar in the gem at ‘lib/hitimes/hitimes.jar’ which worked
    fine. Then, in preparation for release, I wanted to change the jar location
    so it would be in a similar directory structure to what I have for the
    Windows releases of hitimes
    ("lib/hitimes/#{RUBY_VERSION.sub(/.\d$/,'')}/hitimes_ext.dll").

    I wanted to put the jar in ‘lib/hitimes/java/hitimes.jar’ and then require it
    with ‘require "hitimes/java/hitimes"’ in the top level hitimes.rb. This did
    not work. The jar was never found, and I had to move it back to
    ‘lib/hitimes/hitimes.jar’.

    Is this working as designed?

  5. ObjectAllocator

    It seems that this is some boilerplate code that needs to go into anything
    that is registered with RubyModule#defineClassUnder(). Is this strictly
    required? Is there a default that can be used if you do not have custom
    allocation needs?

Thoughts? Questions? Comments?

enjoy,

-jeremy

[1]
[2] http://github.com/jruby/jruby/blob/master/src/org/jruby/Ruby.java#L1421
[3] http://github.com/jruby/jruby/blob/master/src/org/jruby/runtime/load/BasicLibraryService.java#L34
[4] http://markmail.org/message/2giww4tbzxxpnvl7

Jeremy H. [email protected]


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On 28 July 2010 06:18, Jeremy H. [email protected]
wrote:

  5. ObjectAllocator

It seems that this is some boilerplate code that needs to go into anything
that is registered with RubyModule#defineClassUnder(). Is this strictly
required? Is there a default that can be used if you do not have custom
allocation needs?

ObjectAllocator is roughly equivalent to rb_define_alloc_func() in MRI:
if you define a new java subclass for your ruby class, and you want
to be able to do:

    foo = MyRubyClass.new

then you need to define a custom allocator, so MyRubyClass.new returns
an instance of the correct java subclass.

If you don’t want to create instances of your class from ruby (but you
can still create them from java code if you want to), then you can use
ObjectAllocator.NOT_ALLOCATABLE_ALLOCATOR as the allocator.

If you’re not defining a new java class for your ruby class, then you
could probably use RubyObject.OBJECT_ALLOCATOR. That would be quite
rare though, as you usually want your own custom java class to hold
custom state.


On Wed, Jul 28, 2010 at 1:24 AM, Wayne M. [email protected]
wrote:

If you’re not defining a new java class for your ruby class, then you
could probably use RubyObject.OBJECT_ALLOCATOR. That would be quite
rare though, as you usually want your own custom java class to hold
custom state.

This is actually somewhat common in the JRuby core classes, usually by
overloading the dataWrapStruct slot for custom data. I’d say it’s
easier and more elegant to create the custom class, but it’s sometimes
nice to just use RubyObject, dataGetStruct, and a bit of casting to
get the same thing done.

Note also that with “class reification” work in JRuby 1.6 (which I
discussed in my memory-monitoring blog posts), what we do under the
covers is create an actual Java class for every Ruby class on first
construction. In extension terms, it’s basically like the moment you
first construct a Ruby object, we stand up a RubyObject subclass and
an ObjectAllocator subclass and from then on instances of that Ruby
class are physically instances of a unique Java class too.

- Charlie

On Tue, Jul 27, 2010 at 3:18 PM, Jeremy H.
[email protected] wrote:

Hi all,

In the process of porting the hitimes C extension to java I came up with a few
questions along the way. I initially read Ola’s blog post[1] and then took a
look at the built in extensions (WeakRef, Readline, Socket, etc) and saw they
used annotations instead of the MRI style from Ola’s blogpost. I decided to
head down the annotations path.

The annotation style is strongly preferred, if only because it’s so
much more elegant than manually binding all the methods (and you don’t
have to get arities, etc right most of the time).

 of BasicLibraryService:

  “This interface should be implemented by writers of Java extensions to
  JRuby.”[3]

 Just to clarify, if the extension ships in the same jar as jruby, then it
 implements Library and is added to initBuiltins().  If it is in an external
 jar, it implements BasicLibraryService.

 Is this correct?

Basically, there’s no discovery mechanism for Library-based
extensions, so they need to be explicitly loaded. BasicLibraryService
was a half-baked attempt (I think I added it) to provide a standard
way to lookup and initialize extensions to JRuby coming from outside
the core classes. I always figured we’d unify all the various loading
mechanisms before people started using them for their own extensions.
C’est la vie!

 Is this the appropriate thing to do?

It probably doesn’t make a huge difference, but there are two good
reasons for it:

  • ThreadContext as the first argument is almost “free”, since it’s
    passed through the call stack for almost all calls
  • The one-hop field access for ThreadContext.getRuntime() is quicker
    than the two-hop object-to-metaclass-to-runtime lookup

It’s better mostly because it’s a single memory access of a field
that’s almost certainly going to be very hot in CPU cache, versus a
two-hop that’s going to be all over memory.
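
To make the one-hop versus two-hop distinction concrete, here is a minimal plain-Java model. It deliberately does not use JRuby's classes: RuntimeModel, ContextModel, MetaClassModel and ObjectModel are hypothetical stand-ins invented for this sketch, and the only thing it shows is the shape of the two lookup paths, not their real cost.

```java
// Plain-Java model only: every class here is a hypothetical stand-in,
// not one of the JRuby types discussed in the thread.
final class LookupModel {
    /** Stand-in for the runtime, which owns singletons such as nil. */
    static final class RuntimeModel {
        final Object nil = new Object();
    }

    /** Stand-in for a per-thread context that holds the runtime directly. */
    static final class ContextModel {
        final RuntimeModel runtime;
        ContextModel(RuntimeModel runtime) { this.runtime = runtime; }
        RuntimeModel getRuntime() { return runtime; }            // one field read
    }

    /** Stand-in for a metaclass, which also knows its runtime. */
    static final class MetaClassModel {
        final RuntimeModel runtime;
        MetaClassModel(RuntimeModel runtime) { this.runtime = runtime; }
    }

    /** Stand-in for an object that reaches the runtime through its metaclass. */
    static final class ObjectModel {
        final MetaClassModel metaClass;
        ObjectModel(MetaClassModel metaClass) { this.metaClass = metaClass; }
        RuntimeModel getRuntime() { return metaClass.runtime; }  // two field reads
    }

    /** Extension-style method that takes the context first and uses it. */
    static Object nilViaContext(ContextModel context, ObjectModel self) {
        return context.getRuntime().nil;
    }

    /** Extension-style method that only has self and goes through the metaclass. */
    static Object nilViaSelf(ObjectModel self) {
        return self.getRuntime().nil;
    }

    public static void main(String[] args) {
        RuntimeModel runtime = new RuntimeModel();
        ContextModel context = new ContextModel(runtime);
        ObjectModel self = new ObjectModel(new MetaClassModel(runtime));
        // Both paths reach the same nil; only the number of hops differs.
        System.out.println(nilViaContext(context, self) == nilViaSelf(self));
    }
}
```

Both methods return the same object; the difference is purely in how many references are followed to get there.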

 Also, should the first argument of the extension methods ALWAYS be
 ThreadContext or IRubyObject?  Even if it is not used in the method
 implementation?

At the moment, for static methods you need at least the “self” the
method will be called against, even if it’s a module you don’t need.
If you use ThreadContext in any case, it’s always the first argument.
If you use a block, it’s always the last argument. We could (and
probably will) support more combinations of arguments (like being able
to specify overloads and we’ll route accordingly), but the current set
serves us pretty well.

     @JRubyMethod( name = "duration", alias = { "length", "to_f", "to_seconds" } )
     @JRubyMethod( name = { "duration", "length", "to_f", "to_seconds" } )

Almost the same. In the name-only case, we rebind the same physical
method object to multiple names. In the alias case, we bind
AliasMethod wrappers pointing at the method object for the main name,
as though you did alias_method yourself. This has subtle effects on
“super” and friends. alias was originally added before we realized
that for core methods, MRI just rebinds the same method object to
multiple names (and therefore they are not “true” aliases). Now I
think we use name almost everywhere, but alias remains if you really
want alias-like behavior.
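
The difference between the two binding styles can be sketched with a small plain-Java model. This is not JRuby code: Method, JavaMethod, AliasWrapper and the two bind helpers are hypothetical names made up for the illustration, and the method table is just a HashMap. It only shows that the name-only style stores one object under several names, while the alias style stores distinct wrappers that forward to the original.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Plain-Java model only; none of these types are JRuby classes.
final class BindingModel {
    /** A bound method: callable, and aware of the name it was defined under. */
    interface Method {
        double call();
        String definedName();
    }

    /** The real method body, defined once under its primary name. */
    static final class JavaMethod implements Method {
        private final String name;
        private final Supplier<Double> body;
        JavaMethod(String name, Supplier<Double> body) {
            this.name = name;
            this.body = body;
        }
        public double call() { return body.get(); }
        public String definedName() { return name; }
    }

    /** Alias-style wrapper: a distinct object that forwards to the original. */
    static final class AliasWrapper implements Method {
        private final Method target;
        AliasWrapper(Method target) { this.target = target; }
        public double call() { return target.call(); }
        public String definedName() { return target.definedName(); }
    }

    /** name = { ... } style: the same method object is stored under every name. */
    static Map<String, Method> bindByName(Method method, String... names) {
        Map<String, Method> table = new HashMap<>();
        for (String n : names) {
            table.put(n, method);
        }
        return table;
    }

    /** name + alias style: the primary name gets the method, aliases get wrappers. */
    static Map<String, Method> bindWithAliases(Method method, String primary, String... aliases) {
        Map<String, Method> table = new HashMap<>();
        table.put(primary, method);
        for (String a : aliases) {
            table.put(a, new AliasWrapper(method));
        }
        return table;
    }

    public static void main(String[] args) {
        Method duration = new JavaMethod("duration", () -> 1.5);
        Map<String, Method> byName = bindByName(duration, "duration", "length");
        Map<String, Method> byAlias = bindWithAliases(duration, "duration", "length");
        // Name-only binding: both names resolve to the very same object.
        System.out.println(byName.get("length") == byName.get("duration"));
        // Alias binding: the alias is a separate wrapper around the original.
        System.out.println(byAlias.get("length") == byAlias.get("duration"));
    }
}
```

Calling either entry gives the same result in both styles; what differs is object identity in the method table, which is where the subtle effects mentioned above would come from.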

    * For module methods (@JRubyMethod( name = "now", module = true )) what
     should the java signature look like? Is there a distinction between
     these two?

     public static IRubyObject now( IRubyObject self ) { … }
     public static IRubyObject now( ThreadContext context ) { … }

Static methods, whether bound on an instance or on a class (remember
there’s really no “static” methods in Ruby…every method is an
instance method on something) should always have at least one
IRubyObject self argument, and shouldn’t bind if it’s not there.
ThreadContext is always optional.

Usually static methods are used for module or class-level methods, but
if you’re using RubyObject as your backing store and dataGetStruct to
store custom data, you will probably bind all static methods for
instance methods since there’s no custom Java class on which to make
them instance methods.

Make sense?

    * For methods that yield (@JRubyMethod( name = "measure", module = true, frame = true ))
     should they always set ‘frame = true’ in the annotation and what
     should the java signatures look like:

     public static IRubyObject measure( IRubyObject self, Block block ) { … }
     public static IRubyObject measure( ThreadContext context, Block block ) { … }

“frame = true” is mostly used to ensure there’s an entry in the
backtrace for the method. It does slow down invocation, since for each
call we need to mark that we’re entering this method and save off the
caller’s line number, etc, but if your code (or code it calls) might
raise an exception, omitting “frame = true” will cause your method to
be invisible.

The default is “frame = false” mostly because the core classes have
many tiny methods called very heavily (like Fixnum math) that if
framed would be prohibitively slow. I’m working now on a better way to
do backtraces that doesn’t require us to track so much data on every
call.

    ruby.
        public static class Stats { … }
      }

I’m not sure we actually use JRubyClass and JRubyModule consistently,
and I know I rarely use them in my own extensions. We had originally
planned for you to be able to just mark up a class with all these
annotations and both creation of the Ruby module or class and the
binding of the methods would happen for you. I don’t think we ever
finished all the edge cases (as you have seen) to make it generally as
useful as it could be.

 not work.  The jar was never found, and I had to move it back to
 ‘lib/hitimes/hitimes.jar’.

 Is this working as designed?

If requiring hitimes/hitimes worked and hitimes/java/hitimes did not,
that’s not right. The path structure leading to the jar shouldn’t
matter. Perhaps a bug?

  5. ObjectAllocator

 It seems that this is some boilerplate code that needs to go into anything
 that is registered with RubyModule#defineClassUnder().  Is this strictly
 required?  Is there a default that can be used if you do not have custom
 allocation needs?

Wayne covered this pretty well. I’d like for this to be lighter, but
if we don’t use a little custom class for the allocator the only
alternative is to use Java reflection, which would add overhead to
every object construction. You can use the following methods to
avoid creating a custom allocator, if the reflection-instantiation
isn’t too slow for you:

  • RubyClass.setClassAllocator(Class) - assumes you have a no-arg
    constructor and extend RubyBasicObject or a subclass of it. Object
    will be constructed using no-arg constructor and then setMetaClass
    called on it before returning from allocation.
  • RubyClass.setRubyClassAllocator(Class) - assumes you have a
    “standard” (Ruby, RubyClass) constructor and uses that to construct
    instances through reflection.
  • RubyClass.setRubyStaticAllocator(Class) - looks for a static
    allocate(Ruby, RubyClass) method and uses that to construct
    instances through reflection.

It’s also worth pointing out that in Java 7 and higher, these
mechanisms will probably use MethodHandles, avoiding the cost of
reflective dispatch…so the boilerplate may just go away.
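
The trade-off between a hand-written allocator and a reflective one can be sketched in plain Java. Nothing below is JRuby API: Allocator and Stats are hypothetical stand-ins, and the three factory methods are invented for this sketch. It assumes a class with a no-arg constructor, as in the first case listed above, and adds a java.lang.invoke variant to illustrate the MethodHandle idea.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;

// Plain-Java model only; Allocator and Stats are hypothetical stand-ins, not JRuby types.
final class AllocatorModel {
    /** Stand-in for an allocator: produces a fresh instance on every call. */
    interface Allocator {
        Object allocate();
    }

    /** Stand-in for an extension class with a no-arg constructor. */
    static final class Stats {
        Stats() { }
    }

    /** Hand-written allocator: a direct constructor call, at the cost of boilerplate. */
    static Allocator handWritten() {
        return new Allocator() {
            public Object allocate() { return new Stats(); }
        };
    }

    /** Reflective allocator: no per-class boilerplate, but every allocation goes through reflection. */
    static Allocator reflective(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            return () -> {
                try {
                    return ctor.newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            };
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(cls + " has no no-arg constructor", e);
        }
    }

    /** MethodHandle allocator: the constructor is looked up once and then invoked through the handle. */
    static Allocator viaMethodHandle(Class<?> cls) {
        try {
            MethodHandle ctor = MethodHandles.lookup()
                    .findConstructor(cls, MethodType.methodType(void.class));
            return () -> {
                try {
                    return ctor.invoke();
                } catch (Throwable t) {
                    throw new IllegalStateException(t);
                }
            };
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handWritten().allocate().getClass().getSimpleName());
        System.out.println(reflective(Stats.class).allocate().getClass().getSimpleName());
        System.out.println(viaMethodHandle(Stats.class).allocate().getClass().getSimpleName());
    }
}
```

All three produce a new Stats on each call. The hand-written version is the boilerplate the question asks about, and the other two trade that boilerplate for an indirect constructor call.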

- Charlie
