JRuby forkoff replacement?

In my MRI-targeted code I use the forkoff¹ gem, which adds the
Enumerable#forkoff method that’s basically a #map running its &block’s
contents in separate processes.

¹ http://codeforpeople.com/lib/ruby/forkoff/forkoff-0.0.4/README

In my code, I incorporated #forkoff like this:

module Enumerable
  def parallel &block
    case ArtDecomp::Conf.processes
    when 1 then map(&block)
    else forkoff ArtDecomp::Conf.processes, &block
    end
  end
end

Trying to run #parallel with Conf.processes set to 2 in JRuby gives me
the (understandable) ‘fork is unsafe and disabled by default on JRuby’
error.

What would be the JRuby equivalent? In other words:
how would one run such code in parallel properly in JRuby?

Confession 1: I know I should figure out the above by myself, but I haven’t
yet dug into JRuby, I’m just evaluating its purpose for my case (PhD
computations) – I need to run my code faster, and I want to see how
fast I can go with JRuby before I give up on pure-Ruby speed and go
the one-way road of RubyInline…

Confession 2: My knowledge about threads and processes is murky to say
the least. All I know is that MRI’s global interpreter lock means two
‘parallel’ threads aren’t really parallel and won’t use multiple cores
while two processes will; also, IIRC, JRuby maps threads to JVM’s
processes and so JRuby threads are really parallel – if that’s true,
I’m ok with running my code in threads under JRuby.

Confession 3: I should be using DRb for the above anyway, and that’s the
plan for the future, but for now I need code that utilises two cores in
full.

– Shot

Shot (Piotr S.) wrote:

What would be the JRuby equivalent? In other words:
how would one run such code in parallel properly in JRuby?

Just use threads…see below.

Confession 1: I know I should figure out the above by myself, but I haven’t
yet dug into JRuby, I’m just evaluating its purpose for my case (PhD
computations) – I need to run my code faster, and I want to see how
fast I can go with JRuby before I give up on pure-Ruby speed and go
the one-way road of RubyInline…

You should strongly consider using Java before using RubyInline, since
it’s going to be easier to maintain, less prone to C’s various
fatalities, just as fast, and trivially callable from JRuby. And if you
are really interested in inlining that code, there’s also a java_inline
module I wrote that plugs into RubyInline:

http://projectkenai.com/projects/java-inline

It’s very primitive so far, but works pretty well. See the examples. I’m
looking for others to help improve it, if it’s desired.

Confession 2: My knowledge about threads and processes is murky to say
the least. All I know is that MRI’s global interpreter lock means two
‘parallel’ threads aren’t really parallel and won’t use multiple cores
while two processes will; also, IIRC, JRuby maps threads to JVM’s
processes and so JRuby threads are really parallel – if that’s true,
I’m ok with running my code in threads under JRuby.

This is exactly how you should do it. You should take care to protect
common data structures, but under JRuby threads actually do run in
parallel and you can achieve what forkoff does without using
fork.
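
A minimal sketch of how that might look, assuming a hypothetical
Enumerable#threaded_map (the name and the worker-queue design are
illustrative only, not taken from forkoff or any other gem). Each worker
thread collects its own results and the calling thread merges them, so
no shared structure is written to concurrently:

```ruby
require 'thread' # Queue lives here on older Rubies; harmless on newer ones

module Enumerable
  # Hypothetical thread-based stand-in for forkoff: maps the block over the
  # collection with `workers` threads and returns results in input order.
  def threaded_map(workers = 2, &block)
    items = to_a
    queue = Queue.new
    items.each_with_index { |item, index| queue << [item, index] }

    threads = Array.new(workers) do
      Thread.new do
        local = [] # per-thread results, so nothing shared is mutated
        loop do
          begin
            item, index = queue.pop(true) # non-blocking; raises when empty
          rescue ThreadError
            break
          end
          local << [index, block.call(item)]
        end
        local
      end
    end

    # Thread#value waits for the thread and returns its block's result.
    results = Array.new(items.size)
    threads.each do |thread|
      thread.value.each { |index, value| results[index] = value }
    end
    results
  end
end
```

Under JRuby the workers can run on separate cores; under MRI the same
code still runs, but the global interpreter lock keeps the Ruby code on
one core at a time. The block itself must still avoid mutating shared
data without a lock.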

I think Ara also released another gem that trivially parallelizes using
threads, but the name escapes me. I’m sure it’s under codeforpeople
though.

– Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Charles Oliver N.:

Shot (Piotr S.) wrote:

What would be the JRuby equivalent? In other words:
how would one run such code in parallel properly in JRuby?

Just use threads…see below.

Ok, thanks.

You should strongly consider using Java before using RubyInline,
since it’s going to be easier to maintain, less prone to C’s
various fatalities, just as fast, and trivially callable from
JRuby.

Hm. On one hand, I know a bit of C and almost zero Java, and my
assumption is that in the long run the C solution would be the fastest.
On the other, my gut feeling is that learning Java (and its connections
to JRuby, which, from my quick look at them, are rather intuitive) in
my particular case would be faster and more fun than relearning C with
its VALUE pointers, memory management and MRI’s bindings. On the third,
I’ve yet to try RubyInline+RubyToC combo, but I guess it either works as
advertised (and then there’s no C to learn) or it doesn’t (and I’m back
in C land).

I’m curious about the ‘just as fast’ claim above – my research code
ended up being mostly bitwise operations on variable-length integers
(which are lovely to work with in Ruby due to the Fixnum/Bignum
transparent switching). My base class is a glorified Set of Integers
– hence my assumptions that I might be able to learn a small enough
subset of MRI C API to implement what I need via RubyInline in
a relatively short period of time.

I have no idea about JVM’s performance other than the most popular
myths(?) from both sides of the discussion, but the cases that require
the optimisations are the ones that will end up being computed for hours
or even days, so I can at least safely skip any warm-up disadvantage.

Does the JVM indeed approach C speed for this type of computation in my
case above (Sets of Integers, long-running code)? I’m just looking
for a ballpark quote, of course – something like ‘most probably yes’,
‘maybe’ or ‘no way, JVM will be at least X times slower’.

And if you are really interested in inlining that code, there’s
also a java_inline module I wrote that plugs into RubyInline

Cool! I wasn’t aware of it. If I have the time, I might end up
implementing the bottlenecks in both and then switching at runtime
based on the interpreter. :]

I tried a quick google, but it’s surprisingly hard to find if you don’t
know the right words; if you’re replying anyway – what was the constant
that I can use to differentiate between MRI and JRuby?

(Hm, one of the disadvantages of going with JRuby is that my profiling
knowledge is limited to ruby-prof, but then if I learned ruby-prof in
an afternoon then maybe I can learn the basics of a Java profiler in
a weekend.)

I think Ara also released another gem that trivially parallelizes using
threads, but the name escapes me. I’m sure it’s under codeforpeople
though.

Ah, right, how could I forget! It’s threadify. :)

Also, my crude benchmarks are just in: running an actual, small
‘production’ case with one process took 28m57s on MRI 1.8.6.p287,
12m18s on MRI 1.9.1-preview2 and 9m20s on JRuby 1.1.5. Hmmm. :)

Thanks a lot for your reply! It’s much appreciated.

– Shot

Shot (Piotr S.):

I tried a quick google, but it’s surprisingly hard to find if you
don’t know the right words; if you’re replying anyway – what was
the constant that I can use to differentiate between MRI and JRuby?

Ah, RUBY_PLATFORM and RUBY_ENGINE. :)

– Shot

On Fri, Dec 5, 2008 at 11:00 AM, Shot (Piotr S.) wrote:

Shot (Piotr S.):

I tried a quick google, but it’s surprisingly hard to find if you
don’t know the right words; if you’re replying anyway – what was
the constant that I can use to differentiate between MRI and JRuby?

Ah, RUBY_PLATFORM and RUBY_ENGINE. :)

There is also JRUBY_VERSION, which, if defined, with 100% probability
detects JRuby :)
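
A short sketch of how these constants might be combined. The helper
names jruby? and engine_name are made up for illustration, and the
fallback assumes that an interpreter without RUBY_ENGINE is an older
MRI:

```ruby
# Hypothetical helper: JRUBY_VERSION is only defined when running on JRuby.
def jruby?
  !!defined?(JRUBY_VERSION)
end

# Hypothetical helper: RUBY_ENGINE may be missing on older interpreters,
# so fall back to 'ruby' (an assumption, meaning plain MRI) in that case.
def engine_name
  defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby'
end

# Example: pick a parallelisation strategy based on the interpreter.
strategy = jruby? ? :threads : :processes
puts "#{engine_name} on #{RUBY_PLATFORM}: using #{strategy}"
```

Using defined? avoids a NameError on interpreters where a constant is
missing.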

–Vladimir

