Enterprise ruby

rogerdpack · November 11, 2007, 5:46pm

On Nov 11, 8:47 am, Michal S. wrote:

No, it’s not dead end. However, I would expect its lifetime something
like 1-2 years. So small tweaks that bring immediate benefit are worth
it. Rewriting the GC probably not. Even if you manage to do it before
1.8 is obsolete it would get intensive use for a few months at best.
If 2.0 succeeds (and I believe in it) there will be little incentive
to use 1.8 anymore. 2.0 will be the current actively developed
interpreter, and implementing the GC in there makes more sense.

Maybe this was discussed before, but no one mentioned it.

I think Rubinius team took the idea of doing some “agnostic” GC that
they could switch backends when needed, without changing the API.

Maybe something similar could be done for MRI? I’m also thinking of the Boehm
Garbage Collector [1], which wasn’t mentioned anywhere here and could
reduce the “time to market” of something.

Anyway, just a shot in the dark, since my knowledge of GC is too
limited to actually provide more constructive comments.

[1] http://www.hpl.hp.com/personal/Hans_Boehm/gc/

rogerdpack · November 11, 2007, 9:12pm

Rick DeNatale wrote:

In this implementation, Java primitives were written in Smalltalk.
IIRC this was either before the JNI existed, or the JNI evolved to
make this impractical, and IBM moved to a Java only VM.

I know that it’s difficult, and probably premature to define a
standard extension interface which would work across the various
emerging ruby implementations. But without that I’m afraid that the
promise of having multiple implementations is somewhat muted.

This implies that it’s only valid to call something an “extension” if
it’s written in C. That’s a bit narrow. JRuby has far better support
than MRI for writing extensions in Java, for example. Does it mean that
JRuby is somehow less-capable than MRI if it can’t use C extensions? Of
course it doesn’t.

The only thing it means is that existing extensions written in C for MRI
won’t work in JRuby. That may limit you, if you depend on those specific
extensions. But in most cases the same functionality is provided by Java
libraries just as well. And even better, you don’t need to compile
anything. You can just call the library directly.

include Java
import java.util.concurrent.ConcurrentHashMap

chm = ConcurrentHashMap.new
chm[:bar] = 'foo'

etc

This applies to Java’s GUI libraries (around which several frameworks
have been written, all in Ruby), graphics libraries, network libraries,
and so on. Ruby has a much more difficult time using any of these
libraries.

So I’d say it’s a matter of perspective.

Can you write extensions for JRuby? Yes. Can you write them in C? Not
easily.

Can you write extensions for MRI? Yes. Can you write them in Java? Not
easily.

Charlie

rogerdpack · November 11, 2007, 9:15pm

Luis L. wrote:

Maybe this was discussed before, but no one mentioned it.

I think Rubinius team took the idea of doing some “agnostic” GC that
they could switch backends when needed, without changing the API.

Maybe something similar could be done for MRI? I’m also thinking of the Boehm
Garbage Collector [1], which wasn’t mentioned anywhere here and could
reduce the “time to market” of something.

The ability to replace the Ruby GC with something a bit more advanced is
severely hindered by–you guessed it–C extensions. Ruby’s extension API
allows access to the objects directly, so extensions can access the data
in-memory. This is anathema to any GC that wants to move objects around,
and of course it’s absolutely impossible to do any GC operations in
parallel if extensions might be holding direct references to memory.

So in the area of GC, at least, Ruby is being hindered by its extension
API rather than helped.
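
To make the moving-collector problem concrete, here is a toy model in plain Ruby. It is only a sketch: the Array of slots stands in for memory, and `ToyHeap`, `allocate`, `free`, `compact!`, `read` and `read_raw` are hypothetical names, not MRI's heap or its extension API. An "extension" that caches a raw slot address is left with a stale address after compaction, while a lookup through an indirection table keeps working.

```ruby
# Toy model only: slots in an Array stand in for memory addresses.
class ToyHeap
  def initialize
    @slots   = []   # "memory": the index is the raw address
    @handles = {}   # handle id => current raw address
    @next_id = 0
  end

  # Store a value; return [handle, raw_address].
  def allocate(value)
    @slots << value
    address = @slots.size - 1
    handle = (@next_id += 1)
    @handles[handle] = address
    [handle, address]
  end

  # Drop a value: forget its handle and clear its slot.
  def free(handle)
    @slots[@handles.delete(handle)] = nil
  end

  # Slide live values toward address 0 and update the handle table.
  def compact!
    new_slots = []
    @handles.sort_by { |_, address| address }.each do |handle, address|
      new_slots << @slots[address]
      @handles[handle] = new_slots.size - 1
    end
    @slots = new_slots
  end

  # What code holding a direct pointer effectively does.
  def read_raw(address)
    @slots[address]
  end

  # What code going through an indirection layer does.
  def read(handle)
    @slots[@handles.fetch(handle)]
  end
end

heap = ToyHeap.new
a_handle, _a_addr = heap.allocate("a")
b_handle, b_addr  = heap.allocate("b")  # the "extension" caches b_addr

heap.free(a_handle)
heap.compact!                            # "b" moves from address 1 to 0

via_handle = heap.read(b_handle)         # "b": the handle table was updated
via_raw    = heap.read_raw(b_addr)       # nil: the cached address is stale
```

A real collector cannot see or patch up addresses cached inside compiled extension code, which is why direct object access rules out moving objects.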

Charlie

rogerdpack · November 11, 2007, 9:07pm

M. Edward (Ed) Borasky wrote:

(JRuby being the most complete and furthest along) or a 1.9
implementation (YARV being most complete and furthest along…but we
have some 1.9 features in JRuby too).

As far as I know, only MRI is “100 percent MRI compatible”. The
other implementations are “extended subsets”. JRuby is for the moment
the most complete subset and has more extensions, i.e., Java libraries,
an AOT compiler and all of the performance tuning that the JRuby team
has done. I haven’t heard much from the Parrot/Cardinal project
recently, but I’m guessing we’ll see IronRuby at close to the level of
JRuby by early next year, and Rubinius some time in the spring.

I didn’t say MRI, I said 1.8…and I said JRuby was “the most complete”,
not “complete”, so I think we basically said the same thing as far as
JRuby goes.

When you say “close to the level” do you mean performance-wise or
completion-wise?

Performance-wise, I wouldn’t be surprised to see IronRuby close to
current JRuby in the next six months; but then we’ll be another six
months on from here too. Rubinius may take a bit longer, since
performance is going to be a tough issue for them.

Completion-wise, Rubinius is way ahead of IronRuby, and may be a rough
tie with Ruby.NET. I would expect Rubinius to stay ahead as far as
API/language support for some time.

In our experience on JRuby, the last 10 or 5% of compatibility has been
by far the hardest, at times requiring rewrites of key subsystems.
Getting 90% complete is great, but it won’t run e.g. Rails. And we still
get occasional bug reports for things not working in Rails.

On another note…talking about performance before you can run apps is
mostly worthless, so it seems like it would be better for alternative
implementations to hold off reporting performance numbers before they
can run real apps.

I don’t think MRI is a dead end at all, considering the discussions
I’ve seen on this list just since I got back from RubyConf. I see people
seriously proposing re-doing the garbage collector, for example, and I
see other people investing a lot of effort in tweaking Rails to use Ruby
and the underlying OS more efficiently.

We shall see.

As far as I know, YARV/KRI is the only serious 1.9 implementation.

We haven’t started implementing 1.9 semantics yet; but I don’t expect it
will take more than a couple months when we do.

I do think that there is probably more excitement and interesting work
on YARV/KRI/1.9 than there is on MRI, or for that matter any of the MRI
extended subsets. But MRI is hardly a dead end IMHO.

You are entitled to your opinion. But perhaps “dead end” was a bit too
strong. How about “done”? I see little more than maintenance happening
on 1.8 in the future.

Charlie

rogerdpack · November 12, 2007, 12:36am

“Would that be useful to anyone? Would anyone use it?”

I don’t think I would use it (insofar as I don’t really need it, as I use
Ruby daily and thus more or less compile all Ruby add-ons using Ruby
scripts anyway), but it sure sounds like an interesting project and I would
be curious to know how far it (will) go.

“And any other personal tweaks that people contribute. Kind of a
bleeding edge Ruby.”

I think this will be a bit more interesting. I never really used facets,
but I for sure use a lot of my own methods etc… and if people all
stick to and improve on one point (like in facets), it could make life a
lot easier (and thus, the project interesting). From this latter aspect,
I am even more curious than about the other mentioned aspect (i.e. svn
checkout or GC).

rogerdpack · November 12, 2007, 12:40am

Rick DeNatale wrote:

All of which misses my point.

Unless and until extensions can be written which are portable against
Ruby, JRuby, IronRuby, Rubinius … the Ruby ‘market’ will be
fragmented.

Only as far as extensions go. I don’t advocate writing extensions;
rather, I advocate using rich in-language capabilities to call platform
libraries, as you can do in JRuby. You can extend JRuby without ever
having to write what would be considered an “extension” in MRI. If MRI
had a similar capability (or perhaps if DL were used more and were in better
shape) there would be less fragmentation, since anything
platform-specific could be wrapped by a nice Ruby library, and no C or
Java code would have to be shipped.
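
As a rough sketch of what calling a platform library from plain Ruby can look like in MRI, the snippet below uses Fiddle, the standard-library successor to DL. Fiddle postdates this thread, so treat its use here as an assumption rather than something the posters had in hand. The snippet binds the C library's `strlen` without compiling any extension. It assumes a Unix-like system where opening the running process with `Fiddle.dlopen(nil)` exposes libc symbols; `c_strlen` is a hypothetical helper name.

```ruby
require 'fiddle'

# Open the running process itself; on Unix-like systems this exposes
# the C library's symbols (an assumption about the platform).
libc = Fiddle.dlopen(nil)

# Describe `size_t strlen(const char *)` so Fiddle can marshal the call.
strlen = Fiddle::Function.new(
  libc['strlen'],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# A thin Ruby wrapper hides the binding from callers (hypothetical helper).
def c_strlen(function, string)
  function.call(string)
end

length = c_strlen(strlen, "enterprise")
```

Anything platform-specific stays behind the Ruby wrapper, so nothing has to be compiled or shipped as C.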

It’s the very fact that Ruby extensions are written in C and compiled
with C compilers to shared libraries with very specific constraints that
limits their general applicability to other implementations (and limits
evolution of core Ruby development as well).

So I would propose that what you’re saying is right…and that writing
C-based extensions by hand just aggravates the problem, since C-based
extensions are never going to be generally applicable to all
implementations.

Charlie

rogerdpack · November 12, 2007, 12:22am

On Nov 11, 2007 3:11 PM, Charles Oliver N. wrote:

This implies that it’s only valid to call something an “extension” if
it’s written in C. That’s a bit narrow. JRuby has far better support
than MRI for writing extensions in Java, for example. Does it mean that
JRuby is somehow less-capable than MRI if it can’t use C extensions? Of
course it doesn’t.

The only thing it means is that existing extensions written in C for MRI
won’t work in JRuby.

JRuby specific extension example omitted.

So I’d say it’s a matter of perspective.

Can you write extensions for JRuby? Yes. Can you write them in C? Not
easily.

Can you write extensions for MRI? Yes. Can you write them in Java? Not
easily.

All of which misses my point.

Unless and until extensions can be written which are portable against
Ruby, JRuby, IronRuby, Rubinius … the Ruby ‘market’ will be
fragmented.

–
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

rogerdpack · November 12, 2007, 1:14am

On Nov 11, 2007 6:39 PM, Charles Oliver N. wrote:

Rick DeNatale wrote:

All of which misses my point.

Unless and until extensions can be written which are portable against
Ruby, JRuby, IronRuby, Rubinius … the Ruby ‘market’ will be
fragmented.

It’s the very fact that Ruby extensions are written in C and compiled
with C compilers to shared libraries with very specific constraints that
limits their general applicability to other implementations (and limits
evolution of core Ruby development as well).

So I would propose that what you’re saying is right…and that writing
C-based extensions by hand just aggravates the problem, since C-based
extensions are never going to be generally applicable to all
implementations.

Well I never said that portable extensions needed to be written in C.

So maybe we should be rooting for Evan, and hope that we can soon be
writing extensions in a ruby-like version of Squeak’s Slang.

–
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

rogerdpack · November 12, 2007, 1:18am

Rick DeNatale wrote:

Well I never said that portable extensions needed to be written in C.

So maybe we should be rooting for Evan, and hope that we can soon be
writing extensions in a ruby-like version of Squeak’s Slang.

Maybe we shouldn’t be “rooting” for anyone, and we should instead be
lending a hand by trying to solve the problem ourselves.

Charlie

rogerdpack · November 12, 2007, 3:14am

I believe Rubinius actually “fakes” the MRI extensions API so your
Ruby extensions should work just fine on rbx.

(Evan, Yehuda, or someone else smarter than I please chime in if
that’s not true…)

–Jeremy

–
http://www.jeremymcanally.com/

My books:
Ruby in Practice

My free Ruby e-book

My blogs:

http://www.rubyinpractice.com/

rogerdpack · November 12, 2007, 4:33am

Rick DeNatale wrote:

So I would propose that what you’re saying is right…and that writing
C-based extensions by hand just aggravates the problem, since C-based
extensions are never going to be generally applicable to all
implementations.

Well I never said that portable extensions needed to be written in C.

So maybe we should be rooting for Evan, and hope that we can soon be
writing extensions in a ruby-like version of Squeak’s Slang.

Well … there are only a few reasons why one would write an extension
in the first place:

1. To work with a data type not native to the language. As far as I
   know, the only commonly-used data type that doesn’t have native
   operations in MRI is large multidimensional arrays of floating point
   (possibly complex) numeric data. So we need NArray and Ruby interfaces
   to C code like GSL, and Ruby becomes a “control language” rather than a
   data processing language.
2. To use existing functionality already written in another language.
   This is why most extensions get written, and since most of the
   functionality is in C/C++, most of the extensions are written in C. In
   jRuby, of course, rather than use existing functionality in C, one uses
   existing functionality written in Java. Again, Ruby becomes a “control
   language” rather than a data processing language.
3. To get more processor- or memory-efficient operation.
4. Because it’s open source and we can.

I think, of all these reasons, 4 is the only legitimate reason to
write an extension in a language other than Ruby. For 1, as I noted, as
far as I know there’s only one missing data type, and I think it could
be easily integrated into any of the implementations. It’s just a matter
of the community deciding that it should be there and coming up with a
syntax and semantics workable in all of the implementations.

For 2, given that Ruby operates on most platforms, it’s almost always
possible to accomplish 2 using “loose coupling” with little or no
penalty. For 3, I think YARV and jRuby have both shown that one can stay
in Ruby and still gain significant efficiency. And Rubinius has as its
goal minimizing the size of the C core without losing efficiency, and I
think they’re well on their way to that.

So that leaves 4 … because it’s open source and we can.

rogerdpack · November 12, 2007, 1:11pm

On Sun, 11 Nov 2007 22:32:14 -0500, M. Edward (Ed) Borasky wrote:

For 2, given that Ruby operates on most platforms, it’s almost always
possible to accomplish 2 using “loose coupling” with little or no
penalty.

Possible, yes. (Again, Turing-complete and all that.) But it depends what
you mean by “penalty”.

The very first thing I think of when I think “Ruby and native extensions”
is RMagick/ImageMagick for a web site. And that’s notoriously difficult to
build even in a C-based, MRI environment, although I understand that’s
supposed to be better now in 2.0 (I haven’t tried it yet).

What would the “right answer” be in JRuby? I just did some Googling, and
from what I can tell, there are a few ways you could go:

- JAI, which a bunch of people claim is too abstracted and buggy, and which
  a bunch of others claim is fine and the first people are whiners
- Mistral, which provides an opaque abstraction over JAI, probably making
  the first two bunches of people look like Red Sox and Yankees fans in a bar
  with Scottish soccer fans
- ImageMagick via JNI (!)
- ImageMagick via exec (!!)
- ImageJ, which is pure Java but apparently has a lot of problems running
  on a headless server due to its use of AWT; there’s now a headless version
  of the ImageJA fork, but apparently if you use any plugins you have to go
  stub out all their AWT calls as well, or something like that. Definitely
  not a mature solution yet.

None of these sound all that appealing. You talk about loose coupling, and
when I think loose coupling, I think messaging. Which sounds good, and
obviously lends itself to parallel computing and multicore and grids and
all that, but as the Java world certainly knows, wrapping something like
ImageMagick in a messaging server would be a huge rat’s-nest-infested
undertaking - and something that only someone with good experience in Ruby
AND Java AND image processing AND messaging could do.

As someone who’s running about 1.5 out of 4 there, I’d probably just go
create a DRb/Rinda/whatever server for the ImageMagick stuff and deal with
having both a Java and a C environment going. But I wouldn’t be very happy
about it, and it’s certainly not generalizable.
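
For reference, the DRb approach could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `ThumbnailService` and its `fit` method are hypothetical stand-ins for whatever the MRI-side process would do with RMagick, and the loopback address is arbitrary. Only the DRb calls themselves (`DRb.start_service`, `DRbObject.new_with_uri`) are real library API.

```ruby
require 'drb/drb'

# Stand-in for the process that owns the native image library.
# Hypothetical class: real code would call RMagick/ImageMagick here.
class ThumbnailService
  # Scale (width, height) to fit inside a max x max box,
  # keeping the aspect ratio and never enlarging.
  def fit(width, height, max)
    scale = [max.to_f / width, max.to_f / height, 1.0].min
    [(width * scale).round, (height * scale).round]
  end
end

# Server side (the MRI process). Port 0 asks the OS for any free port.
server = DRb.start_service('druby://127.0.0.1:0', ThumbnailService.new)

# Client side: normally a separate process that is handed the URI.
# In this single-process demo DRb may resolve the call locally.
remote = DRbObject.new_with_uri(server.uri)
size = remote.fit(1600, 1200, 400)   # [400, 300]

server.stop_service
```

The client only ever sees plain Ruby values, which is the appeal, but every new operation needs its own method on the service, which is the "not generalizable" part.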

You almost want some kind of messaging-oriented SWIG. Except SWIG itself
isn’t even that easy to use (every time I think about using it, I end up
drifting toward Ruby::Inline instead; I think you only get the SWIG
advantage if you really target multiple languages.) So you want a
messaging-oriented BETTER SWIG. Without any of the complexity of SWIG. Or
messaging.

Thoughts?

rogerdpack · November 12, 2007, 6:22pm

M. Edward (Ed) Borasky wrote:

I’m genuinely curious about this – MATLAB is a (very expensive closed
source) package that caters to numerical processing, including signal
and image processing. Can MATLAB process regular expressions? Can
Mathematica? Or does one need a two-language solution – MATLAB for the
number crunching and Perl, Python or Ruby for the data extraction?

String processing in MATLAB (Octave, really) is agony, so I tend to use
a hybrid of Perl and Octave. I’ve tried Ruby plus NArray/rb-gsl and
Perl plus PDL, but neither has Octave’s huge number of linear algebra
functions (SVD, least-squares, eigenvalues, …) or plotting support.
Compact arrays are crucial, but they’re only a small part of the story.

rogerdpack · November 12, 2007, 7:04pm

Jay L. wrote:

is RMagick/ImageMagick for a web site. And that’s notoriously difficult to
build even in a C-based, MRI environment, although I understand that’s
supposed to be better now in 2.0 (I haven’t tried it yet).

What would the “right answer” be in JRuby? I just did some Googling, and
from what I can tell, there are a few ways you could go:

The truth is there’s probably another dozen options beyond those listed
here, so it’s hard to say what the best approach would be. There’s even
a Java-based ImageMagick that either wraps or duplicates IM’s
functionality.

So I guess there’s an important introductory question to ask before we
would start investigating:

Is it ImageMagick’s exact specifications or general image processing we
want to provide? The former is obviously much harder, while the latter
can be done through any number of libraries.

I think you’d be hard-pressed to find any library in the C world that
doesn’t have a reasonably good equivalent in the Java world (sometimes
one that just wraps the library from the C world). So it’s certainly
possible and reasonable to say most extensions could be provided in a
similar-if-not-identical form under JRuby with just a bit of plumbing.
And thankfully JRuby’s integration with Java means that plumbing should
be less than the equivalent in the C world (i.e. the C world is fraught with
peril, as RMagick’s occasional instability has shown).

Charlie

rogerdpack · November 12, 2007, 7:39pm

Charles Oliver N. wrote:

implementation (YARV being most complete and furthest along…but we
have some 1.9 features in JRuby too).

As far as I know, only MRI is “100 percent MRI compatible”. The
…

I don’t think MRI is a dead end at all, considering the discussions
I’ve seen on this list just since I got back from RubyConf. I see people
seriously proposing re-doing the garbage collector, for example, and I
see other people investing a lot of effort in tweaking Rails to use Ruby
and the underlying OS more efficiently.

In retrospect, maybe it would be worth more long term investment for
1.9, but I still favor ‘tweaks’ for 1.8.6 as being useful.

So that would mean applying existing ‘useful’
patches/observations/tweaks to 1.8.6 (a small amount of work), and then
doing the big jobs on JRuby or 1.9. The reason I say this is that beyond
the already existing 1.8.6 patches (there’s a copy-on-write (COW)
friendly GC, and you can resize the heap chunks to keep memory low), I
don’t think we’ll get much more speed savings unless we drastically
alter the GC (i.e. go to generational or, my favorite, reference counting).
Therefore, if we did completely rewrite the GC to get those ‘real speed
boosts’, the usefulness would be short-lived, as 1.8 would be getting
phased out because of its slower overall speed compared to 1.9.

So if one were to re-write the GC, since it’s a large task, doing it for
1.9 makes sense. Small tweaks, though, don’t cost much to make
(tweaking parameters, not rewriting code), like GCC flag optimization
or smaller memory use.

I hate to say it but with this post, I’m operating under the assumption
that 1.9 is going to become the ‘most popular ruby interpreter’ and
hence worth investing time into. It’s possible that the time
investments should be made into JRuby, should it somehow swamp the
market. For now it seems that MRI 1.9 will be fastest, so most useful to
optimize. Any thoughts? Oh wait, this is like a self-fulfilling decision…

Also I can’t think of a cool JRuby project off the top of my head that
would be fun to ‘fix.’ – I’m unfamiliar with its bottlenecks.

Have a good one!
-Roger

rogerdpack · November 13, 2007, 6:39am

Sean S. wrote:

functions (SVD, least-squares, eigenvalues, …) or plotting support.
Compact arrays are crucial, but they’re only a small part of the story.

GSL doesn’t have SVD, least-squares and eigenvalues? Well then – you
need (pregnant pause) RSRuby!! It hooks into the whole R language shared
library, and I think it can get to the BLAS and LAPACK libraries in R
too. But I really thought LAPACK was part of GSL.

rogerdpack · November 12, 2007, 4:02pm

Jay L. wrote:

is RMagick/ImageMagick for a web site. And that’s notoriously difficult to
build even in a C-based, MRI environment, although I understand that’s
supposed to be better now in 2.0 (I haven’t tried it yet).

I actually thought about ImageMagick when I wrote that. Doesn’t
ImageMagick have a command line interface? Can’t Ruby (and jRuby)
execute command lines? That’s what I had in mind when I wrote “loose
coupling”.
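
A minimal sketch of that kind of loose coupling, using Ruby's standard `Open3` library. The `run_tool` helper is a hypothetical name, and the demo runs the Ruby interpreter itself so that it works without ImageMagick installed; the ImageMagick call in the comment uses made-up file names.

```ruby
require 'open3'
require 'rbconfig'

# Run an external program with an argument list (no shell involved) and
# return its standard output; raise if it exits with a non-zero status.
def run_tool(*argv)
  output, status = Open3.capture2(*argv)
  unless status.success?
    raise "#{argv.first} failed with status #{status.exitstatus}"
  end
  output
end

# Demo with the Ruby interpreter itself so the sketch runs anywhere.
# With ImageMagick installed, the same helper could be called as, e.g.:
#   run_tool('convert', 'in.png', '-resize', '50%', 'out.png')
answer = run_tool(RbConfig.ruby, '-e', 'print 6 * 7')
```

The cost is one process launch per call, plus passing data through files or pipes rather than in memory.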

[snip]

You almost want some kind of messaging-oriented SWIG. Except SWIG itself
isn’t even that easy to use (every time I think about using it, I end up
drifting toward Ruby::Inline instead; I think you only get the SWIG
advantage if you really target multiple languages.) So you want a
messaging-oriented BETTER SWIG. Without any of the complexity of SWIG. Or
messaging.

SWIG requires deep knowledge of the library you are attempting to engulf
in your scripting language. Indeed, the payoff is primarily if you want
one library to serve many scripting languages, not just Ruby. However,
if you want to declare Ruby the one true language, Ruby::Inline is the
way to go.

Thoughts?

Well … I’m sticking with “because it’s open source and we can” as the
best reason for extending a language. Implicit in that is a plea to
extend the language to the primitive data types necessary for, say,
signal and image processing, which are – surprise – multidimensional
arrays of numeric data packed contiguously in RAM.
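
To illustrate what "packed contiguously" means, here is a small pure-Ruby sketch that keeps a matrix of doubles in one binary String and indexes it in row-major order. `PackedMatrix` is a hypothetical name for illustration, not NArray's API, and a real implementation would do this in native code for speed.

```ruby
# A rows x cols matrix of doubles stored in one contiguous binary String.
# Hypothetical illustration only; this is not how NArray is implemented.
class PackedMatrix
  BYTES_PER_DOUBLE = 8

  attr_reader :rows, :cols, :bytes

  def initialize(rows, cols)
    @rows = rows
    @cols = cols
    # 'E' packs a little-endian double; every element starts as 0.0.
    @bytes = ([0.0] * (rows * cols)).pack('E*')
  end

  # Decode the 8 bytes that hold element (row, col).
  def [](row, col)
    @bytes.byteslice(offset(row, col), BYTES_PER_DOUBLE).unpack1('E')
  end

  # Overwrite the 8 bytes that hold element (row, col).
  def []=(row, col, value)
    @bytes[offset(row, col), BYTES_PER_DOUBLE] = [value.to_f].pack('E')
  end

  private

  # Row-major layout: element (r, c) lives at index r * cols + c.
  def offset(row, col)
    unless (0...rows).cover?(row) && (0...cols).cover?(col)
      raise IndexError, 'index out of bounds'
    end
    (row * cols + col) * BYTES_PER_DOUBLE
  end
end

m = PackedMatrix.new(2, 3)
m[1, 2] = 4.5
value = m[1, 2]              # 4.5
total = m.bytes.bytesize     # 2 * 3 * 8 = 48 bytes, no per-element objects
```

The whole matrix is one flat buffer, which is the layout C numeric libraries expect to be handed.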

I’m genuinely curious about this – MATLAB is a (very expensive closed
source) package that caters to numerical processing, including signal
and image processing. Can MATLAB process regular expressions? Can
Mathematica? Or does one need a two-language solution – MATLAB for the
number crunching and Perl, Python or Ruby for the data extraction?

rogerdpack · November 13, 2007, 2:15pm

M. Edward (Ed) Borasky wrote:

GSL doesn’t have SVD, least-squares and eigenvalues?

IIRC GSL has these things (and BLAS and LAPACK), but it lacks others,
such as NArray’s convenient matrix slice-and-dice syntax. RSRuby has
lots of great statistical routines, but probably can’t share data with
NArray or GSL. Each solves its own part of the problem well, but
someone still needs to sit down and write efficient (non-copying) and
transparent glue between them. A Matlab transition cheat-sheet would
also help (Numpy has one of these).

rogerdpack · November 14, 2007, 4:39am

M. Edward (Ed) Borasky wrote:

As far as sharing data is concerned, to keep everybody’s garbage
collectors and memory allocators happy and segfault-free, you probably
need to do explicit transfers of data between the various packages,

This is a significant barrier. Having to marshal and unmarshal your
data whenever you go from one domain to another is annoying. It’s a
large part of what we’re trying to avoid in moving away from a
multi-language solution. I think Ruby’s got the right ingredients to
pull this stuff together: dynamism, easy C interaction, and flexible and
unobtrusive syntax. But someone needs to sit down and do some Serious
Work before it’s a viable Matlab replacement. I’ve made a couple of
half-hearted tries, but it’s hard to make progress piece-by-piece when
you’re trying to generate results right-this-minute.

rogerdpack · November 13, 2007, 3:26pm

Sean S. wrote:

M. Edward (Ed) Borasky wrote:

GSL doesn’t have SVD, least-squares and eigenvalues?

IIRC GSL has these things (and BLAS and LAPACK), but it lacks others,
such as NArray’s convenient matrix slice-and-dice syntax. RSRuby has
lots of great statistical routines, but probably can’t share data with
NArray or GSL. Each solves its own part of the problem well, but
someone still needs to sit down and write efficient (non-copying) and
transparent glue between them. A Matlab transition cheat-sheet would
also help (Numpy has one of these).

There is an Octave-to-R cheat sheet available at
http://cran.r-project.org/doc/contrib/R-and-octave.txt

As far as sharing data is concerned, to keep everybody’s garbage
collectors and memory allocators happy and segfault-free, you probably
need to do explicit transfers of data between the various packages,
rather than trying to maintain your sanity and pass pointers at the same
time. Depending on the size of the datasets involved, though, you
might want to look at something like SQLite files. They’re fast, look
like an RDBMS, and both Ruby and R can talk to them. And if you have
enough RAM, the OS will buffer them for you. Of course, the datasets I
deal with are so large I have to put them in PostgreSQL anyhow.