Forum: Ruby ruby bounties--list of bounties

Roger Pack (Guest)
on 2010-01-21 07:13
(Received via mailing list)
Fifth time's a charm (durn spam filters)

========Announcing the creation of a "ruby bounty" list=============

Purpose: connect those who want to work on side projects with those who want
to support them.

http://wiki.github.com/rdp/ruby_bounties/ruby-bounties

Is meant to be a place where you can list projects you'd like to see
completed and would be willing to help pay for.

And also a place for people looking for something to do in their spare
time to go.

(Ok, ok so it's mostly a list of projects I wish I could work on, but
don't have time for, but thought I'd try to make a community resource,
if others want to add to the list or what not.)

Enjoy.

-rp
Mark T (Guest)
on 2010-01-21 08:54
(Received via mailing list)
Sure would like to be able to afford a bounty on a Ruby (framework
agnostic) OWASP ESAPI.

http://www.owasp.org/index.php/Category:OWASP_Ente...

MarkT
Roger Pack (rogerdpack)
on 2010-01-21 18:30
> Sure would like to be able to afford a bounty on a Ruby (framework
> agnostic) OWASP ESAPI.
>
> http://www.owasp.org/index.php/Category:OWASP_Ente...
>
> MarkT

You could add just like a $10 one to the list--I'll chip in a few bucks.

-r
Charles Nutter (headius)
on 2010-01-21 21:57
(Received via mailing list)
On Thu, Jan 21, 2010 at 12:07 AM, Roger Pack <rogerdpack2@gmail.com>
wrote:
> Fifth time's a charm (durn spam filters)
>
> ========Announcing the creation of a "ruby bounty" list=============

I added a couple more for JRuby:

JRuby C Extension support - we have an early start at such a library
(http://github.com/wmeissner/jruby-cext) but we need more C hackers
willing to help us build it out. We will only support the "safe" C
functions that don't let you manipulate object internals directly or
get direct access to pointers.

Pure-Java Nokogiri - Just what it sounds like...for places where
libxml isn't available or native libraries are forbidden.

Dive in :)

- Charlie
Roger Pack (rogerdpack)
on 2010-01-24 06:49
> Pure-Java Nokogiri - Just what it sounds like...for places where
> libxml isn't available or native libraries are forbidden.
>
> Dive in :)

Looks like pure java Nokogiri is something popular--the bounty on it has
already risen to $225

-r
Charles Nutter (headius)
on 2010-01-24 13:16
(Received via mailing list)
On Sat, Jan 23, 2010 at 11:49 PM, Roger Pack <rogerpack2005@gmail.com>
wrote:
> Looks like pure java Nokogiri is something popular--the bounty on it has
> already risen to $225

It's probably the most oft-encountered stumbling block for folks using
JRuby (these days), since Nokogiri itself has become very popular and
is now depended on by many other libraries. A pure-Java version would
never need special handling on any platform, would work on any
platform where JRuby works, and would not require native library
support at all.

I implore gem authors: think about who you might hurt with hard gem
dependencies on native extensions. At least provide an alternative
path.
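
One way to offer the "alternative path" asked for above is to try the native extension first and fall back to a pure-Ruby implementation when it cannot be loaded. The sketch below is only an illustration: `fast_checksum_ext` and `FastChecksum` are hypothetical names invented for this example, not real gems or APIs.

```ruby
# Hypothetical sketch: prefer a native extension, fall back to pure Ruby.
# "fast_checksum_ext" is an invented name; it is not a real gem.
begin
  require "fast_checksum_ext"
  CHECKSUM_BACKEND = :native
rescue LoadError
  # Pure-Ruby fallback exposing the same interface the extension would.
  module FastChecksum
    def self.sum(string)
      string.each_byte.sum % 65_536
    end
  end
  CHECKSUM_BACKEND = :pure_ruby
end

puts "backend: #{CHECKSUM_BACKEND}, checksum: #{FastChecksum.sum("abc")}"
```

Callers only ever see `FastChecksum.sum`, so a platform without a compiler or without permission to load native code still gets a working, if slower, library.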

- Charlie
Tony Arcieri (Guest)
on 2010-01-24 15:59
(Received via mailing list)
On Sat, Jan 23, 2010 at 10:49 PM, Roger Pack <rogerpack2005@gmail.com>
wrote:

> Looks like pure java Nokogiri is something popular--the bounty on it has
> already risen to $225
>

$250 now :)
Ammar Ali (ammar)
on 2010-01-24 22:55
(Received via mailing list)
Tony Arcieri wrote:
>
Does a pure java anything qualify as a ruby bounty? Or is it a java
bounty now? Maybe a ruby-envy bounty? :)

ammar
Tony Arcieri (Guest)
on 2010-01-24 22:58
(Received via mailing list)
On Sun, Jan 24, 2010 at 2:55 PM, Ammar Ali <ammarabuali@gmail.com>
wrote:

> Does a pure java anything qualify as a ruby bounty? Or is it a java bounty
> now? Maybe a ruby-envy bounty? :)
>

The goal is a version of Nokogiri without any native code dependencies which
runs entirely within the JVM.  That doesn't mean it's written in Java or
even necessarily includes any Java code at all: it could be pure Ruby
interfacing with the native Java XML libraries.
Charles Nutter (headius)
on 2010-01-25 04:34
(Received via mailing list)
On Sun, Jan 24, 2010 at 10:55 PM, Ammar Ali <ammarabuali@gmail.com>
wrote:
> Does a pure java anything qualify as a ruby bounty? Or is it a java bounty
> now? Maybe a ruby-envy bounty? :)

Considering most of the bounties are about C extensions to MRI, you're
about as far off base as you could possibly be. :)

- Charlie
Aaron Patterson (Guest)
on 2010-01-25 06:29
(Received via mailing list)
On Sun, Jan 24, 2010 at 09:15:56PM +0900, Charles Oliver Nutter wrote:
> On Sat, Jan 23, 2010 at 11:49 PM, Roger Pack <rogerpack2005@gmail.com> wrote:
> > Looks like pure java Nokogiri is something popular--the bounty on it has
> > already risen to $225
>
> It's probably the most oft-encountered stumbling block for folks using
> JRuby (these days), since Nokogiri itself has become very popular and
> is now depended on by many other libraries. A pure-Java version would
> never need special handling on any platform, would work on any
> platform where JRuby works, and would not require native library
> support at all.

I thought FFI was supposed to solve this problem.  Is it not?

> I implore gem authors: think about who you might hurt with hard gem
> dependencies on native extensions. At least provide an alternative
> path.

I implore Ruby implementors to support the MRI C api, as it too is part
of Ruby's api.  Think about who you hurt by not letting people reuse
valuable libraries written in C.  :-)
Mike Dalessio (Guest)
on 2010-01-25 07:11
(Received via mailing list)
On Sun, Jan 24, 2010 at 7:15 AM, Charles Oliver Nutter
<headius@headius.com> wrote:

> support at all.
>
> I implore gem authors: think about who you might hurt with hard gem
> dependencies on native extensions. At least provide an alternative
> path.
>

I'm in for $200 for a Java Nokogiri implementation. So is my partner in
crime, Aaron.

Consider it "conscience money" for all the puppies who died as a result of
us writing and maintaining the JRuby FFI port of Nokogiri. :)
Phillip Gawlowski (Guest)
on 2010-01-25 08:30
(Received via mailing list)
On 25.01.2010 06:29, Aaron Patterson wrote:

> I implore Ruby implementors to support the MRI C api, as it too is part
> of Ruby's api.

Either provide mswin32 (compiled with VC6), or mingw32 binaries (yes,
binaries, not the source code), or use Java.
Phillip Gawlowski (Guest)
on 2010-01-25 08:38
(Received via mailing list)
On 25.01.2010 08:29, Phillip Gawlowski wrote:
> On 25.01.2010 06:29, Aaron Patterson wrote:
>
>> I implore Ruby implementors to support the MRI C api, as it too is part
>> of Ruby's api.
>
> Either provide mswin32 (compiled with VC6), or mingw32 binaries (yes,
> binaries, not the source code), or use Java.

"mswin32 *and* mingw32" is what I actually wanted to write. My
apologies.
Tony Arcieri (Guest)
on 2010-01-25 09:11
(Received via mailing list)
On Sun, Jan 24, 2010 at 10:29 PM, Aaron Patterson <
aaron@tenderlovemaking.com> wrote:

> I thought FFI was supposed to solve this problem.  Is it not?
>

It "solves" it to a certain degree, but when deploying in enterprise
environments it can be quite a hassle.  Think: ancient versions of RHEL with
ancient versions of libxml/libxslt.  That's not to mention problems with
32-bit vs 64-bit environments: for some reason the 64-bit version of RedHat
didn't package a 32-bit libxslt, but they did provide a 64-bit one.
However, we originally installed a 32-bit JRE.  Eep!  64-bit JRE is an
"unofficially supported" configuration for our environment (they typically
run 32-bit JRE on 64-bit RedHat for some reason).

Nokogiri's great and we have everything working now (we had to do some
symlink hackery though).  However, it would definitely streamline the
deployment process a lot if we didn't have native code dependencies, and
right now Nokogiri is the only one we have.
Eleanor McHugh (Guest)
on 2010-01-25 13:37
(Received via mailing list)
On 25 Jan 2010, at 05:29, Aaron Patterson wrote:
> On Sun, Jan 24, 2010 at 09:15:56PM +0900, Charles Oliver Nutter wrote:
>> I implore gem authors: think about who you might hurt with hard gem
>> dependencies on native extensions. At least provide an alternative
>> path.
>
> I implore Ruby implementors to support the MRI C api, as it too is part
> of Ruby's api.  Think about who you hurt by not letting people reuse
> valuable libraries written in C.  :-)

I implore Ruby developers to write in Pure Ruby and demand all these
Ruby Implementors solve their "performance" problems ;p


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
----
raise ArgumentError unless @reality.responds_to? :reason
Charles Nutter (headius)
on 2010-01-25 13:38
(Received via mailing list)
On Mon, Jan 25, 2010 at 6:29 AM, Aaron Patterson
<aaron@tenderlovemaking.com> wrote:
> I thought FFI was supposed to solve this problem.  Is it not?

As Tony mentioned, it solves the problem just fine, except that you
still have to have the right libraries available and permission to use
them. Lots of production deployment environments for Java disallow
loading native libraries because they have a much higher likelihood of
crashing the JVM or circumventing its security policies. Google App
Engine, for example, doesn't allow loading native libraries at all,
and of course any small profile Java environments will often be unable
to use native libs as well (like Android).

> I implore Ruby implementors to support the MRI C api, as it too is part
> of Ruby's api.  Think about who you hurt by not letting people reuse
> valuable libraries written in C.  :-)

The MRI C API is terrible if you want to implement it for anything
other than MRI. It allows direct pointer access to object internals,
allows you to *write* to objects directly in memory, has absolutely no
consideration for concurrent execution, no abstraction around object
locations in a relocating GC, and no methods or standards for handling
reference lifecycle (that I am aware of). As I've stated before (and
told you) we'd be happy to support a C API that did not include all
the unsafe bits (like accessing array, string, or hash contents
*directly*) and only allows you access to objects via a handle-based
indirection mechanism. It's just a matter of time and effort to
implement it...and there's a bounty to do so :)
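
To make "handle-based indirection" concrete, here is a rough Ruby model of the idea: extension code is given an opaque integer handle rather than a pointer, so the runtime stays free to move or replace the underlying object. `HandleTable` and its method names are invented for this sketch and do not correspond to any actual JRuby or MRI API.

```ruby
# Hypothetical model of a handle table; all names here are illustrative.
class HandleTable
  def initialize
    @slots = {}
    @next_handle = 0
  end

  # Hand out an opaque integer instead of a raw pointer.
  def register(object)
    @next_handle += 1
    @slots[@next_handle] = object
    @next_handle
  end

  # Every access goes through the table, never through a cached address.
  def resolve(handle)
    @slots.fetch(handle) { raise ArgumentError, "unknown or released handle: #{handle}" }
  end

  # Simulate a relocating GC: the object moves (here, a copy), the handle stays valid.
  def relocate(handle)
    @slots[handle] = resolve(handle).dup
    handle
  end

  # Explicit end of the reference's lifecycle.
  def release(handle)
    @slots.delete(handle)
    nil
  end
end

table = HandleTable.new
handle = table.register("hello")
table.relocate(handle)
puts table.resolve(handle)
table.release(handle)
```

The cost is one extra lookup per access; the benefit is that nothing outside the runtime ever holds an address that a moving collector could invalidate.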

Even then, however, it still wouldn't solve the problems of loading
native libraries on secure environments.

I firmly believe the MRI's C API is the #1 thing holding it back. Why
can't we get a better GC in place? Native extensions. Why can't we
have real concurrent threading? Native extensions. Why is it a pain in
the ass to use MRI on environments without a compiler easily
available, like Windows? Native extensions. When used with plain Java
libraries, JRuby works almost identically across all platforms without
modification, without build tools present, and without segfaults. That
beats supporting the current unsafe C API any day.

- Charlie
Mike Dalessio (Guest)
on 2010-01-25 18:25
(Received via mailing list)
On Mon, Jan 25, 2010 at 7:37 AM, Charles Oliver Nutter
<headius@headius.com> wrote:

> and of course any small profile Java environments will often be unable
> to use native libs as well (like Android).
>

Charlie, you're making a great case against using FFI.

It sounds like you're recommending that gems containing non-ruby code, like
Nokogiri, need to be written twice: once in C, and once in Java. Is that an
accurate interpretation?

(Unless, of course, you take Eleanor's advice and do everything in pure
Ruby. Hello, REXML! :-D )
Charles Nutter (headius)
on 2010-01-25 19:12
(Received via mailing list)
On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio@gmail.com>
wrote:
> Charlie, you're making a great case against using FFI.

FFI is much better than writing any C code at all, due to the
security, stability, and portability problems of writing your own C
bindings. If you are permitted to load a given library and that
library is available and you *must* use that library, FFI is the only
logical choice. But it doesn't get around the fact that you need the
library you're binding to be available and loadable on your target
platform. FFI > C bindings, but [platform-independent binary] > FFI.
And that usually means Java-based.
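
For readers who have not seen this style of binding, the sketch below calls a C function from Ruby without writing any C. It uses Fiddle, which ships with Ruby, as a stand-in for the FFI library discussed in this thread; the API differs but the constraint is the same: the target library has to be present and loadable. The library name in the second half is deliberately made up.

```ruby
require "fiddle"

# Bind strlen(3) from the C library already loaded into this process.
libc = Fiddle.dlopen(nil)
strlen = Fiddle::Function.new(
  libc["strlen"],          # address of the symbol
  [Fiddle::TYPE_VOIDP],    # argument types: one pointer
  Fiddle::TYPE_SIZE_T      # return type
)
puts strlen.call("hello")

# No amount of binding code helps when the library cannot be loaded.
def library_loadable?(name)
  Fiddle.dlopen(name)
  true
rescue Fiddle::DLError
  false
end

# "libdefinitely_not_installed.so" is an invented name used to force a failure.
puts library_loadable?("libdefinitely_not_installed.so")
```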

I should also point out that you don't necessarily have to write JVM
libraries in Java; you could also use Scala or Fan or similar
languages, and it would be just as portable (albeit a bit larger due
to the runtime dependency on those languages' runtime libraries).

But yes, at the end of the day, I believe writing stuff in a portable
binary format like JVM bytecode (or CLR bytecode) is a better choice
than writing in a language that has to be recompiled for every target
system. You ought to know that already...would I be working on JRuby
if I believed any differently? :)

And yes...I'd love to be able to recommend that everyone just use Ruby
for everything. But I don't think it's simply a performance issue;
there's some pretty amazing things you can get for free with a rich
static type system.

- Charlie
Aaron Patterson (Guest)
on 2010-01-25 19:54
(Received via mailing list)
On Tue, Jan 26, 2010 at 03:12:17AM +0900, Charles Oliver Nutter wrote:
> On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio@gmail.com> wrote:
> > Charlie, you're making a great case against using FFI.
>
> FFI is much better than writing any C code at all, due to the
> security, stability, and portability problems of writing your own C
> bindings.

References please.

Last I checked, it was just as easy to segv from an FFI library as a C
library.  Plus with FFI you don't get any benefits of compile time
checks.  You can't, for example, check for #define constants.

With FFI you must:

1. Duplicate header files (see below for more problems)
2. Understand struct layouts and the sizeof() for each member
3. Do runtime checking of library features
4. Worry about weak ref maps when using void pointers (see the id2ref
   problem in nokogiri)
5. Pay a runtime conversion price from ruby data types to FFI types
6. Educate users on LD_LIBRARY_PATH
7. Worry about 32bit and 64bit issues (like Tony mentioned)

The duplication of header files becomes an even larger problem if the
library you're wrapping changes its struct layout.  Where a simple
recompile would have solved the problem, now (without warning) you're getting
surprising values in your FFI program.  Plus typical debugging tools
like gdb get you nowhere.

Example:

Library "foo" ships with a struct like this:

    struct awesome {
      float hello;
      char * world;
    };

Then later changes to:

    struct awesome {
      char * world;
      float hello;
    };

You wrapped the first one, upgrade the library, then boom.  It doesn't work.
With a compiled program, you wouldn't care.
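
The failure mode can be reproduced in plain Ruby by treating the struct as raw bytes. This is a simplified simulation, not a real FFI binding: a 32-bit integer stands in for the `char *` so the result does not depend on pointer size or padding, and the layout strings are `Array#pack` directives chosen for this example.

```ruby
# Simplified stand-in for struct awesome: uint32 replaces char* (assumption
# made for this sketch so the byte layout is the same on every platform).
V1_LAYOUT = "eV" # float hello; uint32 world;  (what the stale wrapper assumes)
V2_LAYOUT = "Ve" # uint32 world; float hello;  (what the upgraded library writes)

# Bytes as the upgraded library would lay them out in memory.
bytes_from_new_library = [42, 1.5].pack(V2_LAYOUT)

# The stale wrapper decodes with the old field order and silently gets garbage.
stale_hello, stale_world = bytes_from_new_library.unpack(V1_LAYOUT)

# A wrapper updated to the new field order decodes the same bytes correctly.
world, hello = bytes_from_new_library.unpack(V2_LAYOUT)

puts "stale wrapper:   hello=#{stale_hello} world=#{stale_world}"
puts "updated wrapper: hello=#{hello} world=#{world}"
```

Nothing raises in the stale case, which is the point being made: the mismatch shows up only as surprising values at runtime.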

Unfortunately, none of the problems I've just listed off are
theoretical.  I have personally run into every one of them and can
provide you with real world examples.  FFI is awesome for certain,
confined, small, stable use cases.  I use FFI, and I enjoy it.  But
saying that it's "the only logical choice" seems wrong.

I am curious what your experience has been, and why you haven't run into the
same problems?  How do other people overcome these issues?
Mike Dalessio (Guest)
on 2010-01-25 20:15
(Received via mailing list)
On Mon, Jan 25, 2010 at 1:12 PM, Charles Oliver Nutter
<headius@headius.com> wrote:

> platform. FFI > C bindings, but [platform-independent binary] > FFI.
> system. You ought to know that already...would I be working on JRuby
> if I believed any differently? :)
>

I agree with everything you're saying, more or less.

However, none of that relates at all to what I think is the crux of the
issue, which is that everyone writing a non-pure-Ruby gem today is forced to
choose one of these options:

1) Support nearly everyone by maintaining two ports of your code: FFI for
JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
2) Support everyone by maintaining two ports of your code: JVM for JRuby and
GAE; C for MRI, Rubinius and MacRuby.
3) Maintain only a single port, FFI, and force everyone on MRI to take a
performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
GAE.
4) Don't support JRuby or GAE. Just write it in C.
5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.
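
The five options can be written down as data and printed as a grid. The table below is one reading of the list above (for example, it assumes a JVM port implies GAE support, as option 2 suggests); it is not a copy of the linked gist, and the option labels are shorthand invented for this sketch.

```ruby
RUNTIMES = %w[MRI Rubinius MacRuby JRuby GAE].freeze

# Which runtimes each option ends up supporting, as read from the list above.
OPTIONS = {
  "1) FFI + C"  => %w[MRI Rubinius MacRuby JRuby],
  "2) JVM + C"  => %w[MRI Rubinius MacRuby JRuby GAE],
  "3) FFI only" => %w[MRI JRuby],
  "4) C only"   => %w[MRI Rubinius MacRuby],
  "5) JVM only" => %w[JRuby GAE]
}.freeze

# Render a plain-text grid: one row per option, one column per runtime.
def support_matrix(options, runtimes)
  header = format("%-12s", "option") + runtimes.map { |r| format("%-9s", r) }.join
  rows = options.map do |name, supported|
    cells = runtimes.map { |r| format("%-9s", supported.include?(r) ? "yes" : "-") }
    format("%-12s", name) + cells.join
  end
  [header, *rows].join("\n")
end

puts support_matrix(OPTIONS, RUNTIMES)
```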

Complicated? Yes. I've summed it all up in a nice matrix here:
http://gist.github.com/286126

I personally think these choices all suck, and I refuse to paint a happy
face on any of them.

We chose option 1 for Nokogiri (you're welcome, intarnets), but everyone
who's writing a gem today has to make this decision for themselves.

My point is that any of these choices contains a tradeoff, and stating that
one in particular "hurts" people more than another is just disingenuous. I'd
rather help people understand the tradeoffs.
Chuck Remes (cremes)
on 2010-01-25 20:18
(Received via mailing list)
On Jan 25, 2010, at 1:12 PM, Mike Dalessio wrote:

>> library is available and you *must* use that library, FFI is the only
>> But yes, at the end of the day, I believe writing stuff in a portable
> choose one of these options:
>
> 1) Support nearly everyone by maintaining two ports of your code: FFI for
> JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
> 2) Support everyone by maintaining two ports of your code: JVM for JRuby and
> GAE; C for MRI, Rubinius and MacRuby.
> 3) Maintain only a single port, FFI, and force everyone on MRI to take a
> performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
> GAE.
> 4) Don't support JRuby or GAE. Just write it in C.
> 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.

FFI originated with rubinius, so I would wager that it will work once
the FFI APIs get synched up again. Also, MacRuby has FFI support on its
roadmap. That changes your picture a bit.

cr
Aaron Patterson (Guest)
on 2010-01-25 20:31
(Received via mailing list)
On Tue, Jan 26, 2010 at 04:17:32AM +0900, Chuck Remes wrote:
> >> FFI is much better than writing any C code at all, due to the
> >> languages, and it would be just as portable (albeit a bit larger due
> >
> > GAE.
> > 4) Don't support JRuby or GAE. Just write it in C.
> > 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.
>
> FFI originated with rubinius, so I would wager that it will work once the FFI APIs get
> synched up again. Also, MacRuby has FFI support on its roadmap. That changes your picture
> a bit.

Rubinius implements enough of the MRI C api that it will run Nokogiri
today.  MacRuby will follow suit, and I expect that to happen sooner
than it supports FFI (though this is conjecture).  With minor tweaks to your C
code, you can have a native extension that runs on all three *today*.
Mike Dalessio (Guest)
on 2010-01-25 20:35
(Received via mailing list)
On Mon, Jan 25, 2010 at 2:17 PM, Chuck Remes <cremes.devlist@mac.com>
wrote:

> >>
> >> libraries in Java; you could also use Scala or Fan or similar
> > I agree with everything you're saying, more or less.
> > GAE; C for MRI, Rubinius and MacRuby.
> > 3) Maintain only a single port, FFI, and force everyone on MRI to take a
> > performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
> > GAE.
> > 4) Don't support JRuby or GAE. Just write it in C.
> > 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.
>
> FFI originated with rubinius, so I would wager that it will work once the
> FFI APIs get synched up again. Also, MacRuby has FFI support on its roadmap.
> That changes your picture a bit.
>

If you're interested in helping out in standardizing the FFI specs, please
subscribe to the ruby-ffi list and offer to help out! We're always looking
for extra hands, because the specs are not in good shape right now. So I'm
likely to take your wager. ;)

I stand by the chart as an accurate reflection of the options that
developers are forced to choose from today and for the likely near future.
Chuck Remes (cremes)
on 2010-01-25 20:53
(Received via mailing list)
On Jan 25, 2010, at 1:34 PM, Mike Dalessio wrote:

> likely to take your wager. ;)
>
> I stand by the chart as an accurate reflection of the options that
> developers are forced to choose from today and for the likely near future.

While it may be true that *some* C extensions work with rubinius and
MacRuby today, I'd say it doesn't matter much in the long term.

For one, Rubinius does not support the entire MRI C API nor will it
ever. Extensions that directly access memory structures are not
supported. FFI is a better long-term choice for Rubinius.

MacRuby is months away from catching up to Rubinius, JRuby or IronRuby
for handling straight ruby code. I don't mean to disparage MacRuby (it
will likely be my go-to-guy for future Cocoa apps) but it ain't ready
for prime time for *ruby* code let alone hooking in C extensions. And
like Rubinius, it won't support all of the MRI C API.

IronRuby does not support any C extensions though it's on the roadmap. I
don't know for certain how extensive their support will be, but I will
*wager* they'll avoid supporting the same elements that Rubinius and
MacRuby are avoiding. :)

So for the likely near future (next 6 months), Rubinius is the only one
that might be able to run a random C extension (as long as it doesn't
use unsafe direct access to memory structures).

I understand what you are saying, truly I do. But I disagree that it is
important to continue building extensions using the C API for the *long*
term. The best way to get FFI firmed up and ready for prime-time is to
port existing extensions to it.

cr
Eleanor McHugh (Guest)
on 2010-01-25 21:51
(Received via mailing list)
On 25 Jan 2010, at 19:12, Mike Dalessio wrote:
> Complicated? Yes. I've summed it all up in a nice matrix here:
> http://gist.github.com/286126
>
> I personally think these choices all suck, and I refuse to paint a happy
> face on any of them.

I have to agree, which is why I mostly seem to end up describing how to
break things via dynamic loading - although I'll admit it's also a lot
of fun :)

Frankly though there is no general case solution which can satisfy all
of the needs of both the Java/Enterprise world and C hackers. Every time
we make the choice to use a third-party library written in anything
other than Ruby as a core dependency of our projects we tie ourselves to
a specific runtime environment as surely as if we were relying on some
custom assembler code and that's just something to accept and move on.

It's maddening, but it's a fact that programmers the world over already
live with on a daily basis.

Last year I spent a fair chunk of time giving lightweight lectures about
Unix abuse from Ruby for those new to the hobby. Many of the techniques
I was keen to demonstrate either won't work on other platforms or do so
unstably, but so what? If I'm writing for a Windows box I already know
that and I'll design things differently.

The same principle applies to JRuby. It *can* run arbitrary C libraries
via FFI if they're present on the target platform but if they're not
it's exactly as stymied as MRI or Rubinius or MacRuby would be in the
same situation. Runtime environment is more than just processor
architecture or operating system and not to take account of that in
deployed code is the fault of the programmer concerned not the team who
developed the runtime implementation.

Now I've often facetiously suggested in this list that all our code
should be developed in Ruby. The main reason I suggest that is that we
often rush to utilise code in other languages without considering its
real as opposed to perceived cost, not only in terms of development
effort and runtime performance but also of longterm maintenance.

Synthetic benchmarks tell us sweet FA about real world performance of
code, architecture being a much more significant consideration than the
proportion of raw MIPS a given language will deliver on a given
platform. The average netbook could happily run all of Teller's fusion
bomb models along with the full telemetry analysis of all the Apollo
missions in the pauses between loading XKCD comics and binning junk mail
without the user being any the wiser.

But architecture is also the primary determinant of how maintainable a
given application will be and whether it'll scale to suit future needs.

The main reason we're not using Ruby for everything is that the
architecture of the reference implementation is a relatively poor match
for the underlying hardware on which our programs run and so a lot of
translation work is being handled automagically (and inefficiently).
Rather than wasting our time arguing over defects we can't fix (such as
not all platforms having access to a given native library) we should be
fixing that core deficit and developing Ruby runtimes that unlock the
level of performance we want from our language. Then more and more
libraries will deliver high performance in pure Ruby and runtime library
issues should become irrelevant.

So far I see most of the work capable of delivering this (such as a
decent abstract Intermediate Language for peep-hole optimisation) coming
from the JRuby team. If the rest of us poured a fraction of the effort
into similar efforts for MRI and other implementations that's expended
on making [FFI|DL|C] API wrappers of existing C libraries then Ruby may
stop being the slow relative of Python and start to compete as what it's
fully capable of being - a systems language.

I have several long rants on this subject that I'll spare anyone who's
not stuck in a bar with me (and is willing to keep the beer flowing, you
know who you are lol) but at the very least Ruby needs: a parallelised
library implementation to seamlessly (i.e. without programmer
intervention) exploit multicore hardware and multithreaded operating
systems; 'unsafe' access to raw memory and kernel event mechanisms for
higher-performance data structures and IO; and a register-based and
JIT-friendly virtual machine so runtime code can be translated to
efficient machine code.

These are the basic architectural building blocks that would make the
need to rely on libraries in C, Java or any other language much rarer.


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
----
raise ArgumentError unless @reality.responds_to? :reason
Aaron Patterson (Guest)
on 2010-01-25 21:58
(Received via mailing list)
On Tue, Jan 26, 2010 at 04:52:13AM +0900, Chuck Remes wrote:
> > If you're interested in helping out in standardizing the FFI specs, please
> > subscribe to the ruby-ffi list and offer to help out! We're always looking
> > for extra hands, because the specs are not in good shape right now. So I'm
> > likely to take your wager. ;)
> >
> > I stand by the chart as an accurate reflection of the options that
> > developers are forced to choose from today and for the likely near future.
>
> While it may be true that *some* C extensions work with rubinius and MacRuby today, I'd
> say it doesn't matter much in the long term.
>
> For one, Rubinius does not support the entire MRI C API nor will it ever. Extensions
> that directly access memory structures are not supported. FFI is a better long-term choice
> for Rubinius.

It doesn't need to support the entire API.  It supports enough of the C
API to get nokogiri running, and believe me, we use a *lot* of the C
API.  Why pay the FFI speed penalty when you can write C code that works
cross implementation?

> MacRuby is months away from catching up to Rubinius, JRuby or IronRuby for handling
> straight ruby code. I don't mean to disparage MacRuby (it will likely be my go-to-guy for
> future Cocoa apps) but it ain't ready for prime time for *ruby* code let alone hooking in
> C extensions. And like Rubinius, it won't support all of the MRI C API.

Again, it doesn't need to support the entire C api.

> IronRuby does not support any C extensions though it's on the roadmap. I don't know for
> certain how extensive their support will be, but I will *wager* they'll avoid supporting
> the same elements that Rubinius and MacRuby are avoiding. :)
>
> So for the likely near future (next 6 months), Rubinius is the only one that might be
> able to run a random C extension (as long as it doesn't use unsafe direct access to memory
> structures).
>
> I understand what you are saying, truly I do. But I disagree that it is important to
> continue building extensions using the C API for the *long* term. The best way to get FFI
> firmed up and ready for prime-time is to port existing extensions to it.

As I pointed out in an earlier email, dealing with FFI wrapped libraries is
error prone, difficult to debug (not just during development, but also when
helping people get things installed), doesn't work cross implementation,
requires id2ref (the bane of Charlie's existence. I'm sorry.  :-( ),
etc.  I even have real world examples of *all* of the issues I pointed out.

Even if FFI were the cross implementation messiah it's supposed to be,
our FFI applications will *still* not work on GAE or Android.  Rubinius
has already proved that you can implement a *subset* of the C API and
get complex extensions to work.  Why can't we run with that?  I think it
would be a better long term solution.  We would get the same "cross
implementation" behavior as FFI, but not have to pay FFI's runtime
conversion penalties.  We also get the ability to do compile time checks
of C library functionality (i.e. check for #defines, function existence,
etc).

People keep saying that FFI is the better way to go, but as someone who
has to support both an FFI version and a C version, I can tell you the
support / development problems with FFI are much more difficult.
Chuck Remes (cremes)
on 2010-01-25 22:20
(Received via mailing list)
On Jan 25, 2010, at 2:56 PM, Aaron Patterson wrote:

> People keep saying that FFI is the better way to go, but as someone who
> has to support both an FFI version and a C version, I can tell you the
> support / development problems with FFI are much more difficult.

1. I have no direct experience using FFI, so my opinions should carry
the appropriate weight. I defer to your real-world experience.

2. I'm not much of a bikeshedder.

I agree with Eleanor. Let's fix the performance deficiencies in the
runtimes and write more code in ruby.

cr
Charles Nutter (headius)
on 2010-01-26 09:00
(Received via mailing list)
On Mon, Jan 25, 2010 at 7:53 PM, Aaron Patterson
<aaron@tenderlovemaking.com> wrote:
> References please.
>
> Last I checked, it was just as easy to segv from an FFI library as a C
> library.  Plus with FFI you don't get any benefits of compile time
> checks.  You can't, for example, check for #define constants.

Code you don't write can't cause a segfault. FFI allows you to write
less C, and from my experience the more C code you write the more
likely you are to blow something up. FFI certainly doesn't protect you
from other possible segfaults, like calling into libraries incorrectly
or defining bad struct sizes or mismanaging memory, but it is at least
less C code to write and maintain.

I will grant there's a lot of up-front cost required (currently) that
may make it no easier than maintaining all that C code.

> With FFI you must:
>
> 1. Duplicate header files (see below for more problems)
> 2. Understand struct layouts and the sizeof() for each member
> 3. Do runtime checking of library features
> 4. Worry about weak ref maps when using void pointers (see the id2ref
>   problem in nokogiri)
> 5. Pay a runtime conversion price from ruby data types to FFI types
> 6. Educate users on LD_LIBRARY_PATH
> 7. Worry about 32bit and 64bit issues (like Tony mentioned)
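
Items 2 and 7 in that list can be seen from plain Ruby without any FFI
library at all. This is only a minimal sketch: the two-field struct and
the ASSUMED_32BIT / ACTUAL names are made up for illustration, and it
ignores alignment padding, which real struct layouts also have to get
right. It uses Array#pack to ask how wide a native C long and a pointer
are on the running platform.

```ruby
# Sizes of C integer types as seen from Ruby, via Array#pack.
# 'l'  is always a 32-bit integer,
# 'l!' is the platform's native C long,
# 'j'  is pointer-width (intptr_t, available since Ruby 2.3).
FIXED_INT32 = [0].pack('l').bytesize
NATIVE_LONG = [0].pack('l!').bytesize
POINTER     = [0].pack('j').bytesize

# Hypothetical struct { long id; void *data; } described by hand with
# member sizes that were only ever correct on a 32-bit build.
ASSUMED_32BIT = { id: 4, data: 4 }
ACTUAL        = { id: NATIVE_LONG, data: POINTER }

# Sum of member sizes (padding deliberately ignored in this sketch).
def struct_size(layout)
  layout.values.sum
end

mismatch = struct_size(ASSUMED_32BIT) != struct_size(ACTUAL)
puts "native long: #{NATIVE_LONG} bytes, pointer: #{POINTER} bytes"
puts "hard-coded layout is wrong on this platform: #{mismatch}"
```

A C extension gets these numbers from the compiler via sizeof(); a
hand-written FFI layout has to get them right for every platform it
runs on.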

Yeah, I will admit there's more hassle using FFI than there should be.
I don't know how to address that, but projects like ffi-inliner seem
to be a step in the right direction. FFI-inliner basically allows you
to have some embedded C code in your FFI-consuming library that it
then compiles and links in via FFI. That allows you to get the
compile-time tooling you want for wrangling nontrivial structs while
still supporting any implementation that supports FFI. You lose the
ability to run on platforms without a compiler available (though it
does some wrangling with tcc, I believe), but it may be a good happy
medium. What do you think?

I don't want to give the impression that you shouldn't use C tooling
to call a C library, or even that nobody should ever write C code. I
just believe that having everyone write C code that depends on MRI's
C API is a dead end.

> Unfortunately, none of the problems I've just listed off are
> theoretical.  I have personally run in to every one of them and can
> provide you with real world examples.  FFI is awesome for certain,
> confined, small, stable use cases.  I use FFI, and I enjoy it.  But
> saying that it's "the only logical choice" seems wrong.

I'll restate it: using mechanisms for binding C libraries that don't
depend on MRI's C API is the only logical choice. FFI certainly isn't
perfect, but it's the best option for doing that right now.

> I am curious what your experience has been, and why you haven't run in to the
> same problems?  How do other people overcome these issues?

We certainly have run into some of those issues, most notably when
trying to support "stat" calls from JRuby across all platforms. Our
only option has been to rewire the struct and call for each platform
we intend to run on. It sucks, I agree. But we support stat on all
those platforms out of a single JRuby distribution without a recompile
being necessary. That's pretty cool.

- Charlie

Charles Nutter (headius)
on 2010-01-26 09:07
(Received via mailing list)
On Mon, Jan 25, 2010 at 8:12 PM, Mike Dalessio <mike.dalessio@gmail.com>
wrote:
> 3) Maintain only a single port, FFI, and force everyone on MRI to take a
>
> We chose option 1 for Nokogiri (you're welcome, intarnets), but everyone
> who's writing a gem today has to make this decision for themselves.
>
> My point is that any of these choices contains a tradeoff, and stating that
> one in particular "hurts" people more than another is just disingenuous. I'd
> rather help people understand the tradeoffs.

Yeah, I agree all the choices have various levels of suck. Being a JVM
guy I'd love to just tell everyone to "write it in Java", since
there are practically no cross-platform challenges in that case (and
don't anyone start telling me about how bad some Swing app is at
working across platforms; you're digging in the wrong place and the
JVM has a stellar cross-platform record when it comes to plain old
libraries). But that obviously doesn't solve the larger problem of
writing extensions or binding libraries in ways that all Ruby
implementations can support.

I'm nothing if not pragmatic. I fully recognize that FFI is a real
pain in the ass to wire up for anything nontrivial, especially if you
have the issues Aaron talked about with struct layout and memory
management, and I sympathize. I'm also extremely grateful to all the
library authors who have swallowed that pill in order to support
JRuby. We're ready and willing to find ways to support extension
writers better, be it through ffi-inliner, a safe C API subset, or
simply helping to find maintainers for JVM-based (i.e. no native code)
ports of key libraries.

And to Aaron: I do apologize for being so gruff about id2ref. We'd had
it disabled on master for several months without any reports of
trouble; Nokogiri just ended up being the first lucky customer.
Hopefully you've been able to find a better way, like maintaining your
own table or using the WeakHash that Evan mocked up. If not, I stand
ready to help find another solution.
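
For the "maintaining your own table" option, a minimal sketch might
look like the following. HandleTable and its method names are
hypothetical, not Nokogiri's or JRuby's actual code. The idea is that
native code stores a plain integer handle in its void pointer, and Ruby
maps that integer back to the object through a table it owns, so
id2ref is never needed.

```ruby
# A hand-maintained table mapping integer handles to Ruby objects.
# HandleTable is a hypothetical name used only for this sketch.
class HandleTable
  def initialize
    @objects = {}
    @next_id = 0
    @lock = Mutex.new
  end

  # Store obj and return an integer that native code can keep.
  def register(obj)
    @lock.synchronize do
      @next_id += 1
      @objects[@next_id] = obj
      @next_id
    end
  end

  # Map a handle back to its object; nil if unknown or released.
  def lookup(handle)
    @lock.synchronize { @objects[handle] }
  end

  # Drop the table's reference so the object can be collected.
  def release(handle)
    @lock.synchronize { @objects.delete(handle) }
  end
end

table  = HandleTable.new
handle = table.register("node wrapper")
puts table.lookup(handle)
table.release(handle)
puts table.lookup(handle).inspect
```

Note that this table holds strong references, so every register needs a
matching release or the objects leak. A weak-reference table avoids
that bookkeeping at the cost of depending on what each implementation
offers for weak references.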

- Charlie

Tony Arcieri (Guest)
on 2010-01-26 09:24
(Received via mailing list)
On Mon, Jan 25, 2010 at 5:36 AM, Eleanor McHugh <
eleanor@games-with-brains.com> wrote:

> I implore Ruby developers to write in Pure Ruby and demand all these Ruby
> Implementors solve their "performance" problems ;p
>

The problem with this attitude is that you eschew some great, robust
libraries that are already out there that solve complex problems.
Parsing XML is a bitch.  Fortunately, there are already some great
libraries to do this.  There's the libxml2 library, which Nokogiri
uses, and Java ships with some great XML libraries too.

Will we ever see a pure Ruby library as robust and powerful as these
(all performance considerations aside)?  REXML certainly isn't there
yet.  Is it really worth writing a library in pure Ruby when robust
libraries already exist that Ruby can tap into?

Charles Nutter (headius)
on 2010-01-26 09:31
(Received via mailing list)
On Mon, Jan 25, 2010 at 9:56 PM, Aaron Patterson
<aaron@tenderlovemaking.com> wrote:
> On Tue, Jan 26, 2010 at 04:52:13AM +0900, Chuck Remes wrote:
>> For one, Rubinius does not support the entire MRI C API nor will it ever.
>> Extensions that directly access memory structures are not supported. FFI is
>> a better long-term choice for Rubinius.
>
> It doesn't need to support the entire API.  It supports enough of the C
> API to get nokogiri running, and believe me, we use a *lot* of the C
> API.  Why pay the FFI speed penalty when you can write C code that works
> cross implementation?

I'd like to understand how much of a speed penalty we actually pay
using FFI. It's worth pointing out that Rubinius has had to implement
some pretty nasty (as in tricky, difficult, and potentially a lot
slower than MRI's "raw" memory access) logic in order to support their
current subset of the MRI C API. They've chosen to try to support APIs
I would never dream of like RARRAY and other direct pointer access,
and in many cases they have to do it by copying around a lot more data
than MRI does. And that's life, sucky though it is, if you want to
support enough of the C API to run real-world extensions right now.
I'm sure Evan can describe how they handle those APIs better than I
can.

I do believe there's a subset of APIs that could be supported across
implementations without a major perf penalty if these points (and
probably others) were addressed:

* No direct access to object internals without explicitly copying in
and out yourself (i.e. you have to opt-in to the copying penalty)
* Additional APIs to make object access and manipulation easier (like
APIs for copying or doing bulk writes into array contents)
* Additional APIs for lifecycle management (hard and weak references
and functions for acquiring and releasing such references)
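
The first two points can be modelled in plain Ruby to show the shape of
such an API. This is only a sketch under stated assumptions: OpaqueArray,
get_region and set_region are hypothetical names, loosely following the
JNI style of region access, and a real version would be C functions
exported by each implementation rather than a Ruby class.

```ruby
# Ruby model of an extension API that never exposes internal storage.
# OpaqueArray, get_region and set_region are hypothetical names.
class OpaqueArray
  def initialize(values)
    @storage = values.dup   # private; callers never see this object
  end

  def length
    @storage.length
  end

  # Copy `count` elements starting at `start` out to the caller.
  # Array#[] with a start and length returns a new array, so the
  # caller opts in to the copying cost by calling this.
  def get_region(start, count)
    check_bounds(start, count)
    @storage[start, count]
  end

  # Bulk-write `values` into storage starting at `start`.
  def set_region(start, values)
    check_bounds(start, values.length)
    @storage[start, values.length] = values
    nil
  end

  private

  def check_bounds(start, count)
    if start < 0 || count < 0 || start + count > @storage.length
      raise IndexError, "region #{start}+#{count} out of bounds"
    end
  end
end

arr  = OpaqueArray.new([1, 2, 3, 4])
copy = arr.get_region(1, 2)
copy[0] = 99                  # changes the caller's copy only
arr.set_region(2, [30, 40])   # one bulk write instead of per-element calls
p arr.get_region(0, arr.length)
```

Because callers only ever hold copies, the runtime stays free to move
or repack the underlying storage, which is the property a moving GC
needs.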

I'd love to hear from the other implementers about what they think
they'd be able to support of the C API.

The example set by JNI might help us figure out the safe subset and
enhancements needed. JNI, for all its warts, does a very good job of
isolating native code from JVM internals. You can't get direct
pointers to anything, you need to manage reference lifecycles
appropriately, you need to copy data in and out yourself if the object
accessor functions don't do what you need. It's not a pretty API,
granted, but in the 15 years the JVM has been mainstream that API has
changed very little.

> Even if FFI were the cross implementation messiah it's supposed to be,
> our FFI applications will *still* not work on GAE or Android.  Rubinius
> has already proved that you can implement a *subset* of the C API and
> get complex extensions to work.  Why can't we run with that?  I think it
> would be a better long term solution.  We would get the same "cross
> implementation" behavior as FFI, but not have to pay FFI's runtime
> conversion penalties.  We also get the ability to do compile time checks
> of C library functionality (i.e. check for #defines, function existence, etc).

I'll say it again: The Rubinius folks have done an admirable job of
implementing the large subset that they do. And given the target
audience for Rubinius, they may not have any other choice. But there
are some pretty large tradeoffs required to get that subset
working...tradeoffs that in some cases might make binding to the C API
a lot slower than using something like FFI. It has also required a
herculean effort to support that subset given the (good) design
choices Evan made (like having accurate GC that moves objects around
in memory). Expecting all implementations to put in that effort is
pretty close to absurdity; consider that JRuby only recently really
started to feel "compatible" enough that we don't spend every day, all
day fixing Ruby core class bugs.

JRuby has had a continuous stream of about 3.5 bug reports per day,
every day, for over three years...and out of the 4500-some filed bugs,
we manage to keep our unresolved count around 500. That has required
fulltime effort from at least two of us (Tom Enebo and I) and
part-time help from dozens of contributors. The benefits of supporting
a C API subset just don't warrant the effort we would personally have
to put in and the sacrifices that would result. We need help. :(

- Charlie

Charles Nutter (headius)
on 2010-01-26 09:40
(Received via mailing list)
On Tue, Jan 26, 2010 at 9:17 AM, Tony Arcieri <tony@medioh.com> wrote:
> The problem with this attitude is that you eschew some great, robust
> libraries that are already out there that solve complex problems.  Parsing
> XML is a bitch.  Fortunately, there are already some great libraries to do
> this.  There's the libxml2 library, which Nokogiri uses, and Java ships with
> some great XML libraries too.
>
> Will we ever see a pure Ruby library as robust and powerful as these (all
> performance considerations aside)?  REXML certainly isn't there yet.  Is it
> really worth writing a library in pure Ruby when robust libraries already
> exist that Ruby can tap into?

<rant>
It's probably also worth pointing out that various folks in the Ruby
community have continually panned anyone having any association with
Java. Hell, at my first ever JRuby talk in San Diego, I was openly
mocked by other presenters. And I still see other Java/JVM users get
the same treatment on these lists, at conference talks, and in
the hallway track. Apparently MINASWAN doesn't apply to folks using
Java or the JVM. :(

Unfortunately, it's exactly those Java folks that could help
accelerate Ruby adoption *and* help maintain Java/JVM versions of key
native libraries like Nokogiri or RMagick. If we did more to embrace
JVM users, rather than insulting them for using a different tool,
maybe extension writers would have more help supporting JRuby (and the
same goes for other managed runtimes like .NET/CLR).

The Ruby world shouldn't be a "C hackers only" club. Native extensions
tend to make it so.
</rant>

Rant aside...I really do want to make it easier to support JRuby,
regardless of whether folks need C or Java. Tell me what needs to be
done and help me find resources to do it :)

- Charlie

Eleanor McHugh (Guest)
on 2010-01-26 16:06
(Received via mailing list)
On 26 Jan 2010, at 08:17, Tony Arcieri wrote:
> this.  There's the libxml2 library, which Nokogiri uses, and Java ships with
> some great XML libraries too.

Don't get me wrong, I enjoy low-level munging as much as the next
hacker. But given the choice between scripting libraries written in C
and having Ruby performance comparable to C I'd take the latter every
time.

> Will we ever see a pure Ruby library as robust and powerful as these (all
> performance considerations aside)?  REXML certainly isn't there yet.  Is it
> really worth writing a library in pure Ruby when robust libraries already
> exist that Ruby can tap into?

The same argument applies to anything new. Why replace something which
appears perfectly suited to a given task with a new, shiny, probably
flawed and ill-conceived alternative? Because that's how we get better
tools than the ones we currently have and are able to tackle new tasks
that our existing understanding fails to even identify.


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
----
raise ArgumentError unless @reality.respond_to? :reason