Should Nokogiri replace REXML?

This is probably more suited to the ruby-core mailing list, but as I
am not following that list regularly any longer I’ll bring it up here.

REXML is starting to look pretty dated. I find myself no longer using
REXML for anything and use Nokogiri instead. I suspect other
developers are doing the same. Aaron (and Mike) have done an
incredible job with Nokogiri. And so I think it’s not unreasonable to
suggest that it replace REXML as a standard library.

The only downside I see is that Nokogiri is not a pure Ruby library, but
depends on libxml2. But given the advantages, speed, and uptake of
Nokogiri, I would not expect that to be any sort of show-stopper.

I have always thought a good XML library was important to Ruby. I kept
libxml-ruby on life support for many years hoping someone would
eventually come along and carry on development (it was the best I
could do, not being a C coder). That did happen eventually, and we can
thank Dan and Charlie for all their hard work for making libxml-ruby
an excellent library, and of course we should thank Sean who started
the project.

But Aaron came along and upped the ante with Nokogiri.

So anyway, I’ve had this thought in the back of my mind for a while,
and just wanted to put it out there.

Wouldn’t that make it really hard for the non C-based Ruby
implementations?

On Jan 21, 4:17 pm, Jordi B. wrote:

Wouldn’t that make it really hard for the non C-based Ruby implementations?

Well, Nokogiri was ported to JRuby using FFI, which means it could
possibly work on MacRuby and MagLev too.

http://github.com/tenderlove/nokogiri/tree/java

Dunno its status, but it seems pretty doable.

On Thu, Jan 21, 2010 at 1:17 PM, Jordi B. wrote:

Wouldn’t that make it really hard for the non C-based Ruby implementations?

Yes, it would. But if someone wants to help implement the remaining
bits of the pure-Java Nokogiri, we’ll be pretty close in JRuby.

http://www.serabe.com/2009/12/31/helping-nokogiri-take-ii/

Unfortunately libxml encompasses a lot of functionality not
typically included in the many Java XML parsers (like bad HTML
scrubbing), so including all of Nokogiri would introduce a lot of
dependencies. Ideally I’d like to see a Nokogiri “lite” that just
provides the W3C APIs for DOM, SAX, and pull parsing, and allows you
to pull in “Nokogiri HTML” or some other library for doing HTML
scrubbing.

  • Charlie
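
To make the three parsing styles named above (DOM, SAX-style streaming,
and pull parsing) concrete, here is a minimal sketch that uses REXML,
since it needs no native library. The sample document, the variable
names, and the EntryListener class are made up for illustration; this
is not Nokogiri's API and not a proposal for what a "lite" library
would look like.

```ruby
require "rexml/document"
require "rexml/streamlistener"
require "rexml/parsers/pullparser"

# Hypothetical sample document, used by all three styles below.
xml = "<feed><entry id='1'>first</entry><entry id='2'>second</entry></feed>"

# DOM: build the whole tree in memory, then query it with XPath.
doc = REXML::Document.new(xml)
dom_ids = REXML::XPath.match(doc, "//entry").map { |e| e.attributes["id"] }

# SAX-style: the parser pushes events into callbacks on a listener.
class EntryListener
  include REXML::StreamListener
  attr_reader :ids

  def initialize
    @ids = []
  end

  # Called once for every opening tag; attrs is a Hash of name => value.
  def tag_start(name, attrs)
    @ids << attrs["id"] if name == "entry"
  end
end

listener = EntryListener.new
REXML::Document.parse_stream(xml, listener)
stream_ids = listener.ids

# Pull: the caller asks for the next event when it is ready for it.
pull_ids = []
parser = REXML::Parsers::PullParser.new(xml)
while parser.has_next?
  event = parser.pull
  # For a start-element event, event[0] is the tag name and
  # event[1] is the attribute Hash.
  pull_ids << event[1]["id"] if event.start_element? && event[0] == "entry"
end
```

All three approaches collect the same ids from the sample document;
they differ in who drives the parse and in how much of the document is
held in memory at once.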

On Thu, Jan 21, 2010 at 2:31 PM, Luis L. wrote:

On Jan 21, 4:17 pm, Jordi B. wrote:

Wouldn’t that make it really hard for the non C-based Ruby implementations?

Well, Nokogiri was ported to JRuby using FFI, which means it could
possibly work on MacRuby and MagLev too.

http://github.com/tenderlove/nokogiri/tree/java

Dunno its status, but it seems pretty doable.

FFI is great, but it’s only usable where you have libxml available (a
problem for the C ext as well) and where you are allowed to use it (a
big problem for JRuby users on Google App Engine, Android, or secure
environments where native libraries are forbidden).

The only perfect solution for JRuby is a pure-Java Nokogiri.

  • Charlie