XML parser + DOM + robust + speed = ?

Hi,

Can somebody suggest a good XML parser that offers:

  • multiple encodings
  • production quality
  • DOM tree manipulation
  • good performance

Is REXML the only choice, or are there others?

Cheers
Peter

Pete wrote:

Is REXML the only choice, or are there others?

There’s also libxml and probably others:
http://www.google.com/search?q=ruby+xml

Note, however, that DOM and performance are nearly mutually exclusive,
because a DOM parser has to build the whole tree in memory. Depending on
what you do, stream parsing is almost always faster and uses less memory.
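
To make the difference concrete, here is a minimal sketch using REXML, since it is the parser named in the question. The sample document and the names TitleCollector, titles_via_stream and titles_via_dom are made up for illustration. The stream version only remembers the titles it sees, while the DOM version builds the full tree first and then queries it.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample document, invented for this example.
XML_SAMPLE = <<~XML
  <library>
    <book id="1"><title>Programming Ruby</title></book>
    <book id="2"><title>The Ruby Way</title></book>
  </library>
XML

# Stream listener: keeps only the title strings, no tree is built.
class TitleCollector
  include REXML::StreamListener

  attr_reader :titles

  def initialize
    @titles = []
    @in_title = false
  end

  # Called for every opening tag.
  def tag_start(name, _attrs)
    @in_title = true if name == 'title'
  end

  # Called for every run of character data.
  def text(data)
    @titles << data if @in_title
  end

  # Called for every closing tag.
  def tag_end(name)
    @in_title = false if name == 'title'
  end
end

# Stream approach: memory use depends on what the listener keeps.
def titles_via_stream(xml)
  collector = TitleCollector.new
  REXML::Document.parse_stream(xml, collector)
  collector.titles
end

# DOM approach: the whole document is loaded before the lookup runs.
def titles_via_dom(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//title').map(&:text)
end

puts titles_via_stream(XML_SAMPLE).inspect
puts titles_via_dom(XML_SAMPLE).inspect
```

Both methods should return the same list; how much faster the stream version is will depend on the document, so it is worth measuring on real data.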

Kind regards

robert

If you are working under Windows, some of the other XML libraries seem
to be difficult to compile. As the previous poster said, libxml should
be pretty speedy. REXML is the only pure-Ruby library, though (there are
some built on top of REXML).

Try parsing a document and see how the performance turns out. If it
really is too slow for you, try the following things:

  1. Avoid using XPath lookups if you don’t need them. They’re the
    most natural way to access XML data in my opinion, but they are a bit
    slower.
  2. Use the stream parser and build your DOM tree yourself with
    lighter-weight nodes. If you need an example, let me know; I can post one.
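
As a rough illustration of point 2, the sketch below drives REXML's stream parser and builds a small tree out of plain Struct objects. LightNode, LightTreeBuilder and parse_light are hypothetical names chosen for this example, not part of REXML, and the node layout (name, attributes, children, text) is just one possible choice.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical lightweight node: name, attribute hash, child nodes, text.
LightNode = Struct.new(:name, :attrs, :children, :text)

# Builds a tree of LightNode objects from stream parser callbacks.
class LightTreeBuilder
  include REXML::StreamListener

  attr_reader :root

  def initialize
    @stack = [] # currently open elements, innermost last
    @root = nil
  end

  # Opening tag: create a node and attach it to the current parent.
  def tag_start(name, attrs)
    node = LightNode.new(name, attrs, [], +'')
    if @stack.empty?
      @root = node
    else
      @stack.last.children << node
    end
    @stack.push(node)
  end

  # Character data: append to the innermost open element, if any.
  def text(data)
    @stack.last.text << data unless @stack.empty?
  end

  # Closing tag: the innermost element is finished.
  def tag_end(_name)
    @stack.pop
  end
end

# Parse an XML string and return the root LightNode.
def parse_light(xml)
  builder = LightTreeBuilder.new
  REXML::Document.parse_stream(xml, builder)
  builder.root
end

tree = parse_light('<order id="7"><item>pen</item><item>ink</item></order>')
puts tree.attrs['id']
puts tree.children.map(&:text).inspect
```

Children are reached with plain Ruby array access (tree.children) rather than XPath, which also covers point 1. The trade-off is that you lose REXML's query and manipulation methods and have to write any you need yourself.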

Enjoy,
.adam sanderson