I tried to reply to this via the ruby-talk mailing list and it didn’t
work. I’m not sure why; maybe someone can fill me in on that. Anyway,
here’s my take:
To start, the rdoc documentation can be found at
http://libxml.rubyforge.org/rdoc/index.html. Now I don’t know this for
sure, but the doctype in your example
doesn’t look like a real doctype definition, so if you can pull it out
of your xml (by hand, not programmatically) before trying to parse it,
I’d say that would be a good idea. That being said, there are two
attributes of the XML::Parser class that look like they may be of
interest: default_load_external_dtd and default_validity_checking. Try
setting both of those to false, unless you have a real dtd to validate
against and the example above was fake. Of course, since this uses
XML::Parser instead of XML::Document, I think you would need to do something like:
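# As far as I can tell from the rdoc, these are class-level defaults,
# so set them on XML::Parser itself before creating the parser.
XML::Parser.default_load_external_dtd = false
XML::Parser.default_validity_checking = false
parser = XML::Parser.file()  # pass the path to your xml file here
doc = parser.parse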
… and then go from there.
ruud grosmann wrote:
Is using libxml the right thing to do, or are there smarter alternatives?
Libxml-ruby is the most complete & accurate parser of the big three
(REXML, Libxml-ruby, and Hpricot), and its documentation can be very …
How much of the original C Libxml documentation have you been able to read?