Forum: Ruby Nokogiri sax parser error

Thomas S. (Guest)
on 2009-02-08 17:23
(Received via mailing list)
Trying to use the Nokogiri Sax parser. I ran into a problem parsing
the following:

  <html>
    <head>
    </head>
    <body>
      <try x="<x/>" />
    </body>
  </html>

It chokes on the x="<x/>" attribute.

T.
Mike C. (Guest)
on 2009-02-08 17:27
(Received via mailing list)
That's because it's not valid xml (or html).  You'd need to escape the
< and >

<try x="&lt;x/&gt;: />
Mike C. (Guest)
on 2009-02-08 17:28
(Received via mailing list)
Oops typo...

> <try x="&lt;x/&gt;" />
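If the markup is being generated from Ruby, the escaping can be done in code rather than by hand. This is only a minimal sketch: it assumes the standard library's `CGI.escapeHTML` is acceptable for the attribute value, and `try_tag` is a hypothetical helper name used for illustration, not part of Nokogiri.

```ruby
require 'cgi'

# Hypothetical helper (not part of Nokogiri): builds the <try> element
# from the example with its attribute value escaped for use in XML.
def try_tag(value)
  %(<try x="#{CGI.escapeHTML(value)}" />)
end

puts try_tag('<x/>')
# prints: <try x="&lt;x/&gt;" />
```

`CGI.escapeHTML` also escapes `&` and quote characters, so the value stays inside its quoted attribute.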
Aaron P. (Guest)
on 2009-02-09 02:18
(Received via mailing list)
On Mon, Feb 09, 2009 at 12:20:52AM +0900, Trans wrote:
>
> It chokes on the x="<x/>" attribute.

Can you be more specific?  Are you using the XML SAX parser, or the HTML
SAX parser?  What version of libxml2 do you have?

I tried this document with the HTML SAX parser, and it seemed to handle
it just fine.
Thomas S. (Guest)
on 2009-02-09 22:53
(Received via mailing list)
On Feb 8, 7:16 pm, Aaron P. <removed_email_address@domain.invalid> wrote:

> Can you be more specific?  Are you using the XML SAX parser, or the HTML
> SAX parser?  What version of libxml2 do you have?
>
> I tried this document with the HTML SAX parser, and it seemed to handle
> it just fine.

Ah, that was it then. I was using the XML SAX parser and did not
realize that < > had to be escaped in attributes. What was happening
was that the document would basically get cut off once the parser
reached the attribute.

The HTML SAX parser handled it without a problem though.

My LibXML version is 2.6.31, btw.

Thanks,
T.