Nokogiri bug?

Hi Folks - Having some trouble w/ Nokogiri & its handling of xpaths and
nodesets, wondering if anyone here knows about possible existing issues.
(Apologies if this is a bit off-topic, but I am working in Ruby…)

So I’m trying to access a form’s action URL so that I can construct my
own query string to GET via mechanize. I’ve tested just about every
possible xpath to get to the form’s action property, and Nokogiri is
behaving as if it doesn’t exist.

What I’ve discovered is that, once I hit a particular div tag, which
happens to be the parent node of the form, Nokogiri doesn’t see any of
the div’s children.

The page has HTML to this effect:

Note the form doesn’t have an ID or class by which to directly reference
it. Trying to access it like so:

sublnk = page.xpath("//div[@id='app-#{code}']")

If I call children() on that, it’s empty. But the parent of this node
works fine, and Nokogiri constructs its children properly, including
containing the form as a child of “app-#{code}”:

sublnk = page.xpath("//div[@id='page-content']/div[@id='summary-section']").children

Additionally, accessing the form directly via its index won’t work
either. It’s the second form on the page, so I’ve tried:

sublnk = page.xpath("//form[2]/@action")

…and still get nothing. It’s as if this node just doesn’t exist to
Nokogiri, even though it clearly does when it’s a child. So am I doing
something wrong? How come //div[@id='app-#{code}'] doesn’t have any
children? Is there a bug in Nokogiri which prevents it from accessing
the form node? Why won’t any xpath to the form work?!? Please help me
stop banging my head on my desk.

Thanks!

-Alex

Hi,

On Wed, Aug 18, 2010 at 3:58 PM, Alex S. wrote:

> Hi Folks - Having some trouble w/ Nokogiri & its handling of xpaths and
> nodesets, wondering if anyone here knows about possible existing issues.
> (Apologies if this is a bit off-topic, but I am working in Ruby…)

A better place to discuss Nokogiri-specific questions like this is
probably the nokogiri-talk mailing list, which you can find more about
here:

http://groups.google.com/group/nokogiri-talk

> So I’m trying to access a form’s action URL so that I can construct my
> own query string to GET via mechanize. I’ve tested just about every
> possible xpath to get to the form’s action property, and Nokogiri is
> behaving as if it doesn’t exist.

When you email the nokogiri-talk list, please make sure you include the
entire HTML document you’re dealing with (e.g., as a gist or pastie), so
that someone can first reproduce your issue and then diagnose it or give
you some advice.

Cheers,
-mike

Thanks, I thought there might be a more appropriate forum.

But fortunately I won’t have to post there - figured it out.

To find & verify the xpaths, I’d been using Firebug. Then, looking in
the code, I noticed that Firebug’s model didn’t match the actual HTML,
which in turn didn’t match what Nokogiri saw - they each showed
something different (bug elsewhere?). Once I figured out what HTML
Nokogiri was parsing, it was easy.

Thanks,
Alex
