Hpricot and path of an element

Hi all,

I use Hpricot to load a page. Then I try to find the path for a
"font" element in the page. Here is the example from the tutorial
(http://code.whytheluckystiff.net/hpricot/wiki/HpricotBasics):

doc.at("#header").xpath
#=> "//div[@id='header']"

Here is my code:
puts doc.at("#font").xpath

When I run the code, Ruby complains about an undefined method xpath. I
wonder if I have misunderstood the tutorial.

Thanks,

Li

On Sunday 10 August 2008 13:36:42 Li Chen wrote:

I use hpricot to load a page. Then I try to find the path for an
element "font" in the page.

So, you probably want:

(doc / 'font')

doc.at("#header").xpath
#=> "//div[@id='header']"

Right, that’s searching for a tag that looks like this:

<div id="header">

here is my code:
puts doc.at("#font").xpath

And that’s searching for a tag whose id is "font", not for a font tag.

If you’re following that example, you probably want:

puts doc.at('font').xpath
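
One possible cause of the NoMethodError is that at returns nil when no element matches, so .xpath ends up being called on nil. Here is a minimal sketch of the id-versus-tag difference. Note the assumptions: it uses REXML (which ships with standard Ruby installs) instead of Hpricot so that it runs without the gem, the sample markup and the path_of helper are made up for illustration, and the XPath strings are only analogues of the "#font" and "font" selectors.

```ruby
require 'rexml/document'

# Hypothetical sample page; the real page's markup is not shown in this thread.
SAMPLE = '<html><body><div id="header">Title</div><font>hello</font></body></html>'

# Returns the path of the first node matching `expr`, or nil when nothing matches.
def path_of(doc, expr)
  node = REXML::XPath.first(doc, expr)
  node && node.xpath  # guard: calling .xpath on nil would raise NoMethodError
end

doc = REXML::Document.new(SAMPLE)
p path_of(doc, "//*[@id='font']")  # lookup by id, the analogue of "#font": nothing matches
p path_of(doc, "//font")           # lookup by tag name, the analogue of "font"
```

The same guard applies with Hpricot: check that at returned something before calling a method on the result.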

Now, first question: Why do you need the xpath? Usually, the idea is to
try to find that element, and then do something with it. So, for example:

To return all text:

(doc / 'font').text

To loop over each font element:

(doc / 'font').each { |tag|
  puts tag.inner_text
}

Second question: Why is there a font tag on this page? If you had any
hand in creating the page, shame on you – go learn some CSS.

In fact, go learn some CSS anyway. Hpricot supports both CSS selectors
and XPath, and it’s usually much easier to use the selectors. Years
later, I still remember, roughly, how selectors work – but only a few
months later, I’ve almost completely forgotten XPath.

There are things XPath can do that selectors can’t. But until you
encounter them, XPath is overkill.
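
To make the correspondence between the two notations concrete, here is a rough sketch that translates a few simple selector shapes into XPath. The helper name simple_css_to_xpath is hypothetical, it is not part of Hpricot, and it only handles a single tag, #id, tag#id, .class or tag.class.

```ruby
# Translate one simple CSS selector into an equivalent XPath expression.
# Illustration only: combinators, attribute selectors and pseudo-classes
# are not supported.
def simple_css_to_xpath(selector)
  m = selector.match(/\A([a-zA-Z][\w-]*)?(?:([#.])([\w-]+))?\z/)
  raise ArgumentError, "unsupported selector: #{selector}" if m.nil? || selector.empty?

  tag = m[1] || '*'  # no tag name given means "any element"
  case m[2]
  when '#' then "//#{tag}[@id='#{m[3]}']"
  when '.' then "//#{tag}[contains(concat(' ', normalize-space(@class), ' '), ' #{m[3]} ')]"
  else          "//#{tag}"
  end
end

puts simple_css_to_xpath('font')        # //font
puts simple_css_to_xpath('#font')       # //*[@id='font']
puts simple_css_to_xpath('div#header')  # //div[@id='header']
```

Even the class case shows why the selector form is easier to remember than the XPath it stands for.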

David M. wrote:

Now, first question: Why do you need the xpath? Usually, the idea is to
try to
find that element, and then do something with it. So, for example:

To return all text:

(doc / 'font').text

To loop over each font element:

(doc / 'font').each { |tag|
  puts tag.inner_text
}

I need to extract the text within this tag. I followed your code and found:

  1. (doc/'font').text and (doc/'font').html return the same results
  2. when I run (doc / 'font').each { |tag| puts tag.inner_text }
    Ruby complains:
    undefined method `inner_text' for #<Hpricot::Elem:0x2e9f9c4>
    (NoMethodError)

So I changed it to tag.inner_html and it works. I checked the Hpricot
documentation and found that the method #inner_text is listed there, but
I cannot figure out why Ruby complains about it.
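
One guess is that the installed Hpricot is older than the documentation and simply does not define inner_text yet; that is an assumption, not something confirmed in this thread. A minimal sketch of a fallback that works either way is below. The node_text helper and the two Struct stand-ins are hypothetical, used only so the example runs without the gem.

```ruby
# Return a node's text, preferring inner_text when the node supports it.
def node_text(node)
  if node.respond_to?(:inner_text)
    node.inner_text
  else
    # Fallback: crude tag stripping on inner_html; entities are not decoded.
    node.inner_html.gsub(/<[^>]*>/, '')
  end
end

# Hypothetical stand-ins for elements from an older and a newer library version.
OldElem = Struct.new(:inner_html)
NewElem = Struct.new(:inner_html, :inner_text)

puts node_text(OldElem.new('ATG<b>CCC</b>'))            # falls back to inner_html
puts node_text(NewElem.new('ATG<b>CCC</b>', 'ATGCCC'))  # uses inner_text
```

With real elements you would call node_text(tag) inside the each block instead of tag.inner_text.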

Second question: Why is there a font tag on this page? If you had any
hand in
creating the page, shame on you – go learn some CSS.

I am a newbie to HTML and website development. If you want to know why
there is a font tag in the page, please check this out:
http://www.ensembl.org/Homo_sapiens/exonview?db=core;transcript=ENST00000356766

What I am trying to do is extract some info I am interested in from this
page. I have no idea why they put this tag and that tag there. I don’t
think it is my priority to know so many whys now. I am more concerned
about getting the job done.

Anyway, thank you very much for the tips.

Li
