Nokogiri help

I am trying to access some particular children in a document. I find
myself having to loop (several levels deep) through children, checking
name == "xxx". I am wondering whether there is a more direct way of
getting the same result. Here's a simple sample.

I am trying to get the href of the link inside a paragraph with class
nextprev (in the browser the link is shown as "next").

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(open('http://www.reddit.com/r/programming/'))
nextlink = nil
doc.css('p.nextprev').each do |link|
  link.children.each do |child|
    if child.name == "a"
      nextlink = child.attributes["href"].value
    end
  end
end
puts nextlink


Can I not access the a element's href more directly in the doc.css line itself?
thx, rahul

On Sun, Aug 8, 2010 at 8:35 AM, R… Kumar 1.9.1 OSX wrote:

> doc.css('p.nextprev').each do |link|

doc.css('p.nextprev/a').each do |a|

On Sun, Aug 8, 2010 at 11:35 AM, R… Kumar 1.9.1 OSX wrote:

> Can i not access a.href more directly in the doc.css line itself ?
> thx, rahul

For advanced usage like capturing attribute values in a Nokogiri search,
you should use Node#xpath:

doc = Nokogiri::HTML(open('http://www.reddit.com/r/programming/'))
puts doc.xpath("//p[@class='nextprev']/a/@href").inspect
# => [#<Nokogiri::XML::Attr:0x3fa59149a0cc name="href" value="http://www.reddit.com/r/programming/?count=25&after=t3_cygpc">]

Mike D. wrote:

> For advanced usage like capturing attribute values in a Nokogiri search,
> you should use Node#xpath:

Thanks a lot for both answers.

  1. Is there any documentation with more advanced information or examples?
    The tutorial only had simple examples.

  2. Also, is Nokogiri the right tool for this job, or is Scrubyt more
    suitable?

reg
rahul

On Mon, Aug 09, 2010 at 01:02:33PM +0900, R… Kumar 1.9.1 OSX wrote:

> 1. Is there any doc that has advanced info or examples, the tutorial had
>    simple examples.

Nokogiri follows the XPath 1.0 spec. That means any XPath knowledge you
gain can be used with Nokogiri. I suggest starting here:

http://www.w3schools.com/xpath/

Once you’ve mastered that, then I would go here:

xpath cover page

> 2. Also, is Nokogiri the right tool for this job, or is Scrubyt more
>    suitable.

Nokogiri will probably work well for you, but I am not an expert with
Scrubyt. You may want to investigate.