Nokogiri 1.0.5 Released


#1

nokogiri version 1.0.5 has been released!

Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser with CSS3 and
XPath search support.

Changes:

1.0.5

  • Bugfixes

    • Added mailing list and ticket tracking information to the README.txt
    • Sets ENV['PATH'] on Windows if it doesn’t exist
    • Caching results of NodeSet#[] on Document

== FEATURES:

  • XPath support for document searching
  • CSS3 selector support for document searching
  • XML/HTML builder
  • Drop in replacement for Hpricot (though not bug for bug)

Nokogiri parses and searches XML/HTML very quickly, and also has
correctly implemented CSS3 selector support as well as XPath support.

Here is a speed test:

Nokogiri also features an Hpricot compatibility layer to help ease the
change to using correct CSS and XPath.

== SUPPORT:

The Nokogiri mailing list is available here:

The bug tracker is available here:

== SYNOPSIS:

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))

Search for nodes by css

doc.css('h3.r a.l').each do |link|
  puts link.content
end

Search for nodes by xpath

doc.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end

Or mix and match.

doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
  puts link.content
end


#2

hi.

Is nested search supported?

For example,

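xml =
"<root><body><subbody>subbody1</subbody></body><body><subbody>subbody2</subbody></body></root>"
doc = Nokogiri::XML(xml)
doc.xpath("//body").each do |body|
  puts body.at("subbody").content
end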

I expected to see

subbody1
subbody2

but it seems it always uses the root element.



#3

On Thu, Nov 13, 2008 at 05:44:50PM +0900, Dingding Ye wrote:

I expected to see

subbody1
subbody2

but it seems it always uses the root element.

Yes. Try using a relative XPath search, like this:

xml =
"<root><body><subbody>subbody1</subbody></body><body><subbody>subbody2</subbody></body></root>"

Nokogiri::XML(xml).xpath("//body").each do |body|
  puts body.at(".//subbody").content
end


#4

On Nov 13, 3:44 am, Dingding Ye removed_email_address@domain.invalid wrote:

doc.xpath("//body").each do |body|
  puts body.at("subbody").content
end

I expected to see

subbody1
subbody2

but it seems it always uses the root element.

Let me try to clear up the confusion.

at() is from Hpricot, and it searches from the root node. It is
typically used as doc.at('element').

search() is also from Hpricot, and it searches from the current node.
This is perhaps the behavior you were expecting. An alias is '/', as
in doc/"html/body".

xpath() is like find() from libxml and you can nest those calls too.
Note that the Hpricot xpath(), which returns the path to the node, is
available in Nokogiri as path().

– Mark.


#5

Great work!