Nokogiri 1.0.6 Released

aaronpowell · November 17, 2008, 6:05am

nokogiri version 1.0.6 has been released!

Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser.

Changes:

FEATURES:

Nokogiri parses and searches XML/HTML very quickly, and also has
correctly implemented CSS3 selector support as well as XPath support.

Here is a speed test:

Nokogiri also features an Hpricot compatibility layer to help ease the
change to using correct CSS and XPath.

The Nokogiri mailing list is available here:

The bug tracker is available here:

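require 'nokogiri'
require 'open-uri'

# Fetch and parse the Google search results page for "tenderlove"
doc = Nokogiri::HTML(open('tenderlove - Google Search'))

# Search for result links with a CSS3 selector
doc.css('h3.r a.l').each do |link|
  puts link.content
end

# Search for the same links with XPath
doc.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end

# Or mix and match: search accepts both CSS and XPath queries
doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
  puts link.content
end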

aaronpowell · November 17, 2008, 5:21pm

Can the Reader interface do stream parsing à la StaX? I couldn’t tell
from the docs.

thanks,
– Mark.

aaronpowell · November 18, 2008, 4:23pm

On Tue, Nov 18, 2008 at 01:16:51AM +0900, Mark T. wrote:

> Can the Reader interface do stream parsing à la StaX? I couldn’t tell
> from the docs.

Not yet. The normal document parser can already read from streams.
Stream parsing for the SAX and Reader interfaces is next on my list.