nokogiri version 1.2.3 has been released!
Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser.
Changes:
1.2.3 / 2009-03-22
Bugfixes
- Fixing bug where a node is passed in to Node#new
- Namespace should be assigned on DocumentFragment creation. LH #66
- Nokogiri::XML::NodeSet#dup works GH #10
- Nokogiri::HTML returns an empty Document when given a blank string GH #11
- Adding a child will remove duplicate namespace declarations LH #67
- Builder methods take a hash as a second argument
FEATURES:
- XPath support for document searching
- CSS3 selector support for document searching
- XML/HTML builder
- Drop in replacement for Hpricot (though not bug for bug)
Nokogiri parses and searches XML/HTML very quickly, and also has
correctly implemented CSS3 selector support as well as XPath support.
Here is a speed test:
Nokogiri also features an Hpricot compatibility layer to help ease the
change to using correct CSS and XPath.
SUPPORT:
The Nokogiri mailing list is available here:
The bug tracker is available here:
SYNOPSIS:
  require 'nokogiri'
  require 'open-uri'

  # Get a Nokogiri::HTML::Document for the page we're interested in...
  doc = Nokogiri::HTML(open('tenderlove - Google Search'))

  # Do funky things with it using Nokogiri::XML::Node methods...

  # Search for nodes by css
  doc.css('h3.r a.l').each do |link|
    puts link.content
  end

  # Search for nodes by xpath
  doc.xpath('//h3/a[@class="l"]').each do |link|
    puts link.content
  end

  # Or mix and match.
  doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
    puts link.content
  end
REQUIREMENTS:
- ruby 1.8 or 1.9
- libxml2
- libxml2-dev
- libxslt
- libxslt-dev
INSTALL:
When I do “gem install nokogiri” I get:
Building native extensions. This could take a while…
ERROR: Error installing nokogiri:
ERROR: Failed to build gem native extension.
/usr/bin/ruby extconf.rb install nokogiri
checking for #include <iconv.h>
… yes
checking for #include <libxml/parser.h>
… no
libxml2 is missing. try ‘port install libxml2’ or ‘yum install libxml2’
But when I try “yum install libxml2” I get:
Package libxml2-2.7.3-1.fc10.i386 already installed and latest version
How can I fix this?
Thanks,
Phil.
–
Philip R.
GPO Box 3411
Sydney NSW 2001
Australia
On Mon, Mar 23, 2009 at 9:36 AM, Philip R. wrote:
> … no
> libxml2 is missing. try ‘port install libxml2’ or ‘yum install libxml2’
> but when I try and do “yum install libxml2” I get:
yum install libxml2-devel ?
(yay redhat)
Dick,
Dick D. wrote:
> > checking for #include <libxml/parser.h>
> > … no
> > libxml2 is missing. try ‘port install libxml2’ or ‘yum install libxml2’
> > but when I try and do “yum install libxml2” I get:
> yum install libxml2-devel ?
Yep, I had to install libxslt-devel as well, but it looks good now. Thanks!
Phil.
> (yay redhat)
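For anyone hitting the same error: on Red Hat style systems the libxml2 package ships the shared library, while headers such as libxml/parser.h, which extconf.rb checks for, come from libxml2-devel. A rough way to look for the headers from Ruby is sketched below. `locate_header` and the directory list are hypothetical helpers for this example, not part of Nokogiri or its extconf.rb, and the paths may well differ on your system.

```ruby
# Directories to search; an assumed list of common locations.
INCLUDE_DIRS = %w[
  /usr/include
  /usr/include/libxml2
  /usr/local/include
  /opt/local/include
].freeze

# Returns the full path of the first matching header, or nil if none is found.
def locate_header(header, dirs = INCLUDE_DIRS)
  dirs.map { |dir| File.join(dir, header) }.find { |path| File.file?(path) }
end

%w[libxml/parser.h libxslt/xslt.h iconv.h].each do |header|
  path = locate_header(header)
  puts "#{header}: #{path || 'not found (install the -devel/-dev package)'}"
end
```

If a header is reported as not found, installing the matching development package (libxml2-devel and libxslt-devel in this thread) is the likely fix.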
On Tue, Mar 24, 2009 at 04:29:47AM +0900, ara.t.howard wrote:
> On Mar 22, 2009, at 9:51 PM, Aaron P. wrote:
> > - Builder methods take a hash as a second argument
> can builder generate nodes like yet?
It’s possible, just not easy.
I’m working on it.
On Mar 22, 2009, at 9:51 PM, Aaron P. wrote:
> - Builder methods take a hash as a second argument
can builder generate nodes like yet?
a @ http://codeforpeople.com/
On Mar 23, 2009, at 3:42 PM, Aaron P. wrote:
> It’s possible, just not easy.
> I’m working on it.
have a look at this
http://codeforpeople.rubyforge.org/svn/tagz/trunk/
it completely avoids the method_missing issues, allows mixing in, and
even supports conditional escaping for factoring out xml/html partials.
i’m using nokogiri for reading now, but tagz for writing
cheers.