Hey everyone! Have you finished your holiday shopping yet? I know I
haven’t.
Fortunately for you guys, Mike and I like programming a lot more than
shopping. I mean, don’t get me wrong. I love shopping for myself, I
just find shopping for other people to be, well, difficult.
Anyway, let’s get down to business:
nokogiri version 1.4.1 has been released!
- http://nokogiri.org
- http://github.com/tenderlove/nokogiri/wikis
- http://github.com/tenderlove/nokogiri/tree/master
- http://groups.google.com/group/nokogiri-talk
- http://github.com/tenderlove/nokogiri/issues
Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s
many features is the ability to search documents via XPath or CSS3
selectors.
XML is like violence - if it doesn’t solve your problems, you are not
using enough of it.
Changes:
1.4.1 / 2009/12/10
- New Features
  - Added Nokogiri::LIBXML_ICONV_ENABLED
  - Alias Node#[] to Node#attr
  - XML::Node#next_element added
  - XML::Node#> added for searching a node’s immediate children
  - XML::NodeSet#reverse added
  - Added fragment support to Node#add_child, Node#add_next_sibling,
    Node#add_previous_sibling, and Node#replace.
  - XML::Node#previous_element implemented
  - Rubinius support
  - The CSS selector engine now supports :has()
  - XML::NodeSet#filter() was added
  - XML::Node#next= and #previous= are aliases for add_next_sibling and
    add_previous_sibling. GH #183
- Bugfixes
  - XML fragments with namespaces do not raise an exception
    (regression in 1.4.0)
  - Node#matches? works in nodes contained by a DocumentFragment. GH #158
  - Document should not define add_namespace() method. GH #169
  - XPath queries returning namespace declarations do not segfault.
  - Node#replace works with nodes from different documents. GH #162
  - Adding XML::Document#collect_namespaces
  - Fixed bugs in the SOAP4R adapter
  - Fixed bug in XML::Node#next_element for certain edge cases
  - Fixed load path issue with JRuby under Windows. GH #160.
  - XSLT#apply_to will honor the “output method”. Thanks richardlehane!
  - Fragments containing leading text nodes with newlines now parse
    properly. GH #178.
FEATURES:
- XPath support for document searching
- CSS3 selector support for document searching
- XML/HTML builder
Nokogiri parses and searches XML/HTML very quickly, and also has
correctly implemented CSS3 selector support as well as XPath support.
Here is a speed test:
SUPPORT:
The Nokogiri mailing list is available here:
- http://groups.google.com/group/nokogiri-talk
The bug tracker is available here:
- http://github.com/tenderlove/nokogiri/issues
The IRC channel is #nokogiri on freenode.
SYNOPSIS:
  require 'nokogiri'
  require 'open-uri'

  # Get a Nokogiri::HTML::Document for the page we’re interested in...
  doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))

  # Do funky things with it using Nokogiri::XML::Node methods...

  # Search for nodes by css
  doc.css('h3.r a.l').each do |link|
    puts link.content
  end

  # Search for nodes by xpath
  doc.xpath('//h3/a[@class="l"]').each do |link|
    puts link.content
  end

  # Or mix and match.
  doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
    puts link.content
  end
REQUIREMENTS:
- ruby 1.8 or 1.9
- libxml2
- libxml2-dev
- libxslt
- libxslt-dev
INSTALL:
- sudo gem install nokogiri