Nokogiri 1.4.0 Released

Hi everyone! One year ago, I released the first version of nokogiri.
With the help of my extremely handsome partner in crime, Mike D.,
we present to you the latest and greatest of our fine codes.

While you’re downloading the new gem, you should check out Nokogiri’s
new website at:

If you like the cut of that website’s jib, you should know it was
designed by one “Shane Becker”. Shane was wrongly convicted of an ATM
robbery and broke out of jail. If you can find him, you can hire him.
(Actually, I know where to find him).

nokogiri version 1.4.0 has been released!

Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s
many features is the ability to search documents via XPath or CSS3
selectors.

XML is like violence - if it doesn’t solve your problems, you are not
using enough of it.


1.4.0 / 2009/10/30

  • Happy Birthday!

  • New Features

    • Node#at_xpath returns the first element of the NodeSet matching the
      XPath expression.
    • Node#at_css returns the first element of the NodeSet matching the
      CSS selector.
    • NodeSet#| for unions GH #119 (Thanks Serabe!)
    • NodeSet#inspect makes prettier output
    • Node#inspect implemented for more rubyish document inspecting
    • Added XML::DTD#external_id
    • Added XML::DTD#system_id
    • Added XML::ElementContent for DTD Element content validity
    • Better namespace declaration support in Nokogiri::XML::Builder
    • Added XML::Node#external_subset
    • Added XML::Node#create_external_subset
    • Added XML::Node#create_internal_subset
    • XML Builder can append raw strings (GH #141, patch from dudleyf)
    • XML::SAX::ParserContext added
    • XML::Document#remove_namespaces! for the namespace-impaired
  • Bugfixes

    • returns nil when HTML documents do not declare a meta encoding tag.
      GH #115
    • Uses RbConfig::CONFIG['host_os'] to adjust ENV['PATH'] GH #113
    • NodeSet#search is more efficient GH #119 (Thanks Serabe!)
    • NodeSet#xpath handles custom xpath functions
    • Fixing a SEGV when XML::Reader gets attributes for current node
    • Node#inner_html takes the same arguments as Node#to_html GH #117
    • DocumentFragment#css delegates to its child nodes GH #123
    • NodeSet#[] works with slices larger than NodeSet#length GH #131
    • Reparented nodes maintain their namespace. GH #134
    • Fixed SEGV when adding an XML::Document to NodeSet
    • XML::SyntaxError can be duplicated. GH #148
  • Deprecations

    • Hpricot compatibility layer removed


  • XPath support for document searching
  • CSS3 selector support for document searching
  • XML/HTML builder

Nokogiri parses and searches XML/HTML very quickly, and also has
correctly implemented CSS3 selector support as well as XPath support.

Here is a speed test:


The Nokogiri mailing list is available here:

The bug tracker is available here:

The IRC channel is #nokogiri on freenode.


require 'nokogiri'
require 'open-uri'

# Get a Nokogiri::HTML::Document for the page we're interested in...

doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))

# Do funky things with it using Nokogiri::XML::Node methods...

# Search for nodes by css
doc.css('h3.r a.l').each do |link|
  puts link.content
end

# Search for nodes by xpath
doc.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end

# Or mix and match.
doc.search('h3.r a.l', '//h3/a[@class="l"]').each do |link|
  puts link.content
end


  • ruby 1.8 or 1.9
  • libxml2
  • libxml2-dev
  • libxslt
  • libxslt-dev


  • sudo gem install nokogiri

Great work, again! :wink: I’m a bit sad to see the hpricot compatibility
layer removed but oh well. Changing times.

Thanks for your work!