Nokogiri extract text?

There is a simple file, /home/pt/test.html, with the following content:

hallo,world

I want to extract the text "hallo,world" from /home/pt/test.html with
Nokogiri. How do I write that?

require 'rubygems'
require 'nokogiri'
html = '/home/pt/test.html'
doc = Nokogiri::HTML(html)

Would you mind finishing it?

On Wed, Jun 23, 2010 at 12:23 AM, Pen T. wrote:

require 'rubygems'
require 'nokogiri'
html = '/home/pt/test.html'
doc = Nokogiri::HTML(html)

Would you mind finishing it?

At http://wiki.github.com/tenderlove/nokogiri/ you can read how to
find the nodes you need. I think you'll need to use XPath.

Bye

If you just want to extract some specific text within a specific tag you
should go with what Luis posted.

If you want to extract the whole plain text from a specific area in your
document, not knowing which tags may occur, you can try this:
http://www.nils-haldenwang.de/frameworks-and-tools/nokogiri/how-to-extract-plain-text-from-html-with-nokogiri

I do it like this:

puts doc.search('p').map { |e| e.text }
