Hi guys.
I'm trying to mine some info from Wikipedia's XML dump. The code I'm
using is as follows:
require 'rexml/parsers/sax2parser'
xml = REXML::Parsers:SAX2Parser.new(File.new('enwiki-20070402-pages-articles-1.xml'))
i = 0
parser.listen(:characters, %w{title text}){|text| puts text}
parser.parse
Which is generating the error:
xmlreap.rb:2: undefined method `new' for :SAX2Parser:Symbol (NoMethodError)
Does anyone have any experience with SAX2, or with Ruby stream parsing
in general? What I'm trying to do is pretty simple: grab the contents
of the title and text elements and do something with them.
Thanks,
Krister Johnson
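
A note on the error above: the message says `new` was called on the Symbol :SAX2Parser, which is what Ruby sees when the constant path is written with a single colon before SAX2Parser instead of `::`. The posted code also assigns the parser to `xml` but then calls `listen` and `parse` on `parser`. Below is a minimal sketch of a corrected script. It assumes a tiny inline sample instead of the real dump so it can be run as-is; SAMPLE and collect_title_and_text are hypothetical names introduced only for illustration, and the sample's element layout is an assumption about the dump, not a copy of it.

```ruby
require 'rexml/parsers/sax2parser'

# Hypothetical stand-in for the Wikipedia dump (layout assumed for illustration).
SAMPLE = <<~XML
  <mediawiki>
    <page>
      <title>Ruby</title>
      <revision><text>Ruby is a language.</text></revision>
    </page>
  </mediawiki>
XML

# Collects the character data found inside <title> and <text> elements.
# `source` may be a String, or an IO such as
# File.new('enwiki-20070402-pages-articles-1.xml') as in the original post.
def collect_title_and_text(source)
  # Note the double colon before SAX2Parser.
  parser = REXML::Parsers::SAX2Parser.new(source)
  found = []
  # The block is called only for character data whose enclosing element
  # is named "title" or "text".
  parser.listen(:characters, %w{title text}) { |text| found << text }
  parser.parse
  found
end

results = collect_title_and_text(SAMPLE)
puts results
```

Because the parser streams events rather than building a tree, the same approach should suit a large dump, though for real use the block would probably process each piece as it arrives instead of collecting everything in an array.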