Hi (again, sort of)
I am still on my quest to write a program that parses a large XML file.
After having tried to do it in tree mode, I had to realize that the
performance was simply abysmal. So back to the drawing board. But, and
here is the thing... I couldn't find a good straightforward tutorial on how
to write a stream parser using REXML. The official tutorial is pretty
much mute on that part and the only other example I found (or rather was
pointed to -
http://www.janvereecken.com/2007/4/11/event-driven-xml-parser-in-ruby)
was way too complex for someone like me who is still pretty much a
beginner in Ruby.
So, what I am looking for is either a brief description of how to write
an event driven parser or else a link to a good and simple tutorial.
For the former, this is what the parser should do:
Find the element "Gene-ref", allow me to access its children and then
close and repeat for the next "Gene-ref" entry. In xml code, that would
look like
I understand that I need a Listener class, like
class Listener
  # called for every opening tag
  def tag_start(name, attrs)
  end

  # called for the text between tags
  def text(text)
  end

  # called for every closing tag
  def tag_end(name)
  end
end
But I haven't really worked with classes all that much and maybe someone
could just put down the basics for the script from where I can start
experimenting? Would be very much appreciated. Let's say for each
element "Gene-ref" I want to puts the name, start and end in one line,
or something along those lines.
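
To make the goal concrete, here is a rough sketch of the kind of script
I have in mind. It is only a guess at the shape of a solution, not a
known-good recipe. The child element names (name, start, end), the
wrapping "genes" element, the sample data and the class name
GeneRefListener are all made up for illustration; my real file will
differ. It assumes REXML::StreamListener can be mixed into the class and
that REXML::Document.parse_stream accepts a string or an open File.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects the direct children of each Gene-ref.
class GeneRefListener
  include REXML::StreamListener

  attr_reader :records

  def initialize
    @records = []   # one hash per finished Gene-ref
    @current = nil  # hash for the Gene-ref being read, nil outside one
    @tag = nil      # name of the child element currently open
  end

  def tag_start(name, attrs)
    if name == 'Gene-ref'
      @current = {}
    elsif @current
      @tag = name
      @current[name] = +''
    end
  end

  def text(text)
    # only keep text that sits inside a child of Gene-ref
    @current[@tag] << text if @current && @tag
  end

  def tag_end(name)
    if name == 'Gene-ref'
      @records << @current
      # assumed child names; prints them on one tab-separated line
      puts @current.values_at('name', 'start', 'end').join("\t")
      @current = nil
    end
    @tag = nil
  end
end

# Made-up sample input, just to have something to parse.
xml = <<~XML
  <genes>
    <Gene-ref><name>abc1</name><start>100</start><end>200</end></Gene-ref>
    <Gene-ref><name>xyz2</name><start>300</start><end>450</end></Gene-ref>
  </genes>
XML

listener = GeneRefListener.new
REXML::Document.parse_stream(xml, listener)
```

For the real file I would presumably pass File.open(path) instead of the
string, so that the whole document never has to sit in memory at once.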
Cheers,
Marc