Hi (again, sort of)
I am still on my quest to write a program that parses a large XML file.
After having tried to do it in tree mode, I had to realize that the
performance was simply abysmal. So back to the drawing board. But, and
here is the thing…I could not find a good, straightforward tutorial on how
to write a stream parser using REXML. The official tutorial is pretty
much silent on that part and the only other example I found (or rather was
pointed to -
was way too complex for someone like me who is still pretty much a
beginner in Ruby.
So, what I am looking for is either a brief description of how to write
an event driven parser or else a link to a good and simple tutorial.
For the former, this is what the parser should do:
Find the element “Gene-ref”, allow me to access its children and then
close and repeat for the next “Gene-ref” entry. In XML code, that would
I understand that I need a Listener class, like
def tag_start(name, attrs)
But I haven’t really worked with classes all that much and maybe someone
could just put down the basics for the script from where I can start
experimenting? Would be very much appreciated. Let’s say for each
element “Gene-ref” I want to puts the name, start and end in one line,
or something along those lines.
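
For reference, here is a minimal sketch of what such a listener could look like. It assumes that each “Gene-ref” element has child elements literally called name, start and end; those names are placeholders taken from the description above, not from the real file, so they would need to be swapped for the actual tag names. The class name GeneRefListener and the sample XML are made up for illustration as well. The REXML parts (the REXML::StreamListener module, the tag_start, text and tag_end callbacks, and REXML::Document.parse_stream) are the library's stream-parsing interface.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Collects the text of selected children for every <Gene-ref> element.
class GeneRefListener
  include REXML::StreamListener

  # Hypothetical child element names; replace with the ones in your file.
  FIELDS = %w[name start end].freeze

  attr_reader :records

  def initialize
    @records = []   # one hash per completed <Gene-ref>
    @current = nil  # hash being filled while inside a <Gene-ref>
    @field = nil    # name of the wanted child element currently open
  end

  # Called for every opening tag.
  def tag_start(name, _attrs)
    if name == 'Gene-ref'
      @current = {}
    elsif @current && FIELDS.include?(name)
      @field = name
      @current[name] = +''
    end
  end

  # Called for character data; only kept while inside a wanted child.
  def text(data)
    @current[@field] << data if @current && @field
  end

  # Called for every closing tag.
  def tag_end(name)
    if name == 'Gene-ref' && @current
      @records << @current
      puts FIELDS.map { |f| @current[f] }.join("\t")
      @current = nil
    elsif name == @field
      @field = nil
    end
  end
end

# Made-up sample input, only to show the listener running.
sample = <<~XML
  <genes>
    <Gene-ref><name>abc1</name><start>100</start><end>250</end></Gene-ref>
    <Gene-ref><name>xyz2</name><start>300</start><end>480</end></Gene-ref>
  </genes>
XML

listener = GeneRefListener.new
REXML::Document.parse_stream(sample, listener)
```

For a large file, an open File object can be passed in place of the string, for example File.open(path) { |f| REXML::Document.parse_stream(f, listener) }, so the document is never built as a tree in memory. The listener only remembers the “Gene-ref” it is currently inside; the records array is there for experimenting and could be dropped if printing is all that is needed.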