Forum: Ruby XML parser with file names and line numbers

David Pollak (Guest)
on 2006-04-21 00:01
(Received via mailing list)
Howdy,

Is there a Ruby XML parser that includes the file name and line number
for elements?

Thanks,

David
Robert Klemme (Guest)
on 2006-04-21 11:42
(Received via mailing list)
2006/4/21, David Pollak <pollak@gmail.com>:
> Howdy,
>
> Is there a Ruby XML parser that includes the file name and line number for
> elements?

What exactly do you mean by that?  AFAIK there is no place to store
this info in DOM so...

robert
zdennis (Guest)
on 2006-04-21 14:22
(Received via mailing list)
Robert Klemme wrote:
> 2006/4/21, David Pollak <pollak@gmail.com>:
>
>>Howdy,
>>
>>Is there a Ruby XML parser that includes the file name and line number for
>>elements?
>
>
> What exactly do you mean by that?  AFAIK there is no place to store
> this info in DOM so...

This seems possible with a SAX parser: while it scans the document,
you could grab the line number at the point where each element
starts.
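
As a rough illustration of that idea (not a real SAX parser, just a naive
regex scan standing in for a start-element callback), the line of each
start tag can be computed from its offset in the source. The sample
document and the variable names are assumptions made up for this sketch,
and the scan ignores comments, CDATA and processing instructions:

```ruby
# Made-up sample document for the sketch.
xml = <<~XML
  <site>
    <page name="home"/>
    <page name="about">
    </page>
  </site>
XML

start_tags = []

# Match "<name" only; end tags ("</name") are skipped because "/" is
# not a valid first character for the captured name.
xml.scan(/<([A-Za-z_][\w.\-]*)/) do
  match  = Regexp.last_match
  name   = match[1]
  offset = match.begin(0)
  # Line number = newlines before the tag, plus one.
  line = xml[0, offset].count("\n") + 1
  start_tags << [name, line]
end

start_tags.each { |name, line| puts "#{name} starts on line #{line}" }
```

A real event-based parser would do the same bookkeeping internally and
hand the line number to the callback.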

Zach
David Pollak (Guest)
on 2006-04-21 17:22
(Received via mailing list)
It is possible with a SAX parser in Java, but the SAX parser in rexml
does not include file/line information in the "PullEvent" as far as I
can tell.

I guess rexml is the only XML parser currently under development for
ruby.
zdennis (Guest)
on 2006-04-21 17:31
(Received via mailing list)
David Pollak wrote:
> It is possible with a SAX parser in Java, but the SAX parser in rexml does
> not include file/line information in the "PullEvent" as far as I can tell.
>
> I guess rexml is the only XML parser currently under development for ruby.
>

No, ruby-libxml is in development. I know because I am an active
"talker" on the ruby-libxml mailing list.

   http://rubyforge.org/projects/libxml/

I switched from REXML to libxml because libxml is blazing fast.

Zach
David Pollak (Guest)
on 2006-04-21 17:47
(Received via mailing list)
Zach,

Yep... libxml does the trick with:
XML::Parser.default_line_numbers = true
...


element.line_number
element.doc.filename

Thanks,

David
Bob Hutchison (Guest)
on 2006-04-21 17:59
(Received via mailing list)
On Apr 21, 2006, at 11:21 AM, David Pollak wrote:

> It is possible with a SAX parser in Java, but the SAX parser in
> rexml does
> not include file/line information in the "PullEvent" as far as I
> can tell.
>
> I guess rexml is the only XML parser currently under development
> for ruby.

Well, there is xampl-pp that I wrote. It doesn't change often, that's
true, but I *use* it a lot. The most recent version is bundled with
xampl (see my signature for where). Xampl-pp is a pull parser. And it
keeps track of line and column as best it can (and yes, it can get
the column wrong, especially if UTF is involved (it is sometimes more
of a byte count than a character count), but the line count is
normally pretty good). The instance variable @input in the parser is
a bit of a funny thing, but if it is a file, then you can use the
File methods (e.g. path) and that'll work -- the trouble is that
@input isn't exposed, so... There are two ways in which the pull
parser can be used: as an object that parses and is manipulated by
calling methods on it, or, alternatively, by extending the parser
with actions. I use both techniques, sometimes at the same time --
this can be interesting. Anyway, you can either re-open Xampl_PP and
define an accessor for @input, or you can extend and it is right
there for you. I've never exposed a reader to @input because mucking
with it would be a very very bad idea.
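
A minimal sketch of those two options, using a stand-in class rather
than the real Xampl_PP (the class names, the `advance` method and the
sample file are all hypothetical, and "extend" is read here as plain
subclassing):

```ruby
require "tempfile"

# Stand-in for a parser that keeps its input in an unexposed @input
# instance variable. This is NOT the real Xampl_PP class.
class StandInPullParser
  def initialize(input)
    @input = input
  end

  # Consume one line of input, roughly as a parser would while reading.
  def advance
    @input.gets
  end
end

# Option 1: re-open the class and add a reader for @input.
class StandInPullParser
  attr_reader :input
end

# Option 2: subclass the parser; @input is directly visible there.
class LocatingParser < StandInPullParser
  def location
    "#{@input.path}:#{@input.lineno}"
  end
end

file = Tempfile.new(["sample", ".xml"])
file.write("<a>\n  <b/>\n</a>\n")
file.flush
file.rewind

parser = LocatingParser.new(file)
2.times { parser.advance }

exposed_path = parser.input.path  # via the added reader
location     = parser.location    # via the subclass

puts exposed_path
puts location
```

Either way the caller should only read from the input object; writing
to it or seeking in it would confuse the parser.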

Have you looked at the libxml wrapper? It probably provides that
information.

Cheers,
Bob

----
Bob Hutchison                  -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc.          -- <http://www.recursive.ca/>
Raconteur                      -- <http://www.raconteur.info/>
xampl for Ruby                 -- <http://rubyforge.org/projects/xampl/>
Henrik Martensson (Guest)
on 2006-04-21 23:36
(Received via mailing list)
On Fri, 2006-04-21 at 00:01, David Pollak wrote:
> Howdy,
>
> Is there a Ruby XML parser that includes the file name and line number for
> elements?

XML does not have the concept of a line. XML deals only with describing
data and structure, not formatting. Carriage returns and line feeds are
considered to be whitespace.

That is why true XML editors use separate style sheets, like CSS,
XSL-FO, or FOSI, to format XML documents.

If you have an XML document, process it in some way, for example just by
parsing it and saving it, any carriage returns and line feeds may have
been removed. The parser may even add new ones. Whitespace is guaranteed
to be preserved in CDATA sections only.

You might find it more useful to count the elements themselves. That way
the numbers won't change just because you open a file in an editor and
look at it.

Elements do not have file names either.

What is it you want to do?

/Henrik

--
http://kallokain.blogspot.com/ - Blogging from the trenches of software development
http://www.henrikmartensson.org/  - Reflections on software development
http://tocsim.rubyforge.com/ - Process simulation
http://testunitxml.rubyforge.org/  - XML test framework
http://declan.rubyforge.org/ - Declarative XML processing
David Pollak (Guest)
on 2006-04-22 00:10
(Received via mailing list)
Henrik,

I've already got a solution to the issue.  libxml-ruby has the
functionality I need.

One of my projects is SiteMap ( http://rubyforge.org/projects/sitemap )
which is a Domain Specific Language that describes web site navigation,
access control, link names, etc.  SiteMap allows the designer to embed
Ruby code (e.g., to test access control, etc.)  It would be nice to have
the file/line of the generated methods in stack traces, etc.
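
A minimal sketch of how the file/line from the parser could end up in
stack traces, assuming the generated methods are defined with
class_eval. The class name, file name, line number and embedded code
below are hypothetical and not taken from SiteMap:

```ruby
# Hypothetical container for methods generated from embedded code.
class GeneratedRules; end

# Pretend these came from the XML parser: the file and line of the
# element that held the embedded Ruby code (both values are made up).
source_file = "sitemap.xml"
source_line = 42

embedded_code = <<~RUBY
  def admin_only?(user)
    raise ArgumentError, "no user given" if user.nil?
    user == :admin
  end
RUBY

# class_eval takes an optional file name and starting line number,
# which Ruby then reports for code defined from this string.
GeneratedRules.class_eval(embedded_code, source_file, source_line)

first_frame = nil
begin
  GeneratedRules.new.admin_only?(nil)
rescue ArgumentError => e
  # The top frame points into the XML file, one line below the "def".
  first_frame = e.backtrace.first
end

puts first_frame
```

The same file and line arguments are accepted by eval and
instance_eval, so the choice depends on where the generated methods
should live.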

Thanks,

David