How to map XML attributes

Trans · November 26, 2007, 5:38pm

I’m working on a DSL that maps XML <=> Object, and I’m stuck on where
to store tag attributes. Do attributes belong to the tag or to the
content?

For example:

HTML Content

In mapping this to an object, let’s say:

foo.bar #=> "HTML Content"

Does it make sense to ask:

foo.bar.attributes #=> { 'xmlns'=>…, 'type'=>… }

or should the attributes be tied to the “attributation” of foo, so:

foo.attributes(:bar) #=> { 'xmlns'=>…, 'type'=>… }

The ‘type’ attribute makes me think the first makes the most sense,
but the ‘xmlns’ makes me think the latter.

Any insights?

Thanks,
T.

Trans · November 26, 2007, 7:23pm

(Sorry about top posting - Outlook…)

Personally I prefer handling the different pieces of the XML in a
consistent manner. That is, if you have the XML you provided, plus a
little:

HTML Content RSS Content

And I wanted to process everything under , since that tag contains
the data that I am interested in, I’d want to be able to pass around
the … elements.

<bar xml…
Content

Therefore, attributes would need to be associated with the tag where
they appear, not with the entire DOM.

So let’s say that I (hypothetically) want to write a recursive XML
parser that reads data from XML and uses an ActiveRecord object to
insert it into a database (this is a fairly common use case, at least
for me).

The pseudoRUBYcode for this would be:

def parse(xml)
  return if xml.nil?
  # Parse all XML elements that are my children first
  xml.child_nodes.each { |child| parse(child) }
  # Then save this element to the DB
  xml.save
end

Since I need to access the type attribute in the xml.save method (so I
can handle HTML and RSS differently, for example), I will need to
pass it along with the XML elements that make up the child nodes.

I like to think of each XML node as a crude Object, thus all data
contained within that Object should remain with that Object.
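
A rough sketch of that idea, for what it's worth. XmlNode, the store array, and the "html"/"rss" type values are all made up for illustration; a real version would use a parser's node class and an ActiveRecord model instead. The point is only that each node carries its own attributes, so the save step can branch on type without looking anywhere else.

```ruby
# Hypothetical stand-ins: XmlNode replaces a real parser node, and the
# `store` array replaces a database table.
XmlNode = Struct.new(:name, :attributes, :text, :child_nodes)

# Branch on the node's own "type" attribute (assumed values).
def save(node, store)
  case node.attributes["type"]
  when "html" then store << [:html, node.text]
  when "rss"  then store << [:rss, node.text]
  else             store << [:other, node.name]
  end
end

# Depth-first walk: children first, then the node itself.
def parse(node, store)
  return if node.nil?
  node.child_nodes.each { |child| parse(child, store) }
  save(node, store)
end

html = XmlNode.new("bar", { "type" => "html" }, "HTML Content", [])
rss  = XmlNode.new("bar", { "type" => "rss" },  "RSS Content",  [])
foo  = XmlNode.new("foo", {}, nil, [html, rss])

store = []
parse(foo, store)
# store now holds one entry per node, children before their parent
```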

Make sense?

Jamie

Trans · November 26, 2007, 9:27pm

On Nov 26, 9:37 am, Trans wrote:

foo.attributes(:bar) #=> { ‘xmlns’=>…, ‘type’=>… }

The ‘type’ attribute makes me think the first makes the most sense,
but the ‘xmlns’ makes me think the later.

I don’t share your opinion on xmlns implying the latter. All
attributes are attributes of an element; the first totally makes
sense.

How would you get the attributes of the rootmost element, e.g.:

…

Just attributes( :root )? Bleah, I say.

If I were writing an XML-like DOM, I’d use dot notation (method call)
for the child axis and [] notation for the attributes axis.
foo.bar[:type] #=> …
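
A minimal sketch of that convention, assuming a hand-rolled Node class (the class, its constructor, and the "html" attribute value are invented here, not taken from any library). Method calls walk the child axis and [] reads the attribute axis.

```ruby
# Hypothetical Node class; names and structure are illustrative only.
class Node
  attr_reader :name, :attributes, :children, :text

  def initialize(name, attributes = {}, text = nil, children = [])
    @name = name
    @attributes = attributes
    @text = text
    @children = children
  end

  # Attribute axis: node[:type]
  def [](key)
    @attributes[key.to_s]
  end

  # Child axis: node.bar returns the first child element named "bar".
  def method_missing(sym, *args, &block)
    child = @children.find { |c| c.name == sym.to_s }
    child || super
  end

  def respond_to_missing?(sym, include_private = false)
    @children.any? { |c| c.name == sym.to_s } || super
  end
end

bar = Node.new("bar", { "type" => "html" }, "HTML Content")
foo = Node.new("foo", {}, nil, [bar])

foo.bar.text    # text of the first <bar> child
foo.bar[:type]  # value of its type attribute
foo[:type]      # the root element works the same way (nil here)
```

One catch with this shape: a child element named like an existing method (name, text, children) would be shadowed by the accessor, so a real design would need an escape hatch for those.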

Trans · November 27, 2007, 1:08am

On Nov 26, 3:25 pm, Phrogz wrote:

attributes are attributes of an element; the first totally makes
foo.bar[:type] #=> …

Thanks guys,

I’ve taken your advice(s).

The reason I thought maybe to do it otherwise is b/c of a basic coding
principle: the object referenced by an instance var doesn’t know the
instance var’s name. In mapping XML elements to Ruby objects I felt
like I was violating that principle if I carried the name along with
the element’s body, and subsequently the attributes too.

I think I have a suitable compromise though. I attached the name and
attributes to an “Element” object which delegates via method_missing
to the underlying body object --which itself has no idea what the name
and attributes are.
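
Roughly, the shape described above might look like the sketch below. This is a guess at the design, not the actual code: the Element constructor, the [] accessor, and the sample values are assumptions.

```ruby
# Hypothetical Element wrapper; the class layout is an assumption.
class Element
  attr_reader :name, :attributes, :body

  def initialize(name, attributes, body)
    @name = name
    @attributes = attributes
    @body = body
  end

  # Attribute lookup, e.g. element[:type]
  def [](key)
    @attributes[key.to_s]
  end

  # Anything the wrapper does not define is forwarded to the body.
  def method_missing(sym, *args, &block)
    if @body.respond_to?(sym)
      @body.public_send(sym, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(sym, include_private = false)
    @body.respond_to?(sym, include_private) || super
  end
end

bar = Element.new("bar", { "type" => "html" }, "HTML Content")
bar.upcase   # forwarded to the String body
bar[:type]   # answered by the wrapper
bar.body     # the plain String, unaware of its name or attributes
```

Note that methods already defined on Object, such as to_s and ==, are answered by the wrapper and never reach method_missing, so they would need explicit forwarding if the body's versions are wanted.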

Make sense?

T.

Trans · November 27, 2007, 1:33am

http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx

This is such a good article.

Any attempt at or>xml mapping could be caught/fraught with issues.
Another boundary is that of the distinction of data & information.
Data is raw. Information is meaningful data.
Meaningful is contextual.
Context is a matter of perception.
Perception is not always reality.
</end.of.suck.eggs.yak>

Not that I have any real answer to this.
Though I have noticed that to some, xml is refreshingly raw.
For instance I understand that some airport baggage handling software
uses xml.
It’s data, it shifts the bag, it expires.

One possible reason to map xml may be to avoid the gnarly world of
REXML.
Which is actually pretty good in a martial sort of way.
Avoidance does not mean not using.
http://xml-simple.rubyforge.org/
This wraps around Rex in a Ruby sort of way.
Not ORMing.
Though even when using a wrapper like this, usually there will be a
0-day when one has to go under the hood.

http://www.ibm.com/developerworks/xml/library/x-matters18.html
There is another article in alphaworks, canna find now.
It described a wrapper technique using method_notfound.
There was trouble with the ‘.’ occasionally found in the element name…
(Just read your post Trans).

xmlification has been discontinued?
http://newsattic.com/d/hl/xmlification.html

Advantages & disadvantages:
XML is an adv. when debugging, being simple.
Mapping via the xml may be better than via the DTD as flexibility
could be an adv.
In that when the data structure is not totally defined, the mapping
regenerates a ‘schema’.
In the arena of impedance mis-match, 2 many mode-shifts betwixt disk &
browser can send you to the corner shop.
When moving towards an object design there is a mapping back to get to
a structure which can be delivered to an xml-database.
Going raw to BDBxml is a major performance advantage.

Design can also factor in the cost of debugging.
If this is to be a tool of general purpose, then it has a high-hit
factor.
Would the mapper be suitable if there were to be two storage regimes?
One for development (easy to debug), another for deployment
(performance).

How many frameworks are ultimately involved?
xml implies browser interaction?
xml being manipulated by ECMAscript?

These are of course, questions I’ve been rattling with recently.

Is Ruby too much fun?

MarkT