Design suggestion for translations/mappings from xml


#1

Hi, I’ve got an XML file that I’m parsing and creating objects from.

The XML is neither named nor structured like my objects, so I need to define
a bunch of mappings/translations.

I’m looking for any design patterns or suggestions people might have to
solve this elegantly.

xml example:








All the comforts of home conveniently…

object example:

class Hotel
  attr_accessor :property_name, :brand
end

There’s a bunch of other classes and corresponding awkwardly named tags.

It’d be killer if I could do a to_xml as well as a from_xml, but that’s
not crucial.

Thanks for your help!


#2

On Thu, Nov 27, 2008 at 7:00 AM, Brian L. removed_email_address@domain.invalid
wrote:

> <HotelDescriptiveContent CurrencyCode="USD" TimeZone="GMT;-06"
>
> crucial…
Here’s a rough idea I thought of when reading your email:

module XMLMap
  module ClassMethods
    attr_reader :from_xml, :to_xml

    def map property, xpath_to_value
      (@from_xml ||= {})[xpath_to_value] = property
      (@to_xml ||= {})[property] = xpath_to_value
    end
  end

  def to_xml
    "implement to_xml creating all tags and attrs defined in the to_xml hash: #{self.class.to_xml.inspect}"
  end

  def from_xml xml
    "iterate through all xpaths from from_xml (#{self.class.from_xml.inspect}) initializing the corresponding ivars"
    # Alternatively you could make this a class method that
    # returns an initialized instance
  end

  def self.included child
    puts "inherited"
    child.extend ClassMethods
  end
end

I haven’t implemented the XML stuff, but it should be straightforward
if you limit yourself to simple xpaths. Then you can use it like:

class Test
  include XMLMap
  map :this, "/root/this/tag"
  map :that, "/root/that/@attr"
end

test = Test.new
test.from_xml "<root><this><tag>this_value</tag></this></root>"
test.to_xml

You could also implement a version of the map method that receives a
hash with all the mappings or whatever.
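
As a minimal sketch of that hash-based variant (not the code from this post): the `xml_mappings` accessor and the `/root/other` path below are invented for illustration, and the XML handling is still left out.

```ruby
# Sketch only: a variant of the XMLMap idea whose `map` also takes a hash.
module XMLMap
  module ClassMethods
    # property => xpath, as registered through `map` (hypothetical accessor name)
    def xml_mappings
      @xml_mappings ||= {}
    end

    # Accepts `map :name, "/xpath"` or `map name: "/xpath", other: "/xpath"`.
    def map(property_or_hash, xpath = nil)
      pairs = property_or_hash.is_a?(Hash) ? property_or_hash : { property_or_hash => xpath }
      pairs.each { |property, path| xml_mappings[property.to_sym] = path }
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Test
  include XMLMap
  map this: "/root/this/tag", that: "/root/that/@attr"
  map :other, "/root/other"
end

p Test.xml_mappings
```

Keeping a single property => xpath table is enough here, since the reverse lookup can be derived from it when needed.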

Hope this helps,

Jesus.


#3

On 2008-11-27, Brian L. removed_email_address@domain.invalid wrote:

> Hi, i’ve got an xml file i’m parsing and creating objects from.
>
> The xml is not named, nor formatted like my objects, so i need to define
> a bunch of mappings/translations.
>
> I’m looking for any design patterns or suggestions people might have to
> elegantly solve this.

Well, if you are going to be parsing lots of different types of XML
documents, here are some possible ways you could approach the problem:

  • if a schema document is available, pull it in and do something clever
    with it. If it’s a RELAX NG schema, you can look for oneOrMore and
    zeroOrMore elements that contain element references and map those
    to lists.

  • you could also write an XSLT transformation to turn the document into
    an intermediary XML format for which there are already intuitive
    interfaces in Ruby (or whatever other language you end up using) -
    RSS/Atom, for instance (maybe RDF eventually - I’m soon to release an
    alpha version of a Ruby RDF gem) or even something like XML-RPC or
    SOAP, both of which are designed to map to native data-types (and
    objects in the case of SOAP).

  • you could see if you could produce some kind of clever metrics from
    the document by looking for the most-used elements in particular
    places in the hierarchy.

  • sometimes a separate parser/serializer class is a necessity. I took
    this approach in my project simply because the authors of the Python
    and Java libraries had taken it, and it seemed pretty sensible to do
    it that way.
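
To make the last option concrete, here is a rough sketch of a parser class kept separate from the model, using REXML. The `HotelXML` class and the `HotelName` and `BrandName` attribute names are assumptions made up for this example; only the `Hotel` class and the `HotelDescriptiveContent` tag come from the thread. Only the parsing half is shown.

```ruby
require "rexml/document"

class Hotel
  attr_accessor :property_name, :brand
end

# Hypothetical parser kept apart from the model class.
class HotelXML
  # attribute on the model => XPath in the document (paths are assumptions)
  PATHS = {
    property_name: "/HotelDescriptiveContent/@HotelName",
    brand:         "/HotelDescriptiveContent/@BrandName"
  }.freeze

  def self.parse(xml)
    doc = REXML::Document.new(xml)
    hotel = Hotel.new
    PATHS.each do |attribute, xpath|
      node = REXML::XPath.first(doc, xpath)
      next unless node
      # Attribute nodes expose #value, element nodes expose #text.
      value = node.is_a?(REXML::Attribute) ? node.value : node.text
      hotel.public_send("#{attribute}=", value)
    end
    hotel
  end
end

hotel = HotelXML.parse('<HotelDescriptiveContent HotelName="Example Inn" BrandName="Example"/>')
puts hotel.property_name
```

The model stays free of XML knowledge, so the awkward tag names only ever appear in the `PATHS` table.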

One day, people will get that XML is a markup language - and markup is
something you add to documents. If you just want to shunt data around,
it probably shouldn’t be the first choice when compared with something
like JSON or YAML.


#4

Thanks so much for your response!

The map method was the perfect solution to my problem. Declarative,
organized, and maintainable. Previously, I kept trying to define them
all in one spot - I just needed to think about it differently.

One enhancement that made me smile was a call to attr_accessor in the
map method :)
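
A minimal sketch of that enhancement, assuming a `map` method along the lines of the one suggested earlier in the thread; the `mappings` accessor and the `/hotel/...` paths are placeholders, not the real document structure.

```ruby
module XMLMap
  module ClassMethods
    # property => xpath (hypothetical accessor name)
    def mappings
      @mappings ||= {}
    end

    def map(property, xpath)
      mappings[property.to_sym] = xpath
      attr_accessor property   # define the reader and writer in the same step
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Hotel
  include XMLMap
  map :property_name, "/hotel/name"   # placeholder xpaths
  map :brand, "/hotel/brand"
end

hotel = Hotel.new
hotel.brand = "Example"
puts hotel.brand
```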


#5

On Thu, Nov 27, 2008 at 6:33 PM, Brian L. removed_email_address@domain.invalid
wrote:

> Thanks so much for your response!

Glad it helped.

> One enhancement that made me smile was a call to attr_accessor in the
> map method :)

Now you’ve got me thinking a step further. I finally had some spare
time and came up with a couple of enhancements, plus an implementation
of the XML stuff (parsing with nokogiri, generating the XML by hand):

require 'nokogiri'

module XMLMap
  module ClassMethods
    # Record the xpath for a property; only /tag/tag[/@attr] paths are accepted.
    def set_xml_data property, xpath
      re = %r{\A(/(\w+))+(/@(\w+))?\Z}
      raise "Invalid xpath: #{xpath} for attribute #{property}. Only simple tags and attrs supported (#{re})" unless xpath =~ re
      (@mappings ||= {})[property] = xpath
    end

    def mapped_reader property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_reader property.to_sym}
    end

    def mapped_writer property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_writer property.to_sym}
    end

    def mapped_accessor property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_accessor property.to_sym}
    end

    # Build a new instance, setting an ivar for every xpath found in the document.
    def from_xml xml
      o = self.new
      doc = Nokogiri.XML(xml)
      @mappings.each do |attr, xpath|
        item = doc.xpath xpath
        unless item.empty?
          o.instance_variable_set "@#{attr}".to_sym, item.inner_text
        end
      end
      o
    end

    def mappings
      @mappings
    end
  end

  def to_xml
    # Auto-vivifying hash: missing keys become nested hashes.
    xml = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc)}
    self.class.mappings.each do |attr, xpath|
      value = instance_variable_get "@#{attr}"
      next unless value
      tag, attr_name = xpath.split("/@")
      tags = tag.split("/")
      # Descend to the parent of the last tag, creating levels as needed.
      parent = tags[1..-2].inject(xml) {|hash, name| hash[name]}
      if attr_name
        parent[tags[-1]][:attributes][attr_name] = value
      else
        parent[tags[-1]][:value] = value
      end
    end
    output = ""
    xml.each do |node, data|
      generate_node node, data, output
    end
    output
  end

  def generate_node node, data, output
    value = data.delete(:value)
    attrs = data.delete(:attributes)
    output << "<#{node}"
    if attrs
      # Distinct block variable names so the node's own value is not clobbered.
      attrs.each do |attr_name, attr_value|
        output << " #{attr_name}=\"#{attr_value}\""
      end
    end
    if value
      output << ">#{value}</#{node}>"
    elsif !data.empty?
      output << ">"
      data.each do |child_tag, child_data|
        generate_node child_tag, child_data, output
      end
      output << "</#{node}>"
    else
      output << "/>"
    end
  end

  def self.included child
    child.extend ClassMethods
  end
end

The XML stuff ended up a little messy, I’d appreciate any help or
comment there. Usage:

class A
  include XMLMap
  mapped_reader :first, "/root/first"
  mapped_reader :second, "/root/second"
  mapped_accessor :attr, "/root/first/@attr"
end

a = A.from_xml %q{<root><first>the_first_value</first><second>the_second_value</second></root>}
p a
a.attr = "changed value"
puts a.to_xml

I haven’t tested this very thoroughly, so there might be a bug or two in
there. You will probably want to reimplement the to_xml method with
a proper XML generator.
Any comments about the code or the approach are appreciated; there’s
certainly room for improvement :)
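
One possible shape for a generator-based to_xml, sketched with REXML from the standard distribution rather than by concatenating strings. The helper name `xml_from_paths` and its xpath => value argument are assumptions for illustration; it accepts the same simple `/tag/tag[/@attr]` paths as the code above.

```ruby
require "rexml/document"

# Builds an XML string from { xpath => value } pairs, where each xpath is a
# simple "/a/b" or "/a/b/@attr" path. Hypothetical helper, not part of XMLMap.
def xml_from_paths(values)
  doc = REXML::Document.new
  values.each do |xpath, value|
    path, attr_name = xpath.split("/@")
    names = path.split("/").reject(&:empty?)
    # Walk down from the document, creating missing elements on the way.
    node = names.inject(doc) do |parent, name|
      parent.elements[name] || parent.add_element(name)
    end
    if attr_name
      node.add_attribute(attr_name, value.to_s)
    else
      node.text = value.to_s
    end
  end
  doc.to_s
end

xml = xml_from_paths(
  "/root/first"       => "the_first_value",
  "/root/second"      => "the_second_value",
  "/root/first/@attr" => "changed value"
)
puts xml
```

Inside the module, to_xml could collect the mapped instance variables into such a hash and hand it to the helper, leaving tag nesting and closing to the library.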

Regards,

Jesus.


#6

Wow, that’s pretty kickass.

I love this.

xml = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc)}
tags[1..-2].inject(xml) {|h, tag| h[tag]}
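
For anyone reading along, a small standalone illustration of what those two lines do together; the `/root/first/leaf` path and the variable names are made up for the example.

```ruby
# Every missing key gets a new hash that shares the same default block,
# so nesting works to any depth.
tree = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

tags = "/root/first/leaf".split("/")   # leading "/" yields an empty first entry

# tags[1..-2] drops the leading "" and the final tag, leaving the ancestors.
# inject descends through them, creating each level on first access.
parent = tags[1..-2].inject(tree) { |hash, tag| hash[tag] }
parent[tags[-1]][:value] = "hello"

p tree
```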

I actually did some of the same stuff last week, but a lot less generic
and sophisticated. I hadn’t realized the potential of having a cool
little library for this stuff.

You should release it - I would have used this for sure if I’d found it
on github.