Forum: Ruby Design suggestion for translations/mappings from xml

Brian Lonsdorf (drboolean)
on 2008-11-27 07:04
Hi, I've got an XML file I'm parsing and creating objects from.

The XML is neither named nor structured like my objects, so I need to
define a bunch of mappings/translations.

I'm looking for any design patterns or suggestions people might have to
elegantly solve this.

xml example:

<HotelDescriptiveContent CurrencyCode="USD" TimeZone="GMT;-06"
BrandCode="BV" HotelCode="517" HotelName="Americas Best Value Inn and
Suites - Downtown" Overwrite="true" UnitOfMeasureCode="1">
<HotelInfo WhenBuilt="2000" HotelStatus="Bookable" HotelStatusCode="1">
<CategoryCodes>
  <LocationCategory Code="3"></LocationCategory>
  <SegmentCategory Code="5"></SegmentCategory>
  <HotelCategory Code="20"></HotelCategory>
</CategoryCodes>
<Description>All the comforts of home conveniently...</Description>


object example:

class Hotel
  attr_accessor :property_name, :brand
end


There's a bunch of other classes and corresponding awkwardly named tags.

It'd be killer if I could do a to_xml as well as a from_xml, but that's
not crucial...

Thanks for your help!
Jesús Gabriel y Galán (Guest)
on 2008-11-27 10:21
(Received via mailing list)
On Thu, Nov 27, 2008 at 7:00 AM, Brian Lonsdorf <brian@trnsfr.com>
wrote:
> <HotelDescriptiveContent CurrencyCode="USD" TimeZone="GMT;-06"
>
> crucial...
Here's a rough idea I thought of when reading your email:

module XMLMap
  module ClassMethods
    attr_reader :from_xml, :to_xml
    def map property, xpath_to_value
      (@from_xml ||= {})[xpath_to_value] = property
      (@to_xml ||= {})[property] = xpath_to_value
    end
  end

  def to_xml
    "implement to_xml creating all tags and attrs defined in the to_xml hash: #{self.class.to_xml.inspect}"
  end

  def from_xml xml
    # Alternatively you could make this a class method that returns an
    # initialized instance.
    "iterate through all xpaths from from_xml (#{self.class.from_xml.inspect}) initializing the corresponding ivars"
  end

  def self.included child
    puts "inherited"
    child.extend ClassMethods
  end
end

I haven't implemented the XML stuff, but it should be straightforward
if you limit yourself to simple xpaths. Then you can use it like:

class Test
  include XMLMap
  map :this, "/root/this/tag"
  map :that, "/root/that/@attr"
end

test = Test.new
test.from_xml %q{<root><this><tag>this_value</tag></this><that attr="that_value"/></root>}
test.to_xml

You could also implement a version of the map method that receives a
hash with all the mappings or whatever.
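
A hash-based variant could look roughly like the sketch below. The name
map_all and the class Example are made up for illustration, and the module
is repeated only so the snippet runs on its own:

```ruby
# Sketch only: map_all and Example are hypothetical names.
module XMLMap
  module ClassMethods
    attr_reader :from_xml, :to_xml

    # Register one property <-> xpath pair in both lookup directions.
    def map(property, xpath_to_value)
      (@from_xml ||= {})[xpath_to_value] = property
      (@to_xml ||= {})[property] = xpath_to_value
    end

    # Register several property => xpath pairs in one call.
    def map_all(mappings)
      mappings.each { |property, xpath| map(property, xpath) }
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Example
  include XMLMap
  map_all this: "/root/this/tag",
          that: "/root/that/@attr"
end

p Example.to_xml    # property => xpath
p Example.from_xml  # xpath => property
```

Keeping map as the single place that records a pair means the hash
version is just a loop over it.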

Hope this helps,

Jesus.
Tom Morris (Guest)
on 2008-11-27 18:20
(Received via mailing list)
On 2008-11-27, Brian Lonsdorf <brian@trnsfr.com> wrote:
> Hi, i've got an xml file i'm parsing and creating objects from.
>
> The xml is not named, nor formatted like my objects, so i need to define
> a bunch of mappings/translations.
>
> I'm looking for any design patterns or suggestions people might have to
> elegantly solve this.
>

Well, if you are going to be parsing lots of different types of XML
document, here are some possible ways you could approach the problem:

- if a schema document is available, pull it in and do something clever
  with it. If it's a RelaxNG schema, if you look for oneOrMore and
  zeroOrMore elements that contain element references, you can use that
  to map to lists.

- you could also write an XSLT transformation to turn the document into
  an intermediary XML format for which there are already intuitive
  interfaces in Ruby (or whatever other language you end up using) -
  RSS/Atom, for instance (maybe RDF eventually - I'm soon to release an
  alpha version of a Ruby RDF gem) or even something like XML-RPC or
  SOAP, both of which are designed to map to native data-types (and
  objects in the case of SOAP).

- you could see if you could produce some kind of clever metrics from
  the document by looking for the most-used elements in particular
  places in the hierarchy.

- sometimes a separate parser/serializer class is a necessity - I took
  this approach in my project just because it was the approach that the
  guys who had done the Python and Java libraries had done, and it
  seemed pretty sensible to do it that way.

One day, people will get that XML is a *markup* language - and markup is
something you add to *documents*. If you just want to shunt data around,
it probably shouldn't be the first choice when compared with something
like JSON or YAML.
Brian Lonsdorf (drboolean)
on 2008-11-27 18:38
> class Test
> include XMLMap
> map :this, "/root/this/tag"
> map :that, "/root/that/@attr"
> end
>
> test = Test.new.from_xml
> "<root><this><tag>this_value</tag></this><that attr="that_value"/>"
> test.to_xml
>
> You could also implement a version of the map method that receives a
> hash with all the mappings or whatever.
>
> Hope this helps,
>
> Jesus.

Thanks so much for your response!

The map method was the perfect solution to my problem.  Declarative,
organized, and maintainable.  Previously, I kept trying to define them
all in one spot - I just needed to think about it differently.

One enhancement that made me smile was a call to attr_accessor in the
map method :)
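
For anyone reading along, that enhancement might look something like
this. It is a minimal sketch, not the code actually used: the mappings
method and the choice of xpaths (taken from the attributes in the sample
XML at the top of the thread) are assumptions.

```ruby
# Sketch only: which XML attribute feeds which property is assumed.
module XMLMap
  module ClassMethods
    def mappings
      @mappings ||= {}
    end

    # Record the mapping and define a reader/writer pair in one step.
    def map(property, xpath)
      mappings[property] = xpath
      attr_accessor property
    end
  end

  def self.included(child)
    child.extend ClassMethods
  end
end

class Hotel
  include XMLMap
  map :property_name, "/HotelDescriptiveContent/@HotelName"
  map :brand,         "/HotelDescriptiveContent/@BrandCode"
end

hotel = Hotel.new
hotel.brand = "BV"
puts hotel.brand
```

With this, the separate attr_accessor line in the class body is no
longer needed, so each property is declared exactly once.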
Jesús Gabriel y Galán (Guest)
on 2008-12-01 19:09
(Received via mailing list)
On Thu, Nov 27, 2008 at 6:33 PM, Brian Lonsdorf <brian@trnsfr.com>
wrote:
> Thanks so much for your response!

Glad it helped.

> One enhancement that made me smile was a call to attr_accessor in the
> map method :)

Now *you* got me thinking a step further. I finally had some spare time
and came up with a couple of enhancements, plus an implementation of the
XML stuff (parsing with nokogiri, generating the XML by hand):

require 'nokogiri'

module XMLMap
  module ClassMethods
    def set_xml_data property, xpath
      re = %r{\A(/(\w+))+(/@(\w+))?\Z}
      raise "Invalid xpath: #{xpath} for attribute #{property}. " \
            "Only simple tags and attrs supported (#{re})" unless xpath =~ re
      (@mappings ||= {})[property] = xpath
    end

    def mapped_reader property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_reader property.to_sym}
    end

    def mapped_writer property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_writer property.to_sym}
    end

    def mapped_accessor property, xpath
      set_xml_data property, xpath
      self.class_eval {attr_accessor property.to_sym}
    end

    def from_xml xml
      o = self.new
      doc = Nokogiri.XML(xml)
      @mappings.each do |attr, xpath|
        item = doc.xpath xpath
        unless item.empty?
          o.instance_variable_set "@#{attr}".to_sym, item.inner_text
        end
      end
      o
    end

    def mappings
      @mappings
    end
  end

  def to_xml
    xml = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc)}
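    self.class.mappings.each do |property, xpath|
      value = instance_variable_get "@#{property}"
      next unless value   # skip properties that were never set
      # Split "/a/b/@attr" into the tag path and the optional attribute name.
      tag_path, attr_name = xpath.split("/@")
      tags = tag_path.split("/")
      # Walk (and auto-create) the nested hashes down to the parent node.
      h = tags[1..-2].inject(xml) {|node, tag| node[tag]}
      if attr_name
        h[tags[-1]][:attributes][attr_name] = value
      else
        h[tags[-1]][:value] = value
      end
    end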
    output = ""
    xml.each do |node, data|
      generate_node node, data, output
    end
    output
  end

  def generate_node node,data,output
    value = data.delete(:value)
    attrs = data.delete(:attributes)
    output << "<#{node}"
    if attrs
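      # Use a distinct block variable: in Ruby 1.8 a block parameter named
      # `value` would overwrite the outer `value` read above.
      attrs.each do |attr, attr_value|
        output << " #{attr}=\"#{attr_value}\""
      end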
    end
    if value
      output << ">#{value}</#{node}>"
    elsif !data.empty?
      output << ">"
      data.each do |child_tag, child_data|
        generate_node child_tag, child_data, output
      end
      output << "</#{node}>"
    else
      output << "/>"
    end
  end

  def self.included child
    child.extend ClassMethods
  end
end

The XML stuff ended up a little messy, I'd appreciate any help or
comment there. Usage:

class A
  include XMLMap
  mapped_reader :first, "/root/first"
  mapped_reader :second, "/root/second"
  mapped_accessor :attr, "/root/first/@attr"
end

a = A.from_xml %q{<root><first
attr="the_attr_value">the_first_value</first><second>the_second_value</second></root>}
p a
a.attr = "changed value"
puts a.to_xml

I haven't tested very thoroughly, so there might be a bug or two in
there. Probably you will want to reimplement the to_xml method to use
a proper XML generator.
Any comment about the code or approach appreciated, for sure there's
room for improvement :-)
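
One concrete reason for wanting a proper generator: the hand-rolled
to_xml interpolates values as they are, so a value containing &, < or a
quote produces malformed output. Short of switching generators, a small
escaping helper along these lines would cover that. This is a sketch;
escape_xml and XML_ESCAPES are hypothetical names, not part of the
module above.

```ruby
# Hypothetical helper: replaces the five characters XML treats specially.
XML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
  '"' => "&quot;", "'" => "&apos;"
}.freeze

def escape_xml(value)
  # String#gsub with a hash looks up each match and substitutes its value.
  value.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

puts %(<first attr="#{escape_xml('a "quoted" & <tagged> value')}"/>)
```

In generate_node it would wrap both the attribute values and the text
value before they are appended to the output.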

Regards,

Jesus.
Brian Lonsdorf (drboolean)
on 2008-12-02 02:19
> I haven't tested very thoroughly, so there might be a bug or two in
> there. Probably you will want to reimplement the to_xml method to use
> a proper XML generator.
> Any comment about the code or approach appreciated, for sure there's
> room for improvement :-)
>
> Regards,
>
> Jesus.

Wow, that's pretty kickass.

# I love this.
xml = Hash.new {|h,k| h[k] = Hash.new(&h.default_proc)}
tags[1..-2].inject(xml) {|h, tag| h[tag]}
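
For anyone who hasn't seen the idiom, here is a stand-alone sketch of
what those two lines do. The xpath and the value are made up for the
demo: the default block creates nested hashes on first access, and
inject walks down one level per tag.

```ruby
# Each missing key gets a new hash that reuses the same default block,
# so the nesting can go arbitrarily deep.
tree = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }

tags = "/root/first/second".split("/")   # leading "/" yields an empty first element
parent = tags[1..-2].inject(tree) { |h, tag| h[tag] }   # walks to tree["root"]["first"]
parent[tags[-1]][:value] = "hello"

p tree
```

After this runs, tree["root"]["first"]["second"][:value] is "hello",
with no intermediate hash created by hand.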

I actually did some of the same stuff last week, but a lot less generic
and sophisticated.  I hadn't realized the potential of having a cool
little library for this stuff.

You should release it - I would have used this for sure if I'd found it
on github.