Forum: Ruby Parsing XML into a complete domain object

Brian Cowdery (bcowdery)
on 2006-05-19 23:06
Recently at work we've decided to attempt to build a basic XML driven
automation framework to work with Watir (a web development testing
library for ruby).

I can't figure out how to loop through each level of the REXML document
to extract the data needed to build the complete object.

It seems that if I try to use any iterators on a root.elements[] object,
it converts it to text, so I can't nest another iterator or loop to
access the innards.

My only recourse has been to resort to a ton of nested while loops, which
is ugly when compared to most other Ruby loops.

eg.
i = 1
while root.elements['cases'].elements[i] != nil

 n = 1
 while root.elements['cases'].elements[i].elements[n] != nil

   #more loops here. continue down the chain until i can build the
   #object from the inside out.

 n = n + 1
 end

i = i + 1
end



my xml looks like this
<script>
<project-name></project-name>
<start-url></start-url>

 <test-case id="1">
  <test-step id="1">
   <test>
    <interaction>double click</interaction>
    <element>
     <name>button 1</type>
     <type>button</type>
    <element>
   </test>
   <check>
    <element>
     <name>Page title</type>
     <type>text</type>
    <element>
   </check>
  </test-step>

  <test-step id="2">
    ...
  </test-step>
 </test-case>

 <test-case id="2">
  ...
 </test-case>
</script>

Somehow I have to get THAT modeled into an object:
the script object contains test-cases, test-cases contain test-steps, etc.

Any thoughts? (Sorry for the long post... it's kind of hard to explain
without showing EVERYTHING.)
Matthew Desmarais (Guest)
on 2006-05-20 01:40
(Received via mailing list)
Hi Brian,

Have you given any thought to using YAML instead of  XML?

If you're comfortable with a data format that's a little less
self-descriptive than XML, you may find that YAML's ease of use could
work for you.  It's pretty nice to load up your YAML and have all of
your Ruby objects pieced together for you.  I can send you a small
example if you'd like.
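
A minimal sketch of what that could look like, assuming a hypothetical YAML layout for the script (the key names below are made up for illustration and are not from the original post). Loading it this way gives plain hashes and arrays; mapping them onto your own classes would be a further step.

```ruby
require 'yaml'

# Hypothetical YAML rendering of the script; key names are illustrative only.
yaml_text = <<YAML
project_name: My Project
start_url: "http://localhost:3000/"
test_cases:
  - id: 1
    steps:
      - id: 1
        test:
          interaction: double click
          element: { name: button 1, type: button }
        check:
          element: { name: Page title, type: text }
YAML

# safe_load returns nested Hash/Array/String/Integer values.
script = YAML.safe_load(yaml_text)

first_step = script['test_cases'][0]['steps'][0]
puts first_step['test']['interaction']
puts first_step['check']['element']['name']
```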

Regards,
Matthew
James Britt (Guest)
on 2006-05-20 05:16
(Received via mailing list)
Matthew Desmarais wrote:
> Brian Cowdery wrote:
>
>> Recently at work we've decided to attempt to build a basic XML driven
>> automation framework to work with Watir (a web development testing
>> library for ruby).
>>
>> I cant figure out how to loop through each level of the REXML document
>> to extract the data needed to build the complete object.

Have you looked at REXML's pull parser?
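
For reference, a rough sketch of the pull-parser style, assuming a cut-down, well-formed version of the script XML (the sample data and the steps_by_case hash are illustrative, not from the original post). The parser hands back one event at a time, so the nesting is tracked by hand instead of with nested loops.

```ruby
require 'rexml/parsers/pullparser'

# Hypothetical, cut-down version of the script XML from the first post.
xml = <<XML
<script>
  <test-case id="1">
    <test-step id="1"/>
    <test-step id="2"/>
  </test-case>
  <test-case id="2"/>
</script>
XML

# Map each test-case id to the ids of the test-steps inside it.
steps_by_case = {}
current_case = nil

parser = REXML::Parsers::PullParser.new(xml)
while parser.has_next?
  event = parser.pull
  # Only start tags matter here; text and end tags are skipped.
  next unless event.start_element?

  name  = event[0]   # tag name
  attrs = event[1]   # attribute Hash
  case name
  when 'test-case'
    current_case = attrs['id']
    steps_by_case[current_case] = []
  when 'test-step'
    steps_by_case[current_case] << attrs['id']
  end
end

p steps_by_case
```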

>> ...

> ...
> Have you given any thought to using YAML instead of  XML?
>

Why not just use Ruby to describe the data?
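
As a rough illustration only, assuming hypothetical Struct-based classes (Script, Case, Step, Action, Check and Element are names invented here, not part of Watir or any library), the same data could be written directly in Ruby:

```ruby
# Hypothetical domain classes; the names are not from any library.
Element = Struct.new(:name, :type)
Action  = Struct.new(:interaction, :element)
Check   = Struct.new(:element)
Step    = Struct.new(:id, :test, :check)
Case    = Struct.new(:id, :steps)
Script  = Struct.new(:project_name, :start_url, :test_cases)

# The test description is ordinary Ruby, so no parsing step is needed.
script = Script.new(
  'My Project',
  'http://localhost:3000/',
  [
    Case.new(1, [
      Step.new(
        1,
        Action.new('double click', Element.new('button 1', 'button')),
        Check.new(Element.new('Page title', 'text'))
      )
    ])
  ]
)

script.test_cases.each do |tc|
  tc.steps.each do |step|
    puts "case #{tc.id}, step #{step.id}: " \
         "#{step.test.interaction} on #{step.test.element.name}"
  end
end
```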




--
James Britt

http://www.ruby-doc.org       - Ruby Help & Documentation
http://www.artima.com/rubycs/ - The Journal By & For Rubyists
http://www.rubystuff.com      - The Ruby Store for Ruby Stuff
http://www.30secondrule.com   - Building Better Tools
Matthew Desmarais (Guest)
on 2006-05-20 06:42
(Received via mailing list)
On Sat, 2006-05-20 at 12:14 +0900, James Britt wrote:
> Matthew Desmarais wrote:
> > Have you given any thought to using YAML instead of  XML?
> >
> Why not just use Ruby to describe the data?

YAML buys you a small amount of language independence.  I've chosen YAML
before because I like how well it plays with Ruby.  I've been _able_ to
choose YAML because of how well it plays with other languages.
James Britt (Guest)
on 2006-05-20 07:21
(Received via mailing list)
Matthew Desmarais wrote:
> YAML buys you a small amount of language independence.  I've chosen YAML
> before because I like how well it plays with Ruby.  I've been _able_ to
> choose YAML because of how well it plays with other languages.
>

Perhaps, though more and more I run into YAML files with custom
object-specific serializations (e.g. the YAML files used in Ruby's ri
system); XML tends to do better on that count, with far less coupling of
data and types.

Still, if one is using WATIR, then I suspect that cross-language
configuration is not a concern.  (And if it becomes a requirement, then
the Ruby used to define the tests can be exported as XML or YAML or
whatever works best.)

Rob Burrowes (Guest)
on 2006-05-20 11:07
(Received via mailing list)
If you just want to walk the tree from any entry point, through all
the sub-levels, you can use the standard each_recursive method.

		#Recurse end to end, printing the tags
		@doc.elements.each("definitions/src") do |element|
			print "<", element.name.to_s, ">"
			element.each_recursive do |childElement|
				print "<", childElement.name.to_s, ">"
			end
		end

If you just want the next level of children, but no deeper, I'm not
sure what you call.  I did this when I played with REXML, and the
obvious each_child doesn't give you an REXML::Element. It gives a
REXML::Text element at the first iteration, then the next
REXML::Element, then another REXML::Text object, etc. Not quite what
you want. But adding this to your code will work.

module REXML
	class Element
		# Visit all child elements of this node, but don't recurse to their children
		def each_child_element(&block)
			self.elements.each {|node|
				block.call(node)
			}
		end
	end
end

It probably exists in some form in the REXML module, but I can't
find it, so I recreated it (by a little hacking of the module's
each_recursive).
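
For what it's worth, REXML::Element does have an each_element method that yields only the child elements and skips the text nodes. A small sketch, using a made-up document for illustration:

```ruby
require 'rexml/document'

# Made-up sample document: <src> has a text node and two child elements.
doc = REXML::Document.new('<definitions><src>text<a/><b><c/></b></src></definitions>')

child_names = []
# each_element visits only the immediate child elements (no Text nodes,
# no grandchildren such as <c/>).
doc.root.elements['src'].each_element do |child|
  child_names << child.name
end

p child_names
```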

You can then
		#printing the tags of the immediate children.
		require 'rexml/document'
		@doc = REXML::Document.new(File.new(format_file))
		@doc.elements.each("definitions") do |element|
			element.each_child_element do |childElement|
				print "<", childElement.name.to_s, ">"
			end
		end

or recursively walk the tree by calling each_child_element for each
returned childElement (as with the first example)

	def recurse(the_element)
			the_element.each_child_element do |childElement|
				print "<", childElement.name.to_s, ">"
				recurse(childElement)
			end
	end

	@doc.elements.each("definitions/src") do |element|
		recurse(element)
	end
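
Applied to the original question, the same nested elements.each idea might look like the sketch below. It assumes a corrected, shortened version of the posted XML and two hypothetical Struct classes (StepRecord and CaseRecord); it is one possible approach, not the only way to model it.

```ruby
require 'rexml/document'

# Hypothetical, corrected and shortened version of the XML from the first post.
xml = <<XML
<script>
  <project-name>My Project</project-name>
  <test-case id="1">
    <test-step id="1">
      <test>
        <interaction>double click</interaction>
        <element><name>button 1</name><type>button</type></element>
      </test>
    </test-step>
    <test-step id="2"/>
  </test-case>
  <test-case id="2"/>
</script>
XML

# Hypothetical domain classes, invented for this example.
StepRecord = Struct.new(:id, :interaction, :element_name)
CaseRecord = Struct.new(:id, :steps)

root = REXML::Document.new(xml).root

cases = []
root.elements.each('test-case') do |tc|
  steps = []
  tc.elements.each('test-step') do |ts|
    # elements[...] returns nil when nothing matches, hence the &. guards.
    interaction = ts.elements['test/interaction']&.text
    name        = ts.elements['test/element/name']&.text
    steps << StepRecord.new(ts.attributes['id'], interaction, name)
  end
  cases << CaseRecord.new(tc.attributes['id'], steps)
end

cases.each do |c|
  puts "case #{c.id}: #{c.steps.map(&:id).inspect}"
end
```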
Ross Bamford (Guest)
on 2006-05-20 13:36
(Received via mailing list)
On Sat, 2006-05-20 at 06:06 +0900, Brian Cowdery wrote:
>      <name>button 1</type>
>
> somehow i have to get THAT modeled into an object
> script object contains test-cases, test-cases contain test-steps etc....

Looks like an ideal DigestR[1] opportunity, if you're able to get
Libxml-ruby installed too(*):

#!/usr/local/bin/ruby
require 'xml/digestr'
require 'pp'

class Script
  attr_accessor :name, :starturl, :testcases
  def initialize; @testcases = []; end
end

class TestCase
  attr_accessor :id, :steps
  def initialize; @steps = []; end
end

class TestStep
  attr_accessor :id, :tests, :checks
  def initialize; @tests, @checks = [], []; end
end

class Check
  attr_accessor :elements
  def initialize; @elements = []; end
end

class Test < Check
  attr_accessor :interaction
end

class Element
  attr_accessor :name, :type
end

d = XML::Digester.new(true)
d.add_object_create('/script', Script)

d.add_call_method('/script/project-name', :name=)
d.add_call_param('/script/project-name')
d.add_call_method('/script/start-url', :starturl=)
d.add_call_param('/script/start-url')

d.add_object_create('/script/test-case', TestCase)
d.add_set_properties('/script/test-case')
d.add_link('/script/test-case') { |sc,tc| sc.testcases << tc }

d.add_object_create('/script/test-case/test-step', TestStep)
d.add_set_properties('/script/test-case/test-step')
d.add_link('/script/test-case/test-step') { |tc,ts| tc.steps << ts }

d.add_object_create('/script/test-case/test-step/test', Test)
d.add_link('/script/test-case/test-step/test') { |ts, t| ts.tests << t }
d.add_call_method('/script/test-case/test-step/test/interaction',
:interaction=)
d.add_call_param('/script/test-case/test-step/test/interaction')

d.add_object_create('/script/test-case/test-step/check', Check)
d.add_link('/script/test-case/test-step/test') { |ts, t| ts.checks << t
}

d.add_object_create('*/element', Element)
d.add_link('*/element') { |p, ele| p.elements << ele }

d.add_call_method('*/element/name', :name=)
d.add_call_param('*/element/name')
d.add_call_method('*/element/type', :type=)
d.add_call_param('*/element/type')

script = d.parse_file('watir.xml')

pp script
__END__

This outputs (with the data you posted, with some mismatched close tags
fixed up):

#<Script:0xb7e8d64c
 @name="My Project",
 @starturl="http://localhost:3000/",
 @testcases=
  [#<TestCase:0xb7edcaec
    @id="1",
    @steps=
     [#<TestStep:0xb7edae90
       @checks=
        [#<Test:0xb7ed9978
          @elements=[#<Element:0xb7ed7dbc @name="button 1", @type="button">],
          @interaction="double click">],
       @id="1",
       @tests=
        [#<Test:0xb7ed9978
          @elements=[#<Element:0xb7ed7dbc @name="button 1", @type="button">],
          @interaction="double click">]>,
      #<TestStep:0xb7e63ff0 @checks=[], @id="2", @tests=[]>]>,
   #<TestCase:0xb7e63a14 @id="2", @steps=[]>]>

Which I think is what you're after?

[1]: http://digestr.rubyforge.org/
(*): If you can't/won't install native extensions, DigestR's API is
intended to be mostly compatible with an older, REXML-based (IIRC)
digester at http://rubyforge.org/projects/xmldigester
Ross Bamford (Guest)
on 2006-05-20 14:57
(Received via mailing list)
Oops, small bugfix:

On Sat, 2006-05-20 at 20:35 +0900, I wrote:
> d.add_call_method('/script/test-case/test-step/test/interaction', :interaction=)
> d.add_call_param('/script/test-case/test-step/test/interaction')
>
> d.add_object_create('/script/test-case/test-step/check', Check)
- d.add_link('/script/test-case/test-step/test') { |ts, t| ts.checks << t }
+ d.add_link('/script/test-case/test-step/check') { |ts, t| ts.checks << t }
James Gray (bbazzarrakk)
on 2006-05-21 02:28
(Received via mailing list)
On May 20, 2006, at 12:19 AM, James Britt wrote:

>> before because I like how well it plays with Ruby.  I've been
>> _able_ to
>> choose YAML because of how well it plays with other languages.
>
> Perhaps, though more and more I run into YAML files with custom
> object-specific serializations (e.g. the YAML files used in Ruby's
> ri system); XML tends to do better on that count, with far less
> coupling of data and types.

Right, and if it's something I need to hand edit, I find my brain can
remember XML syntax more easily than YAML's myriad choices.

James Edward Gray II
Henrik Martensson (Guest)
on 2006-05-24 08:14
(Received via mailing list)
On Fri, 2006-05-19 at 23:06, Brian Cowdery wrote:
> Recently at work we've decided to attempt to build a basic XML driven
> automation framework to work with Watir (a web development testing
> library for ruby).
>
> I cant figure out how to loop through each level of the REXML document
> to extract the data needed to build the complete object.

You might find a treewalker useful:

module XmlUtil

  class TreeWalker

    def initialize(strategy)
      @strategy = strategy
    end

    def walk(node)
      @strategy.execute_before(node) if @strategy.respond_to?(:execute_before)
      if node.instance_of?(REXML::Document)
        walk(node.root)
      elsif node.instance_of?(REXML::Element) then
        node.children.each { |child|
          walk(child)
        }
      end
      @strategy.execute_after(node) if @strategy.respond_to?(:execute_after)
    end
  end
end

The treewalker will walk the XML document, calling the execute_before
and execute_after methods of a strategy object.

You also need a strategy object. The strategy object looks something
like this:

class MyStrategy

  def execute_before(node)
    # Process start tags
    case node
    when REXML::Document
      # Do nothing with Document nodes.
      # Necessary because Document inherits Element
    when REXML::Element
      # Do something with the element
    end
  end

  def execute_after(node)
    # Process end tags
  end
end
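
To show how the two pieces fit together, here is a small self-contained sketch. It repeats a compact copy of the walker so it runs on its own, and pairs it with a hypothetical strategy (PathCollector, a name invented here) that records the path of every element it meets; the sample document is made up as well.

```ruby
require 'rexml/document'

# Compact copy of the tree walker, so the example runs on its own.
class TreeWalker
  def initialize(strategy)
    @strategy = strategy
  end

  def walk(node)
    @strategy.execute_before(node) if @strategy.respond_to?(:execute_before)
    if node.instance_of?(REXML::Document)
      walk(node.root)
    elsif node.instance_of?(REXML::Element)
      node.children.each { |child| walk(child) }
    end
    @strategy.execute_after(node) if @strategy.respond_to?(:execute_after)
  end
end

# Hypothetical strategy: records the slash-separated path of every element.
class PathCollector
  attr_reader :paths

  def initialize
    @stack = []
    @paths = []
  end

  # Called on the way down: push the element name and record the path.
  def execute_before(node)
    return unless node.instance_of?(REXML::Element)
    @stack.push(node.name)
    @paths << @stack.join('/')
  end

  # Called on the way back up: leave the element again.
  def execute_after(node)
    @stack.pop if node.instance_of?(REXML::Element)
  end
end

doc = REXML::Document.new(
  '<script><test-case><test-step/></test-case><test-case/></script>'
)
collector = PathCollector.new
TreeWalker.new(collector).walk(doc)
puts collector.paths
```

The same before/after hooks could push and pop domain objects instead of names, which is one way to build the nested script object while walking.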

If the treewalker does not suit your needs, a node iterator (Java Xerces
style) might do the trick. Let me know if you need one. I've got working
code, but the implementation could be more elegant. (One of my first
Ruby classes.)


/Henrik

--
http://kallokain.blogspot.com/ - Blogging from the trenches of software
development
http://www.henrikmartensson.org/  - Reflections on software development
http://tocsim.rubyforge.com/ - Process simulation
http://testunitxml.rubyforge.org/  - XML test framework
http://declan.rubyforge.org/ - Declarative XML processing