Comparing XML

ahoward · March 23, 2009, 8:09pm

this is a rough start

https://gist.github.com/83721

gistfile1.rb

# comparing xml is always a b-i-a-t-c-h in any testing environment.  here is a
# little snippet for ruby that, i think, is a good first pass at making it
# easier.  comment with your improvements please!
#
#
  require 'rubygems'
  require 'xmlsimple'

  def xml_cmp a, b
    eq_all_but_zero = Object.new.instance_eval do

care to improve?

kind regards.

a @ http://codeforpeople.com/

ahoward · March 23, 2009, 8:25pm

ara.t.howard wrote:

this is a rough start

83721’s gists · GitHub

care to improve?

Use LibXML-Ruby, Nokogiri, or REXML, read both documents, and convert
them to DOMs.

Recursively compare each node, and all its children, to the matching
node in the other document, and fault if anything's out of tolerance.
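A minimal sketch of that recursive walk, assuming REXML and assuming that "out of tolerance" means any difference in element name, attributes, or non-blank text. The helper names (`attr_hash`, `significant_children`, `same_node?`, `xml_equal?`) are hypothetical and are not taken from the gist:

```ruby
require 'rexml/document'

# Hypothetical helper: collect an element's attributes into a plain Hash,
# so attribute order in the source text does not matter.
def attr_hash(element)
  hash = {}
  element.attributes.each_attribute { |attr| hash[attr.expanded_name] = attr.value }
  hash
end

# Hypothetical helper: keep child elements and non-blank text nodes only,
# so indentation-only whitespace (and comments) are ignored.
def significant_children(element)
  element.children.select do |node|
    node.is_a?(REXML::Element) ||
      (node.is_a?(REXML::Text) && !node.value.strip.empty?)
  end
end

# Recursively compare a node, and all its children, to the matching node.
def same_node?(a, b)
  return false unless a.class == b.class
  return a.value.strip == b.value.strip if a.is_a?(REXML::Text)
  return false unless a.expanded_name == b.expanded_name
  return false unless attr_hash(a) == attr_hash(b)

  kids_a = significant_children(a)
  kids_b = significant_children(b)
  kids_a.size == kids_b.size &&
    kids_a.zip(kids_b).all? { |x, y| same_node?(x, y) }
end

def xml_equal?(xml_a, xml_b)
  same_node?(REXML::Document.new(xml_a).root, REXML::Document.new(xml_b).root)
end
```

Called as `xml_equal?(expected_string, actual_string)`, it returns true or false; a test helper could wrap it in an assertion. Child order is treated as significant here, which is a design choice rather than a requirement.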

ahoward · March 23, 2009, 8:31pm

On Mar 23, 2009, at 1:22 PM, Phlip wrote:

err - that is precisely what that code is doing?

a @ http://codeforpeople.com/

ahoward · March 23, 2009, 10:46pm

ara.t.howard wrote:

Recursively compare each node, and all its children, to the matching
node in the other document, and fault if anything's out of tolerance.

err - that is precisely what that code is doing?

I should have explored that. Isn’t the code simply printing out both
XMLs, with consistent blanks and indenting, and then comparing their
strings for pure equality?

If so, would that break over details like attributes out of order?

ahoward · March 23, 2009, 11:46pm

On Mar 23, 2009, at 3:42 PM, Phlip wrote:

I should have explored that. Isn’t the code simply printing out both
XMLs, with consistent blanks and indenting, and then comparing their
strings for pure equality?

it is comparing strings, but strings built up inside rexml using the
approach you outlined.

If so, would that break over details like attributes out of order?

ah - good catch - i’ll check on that. my alternate approach,
comparing xmlsimple data structures will not, i believe, suffer from
that, but i wanted to avoid a dependency.
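To illustrate why comparing data structures sidesteps the attribute-order problem, here is a rough sketch that converts each document into nested Hashes and Arrays and compares them with `==`. It is a stand-in built on REXML, not XmlSimple itself, and the `to_structure` and `xml_structure` helpers and the shape of the Hash are assumptions made for this example:

```ruby
require 'rexml/document'

# Hypothetical stand-in for an XmlSimple-style structure: each element
# becomes a Hash holding its name, attributes, and significant children.
def to_structure(element)
  attrs = {}
  element.attributes.each_attribute { |attr| attrs[attr.expanded_name] = attr.value }

  children = element.children.map do |node|
    case node
    when REXML::Element then to_structure(node)
    when REXML::Text    then node.value.strip
    end
  end

  { 'name'       => element.expanded_name,
    'attributes' => attrs,
    # Drop comments (nil) and whitespace-only text (empty strings).
    'children'   => children.compact.reject { |c| c == '' } }
end

def xml_structure(xml)
  to_structure(REXML::Document.new(xml).root)
end

a = xml_structure('<doc id="1" lang="en"><title>hello</title></doc>')
b = xml_structure('<doc lang="en" id="1"><title>hello</title></doc>')

# Ruby Hash equality ignores key order, so the two structures compare equal
# even though the source strings differ.
puts a == b
```

Because the attributes live in a Hash, their order in the source text never enters the comparison.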

i’ll check and report back

cheers.

a @ http://codeforpeople.com/

ahoward · March 24, 2009, 12:17am

On Mar 23, 2009, at 5:07 PM, Phlip wrote:

I caught it because I just recently solved a subset of your problem.
assert_xhtml uses Nokogiri to match a subset of HTML within a page.
The code is too weird for you to use, but I indeed had to defeat all
the issues you will encounter!

next version is up

83721’s gists · GitHub

a @ http://codeforpeople.com/

ahoward · March 24, 2009, 2:09am

ara.t.howard wrote:

On Mar 23, 2009, at 5:07 PM, Phlip wrote:

I caught it because I just recently solved a subset of your problem.
assert_xhtml uses Nokogiri to match a subset of HTML within a page.
The code is too weird for you to use, but I indeed had to defeat all
the issues you will encounter!

next version is up

83721’s gists · GitHub

2kewt. Now you are using XmlSimple.==, so it will walk the object model
for you, recursively. It takes care of the attribute order issue, and
you then only need to tell XmlSimple to normalize blanks.

What about excess spaces in attributes? And what about class='foo bar'
matching class='bar foo'? (Feel free to ignore them!..)
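For anyone who does not want to ignore those two cases, a small sketch in plain Ruby follows. It works on attribute Hashes however they were obtained; the helper names (`squish`, `tokens`, `attributes_match?`) and the default of treating only `class` as an unordered token list are assumptions for illustration:

```ruby
# Hypothetical helper: trim the ends and collapse runs of whitespace.
def squish(value)
  value.strip.gsub(/\s+/, ' ')
end

# Hypothetical helper: treat a value as an unordered set of tokens.
def tokens(value)
  value.split.uniq.sort
end

# Compare two attribute Hashes. Attributes listed in token_attrs are compared
# as token sets; all others are compared after whitespace normalization.
def attributes_match?(a, b, token_attrs = ['class'])
  return false unless a.keys.sort == b.keys.sort

  a.all? do |name, value|
    if token_attrs.include?(name)
      tokens(value) == tokens(b[name])
    else
      squish(value) == squish(b[name])
    end
  end
end
```

With this, `{'class' => 'foo bar'}` matches `{'class' => 'bar  foo'}`, while `{'title' => 'a b'}` still does not match `{'title' => 'b a'}`, since word order is only ignored for the listed attributes.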

ahoward · March 24, 2009, 2:09am

On Mar 23, 2009, at 6:26 PM, Phlip wrote:

What about excess spaces in attributes? And what about class='foo bar'
matching class='bar foo'? (Feel free to ignore them!..)

latest version handles the first and i’m ok with the latter. feeling
like this is reasonably complete now. crazy none of the ruby xml libs
offer a good doc==other.

cheers.

a @ http://codeforpeople.com/

ahoward · March 24, 2009, 3:30am

ara.t.howard wrote:

latest version handles the first and i’m ok with the latter. feeling
like this is reasonably complete now. crazy none of the ruby xml libs
offer a good doc==other.

That is supposed to be XSLT’s space. Don’t hold your breath. I’m
beginning to suspect XSLT just might be a mission-statement without
a company for it to guide! (-:

ahoward · March 24, 2009, 12:10am

If so, would that break over details like attributes out of order?

ah - good catch - i’ll check on that. my alternate approach,
comparing xmlsimple data structures will not, i believe, suffer from
that, but i wanted to avoid a dependency.

I caught it because I just recently solved a subset of your problem.
assert_xhtml uses Nokogiri to match a subset of HTML within a page.
The code is too weird for you to use, but I indeed had to defeat all
the issues you will encounter!

ahoward · March 24, 2009, 3:53am

On Mar 23, 2009, at 8:26 PM, Phlip wrote:

That is supposed to be XSLT’s space. Don’t hold your breath. I’m
beginning to suspect XSLT just might be a mission-statement without
a company for it to guide! (-:

that is the one with ‘bacon’ isn’t it?

a @ http://codeforpeople.com/