On Sep 5, 2011, at 5:42 PM, Bob Aiello wrote:
Essentially, I have hundreds of XML files like this one that contain
many many name value pairs. I am concerned that the value is defined
differently (actually I have seen this) in one or more of the XML. I
want to take an xml file and then parse the name/value pairs into a
list. Then I want to check that list against all of the other XML in the
system that have the same name/value pairs.
For your consideration, below is how I would write a script to handle
this. It creates a Hash mapping each property name to its value; when the same
name is seen again with a different value, it keeps track of all values seen as
an array. The “SourceFile” module associates with each value string the
file that it was defined in, so that you can later see where the values
were defined. Using a module for this is both tricky
collider.rb
require 'nokogiri'

# Perhaps use Marshal to load this from a file if it exists,
# and save out the values seen so far at the end of the run.
$all_values = {}

# Find your file(s) to analyze however you want here
files = %w[ my.xml ]

# Mixed into individual value strings so each remembers its origin
module SourceFile
  attr_accessor :source_file
end

files.each do |file|
  doc = Nokogiri::XML(File.read(file))
  doc.remove_namespaces!

  # Find every <name> that has a <value> sibling
  doc.xpath('//property/name[following-sibling::value]').each do |name|
    value = name.at_xpath('following-sibling::value').text

    # Record where this value came from
    value.extend(SourceFile); value.source_file = file

    name = name.text
    if $all_values.key?(name)
      old = $all_values[name]
      unless old==value
        warn "#{name} is #{old.inspect} and #{value.inspect}"
        $all_values[name] = [*old,value]
      end
    else
      $all_values[name] = value
    end
  end
end
#=> core.app.root is "myroot" and "bigroot"
#=> core.app.root is ["myroot", "bigroot"] and "sarsaparilla root"
# Print any keys that point to an array of values...
$all_values.select{ |key,val| val.is_a?(Array) }.each do |key,values|
  puts "#{key}:"
  puts values.map{ |v| "%20s: '%s'" % [v.source_file,v] }
end
#=> core.app.root:
#=>               my.xml: 'myroot'
#=>               my.xml: 'bigroot'
#=>               my.xml: 'sarsaparilla root'
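The comment near the top of the script suggests using Marshal to carry $all_values from one run to the next. Here is a minimal sketch of how that could look. The helper names load_values and save_values and the cache file name are made up for illustration; they are not part of the script above.

```ruby
# Same module as in the script; it must be defined before loading,
# because Marshal re-extends each restored string with it.
module SourceFile
  attr_accessor :source_file
end

# Return the Hash saved at +path+, or an empty Hash on the first run.
def load_values(path)
  File.exist?(path) ? Marshal.load(File.binread(path)) : {}
end

# Write the Hash of values seen so far to +path+.
def save_values(path, values)
  File.binwrite(path, Marshal.dump(values))
end

# Intended use (CACHE_FILE is a hypothetical name):
#   CACHE_FILE  = 'all_values.marshal'
#   $all_values = load_values(CACHE_FILE)
#   ... process the files as in the script ...
#   save_values(CACHE_FILE, $all_values)
```

Marshal should keep both the extended module and the source_file of each string, so values loaded from an earlier run still know where they came from. Only load a cache file that you wrote yourself, since Marshal.load is not safe on untrusted data.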
my.xml
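<?xml version="1.0" encoding="UTF-8"?>
<product-state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns="urn://mycompany.com/ia/product-state"
               xsi:type="product-state">
  <property>
    <name>core.was.home</name>
    <value>/usr/IBM/WebSphere/AppServer1/profiles/AppSrvQA</value>
  </property>
  <property>
    <name>core.was.username</name>
    <value>admin</value>
  </property>
  <property>
    <name>core.was.password</name>
    <value>password</value>
  </property>
  <property>
    <name>core.application.name</name>
    <value>myapp</value>
  </property>
  <property>
    <name>core.app.root</name>
    <value>myroot</value>
  </property>
  <property>
    <name>core.app.root</name>
    <value>bigroot</value>
  </property>
  <property>
    <name>core.was.password</name>
    <value>password</value>
  </property>
  <property>
    <name>core.app.root</name>
    <value>sarsaparilla root</value>
  </property>
</product-state>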
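
The script only lists my.xml, but since you have hundreds of files, one way to build the files list might be a recursive glob. This is just a sketch: the helper name xml_files_under is made up, and it assumes all of your XML files live under one root directory and end in .xml.

```ruby
# Hypothetical helper: every .xml file under +root+, at any depth,
# sorted so that runs are repeatable.
def xml_files_under(root)
  Dir.glob(File.join(root, '**', '*.xml')).sort
end

# Intended use in the script, in place of files = %w[ my.xml ]:
#   files = xml_files_under('.')
```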