Parsing poorly written XML

I’ve found an application which dumps all the registry settings for
installed applications into an .xml file for me (you can find it as
“myuninstall” via google).

I can’t figure out the ruby code I need from REXML to do the following:
-find a particular “product_name” in the xml file
-if found, save off the “uninstall_string” so I can execute it on the
commandline and uninstall the app.

I’m a newbie to Ruby, and I’m even newer to parsing XML with Ruby, so
I’m struggling a bit. It seems there should be a way to navigate within
that “item” after I match on the product name, but I can’t figure out
how. Any guidance would be very VERY appreciated!

Here’s a snippet of the problem xml:

<?xml version="1.0" encoding="ISO-8859-1" ?>

<installed_app version=“1.0”>

<entry_name>Devicescape</entry_name>
<product_name>DevicescapeDesktop</product_name>
2.0.5
Devicescape
DevicescapeDesktop
No
Yes
<installation_folder>C:\Program
Files\Devicescape\Devicescape</installation_folder>
<web_site>http://www.devicescape.com/</web_site>
<installation_date>1/22/2008 4:42:14 PM</installation_date>
<uninstall_string>MsiExec.exe
/I{C3BFE4FC-C9F6-494F-B3AB-ABD75556DDC5}</uninstall_string>
<quiet_uninstall>No</quiet_uninstall>
<registry_key>{C3BFE4FC-C9F6-494F-B3AB-ABD75556DDC5}</registry_key>
Windows Installer
<root_key>HKEY_LOCAL_MACHINE</root_key>

On Jan 23, 4:07 pm, Phrogz [email protected] wrote:

#=> 42
17


foo
17

Or, using XPath more liberally (with the same DATA section as quoted
above):

require ‘rexml/document’
doc = REXML::Document.new( DATA.read )
p REXML::XPath.match( doc, “//item[name=‘foo’]/value/text()” )
#=> [“42”, “17”]

On Jan 23, 3:35 pm, Patrick C. [email protected] wrote:

I’ve found an application which dumps all the registry settings for
installed applications into an .xml file for me (you can find it as
“myuninstall” via google).

I can’t figure out the ruby code I need from REXML to do the following:
-find a particular “product_name” in the xml file
-if found, save off the “uninstall_string” so I can execute it on the
commandline and uninstall the app.

require ‘rexml/document’
doc = REXML::Document.new( DATA.read )

Find ‘item’ elements at any level

that have a child ‘name’ element whose text value is ‘foo’

doc.each_element( “//item[name=‘foo’]” ){ |item|

Find every child element of the item element

whose name is ‘value’, get the first one, and get its text

puts item.get_elements( “value” ).first.text
}
#=> 42
#=> 17

END


foo
42


bar
17


foo
17

Gavin K. wrote:

On Jan 23, 3:35 pm, Patrick C. [email protected] wrote:

I’ve found an application which dumps all the registry settings for
installed applications into an .xml file for me (you can find it as
“myuninstall” via google).

I can’t figure out the ruby code I need from REXML to do the following:
-find a particular “product_name” in the xml file
-if found, save off the “uninstall_string” so I can execute it on the
commandline and uninstall the app.

require ‘rexml/document’
doc = REXML::Document.new( DATA.read )

Find ‘item’ elements at any level

that have a child ‘name’ element whose text value is ‘foo’

doc.each_element( “//item[name=‘foo’]” ){ |item|

Find every child element of the item element

whose name is ‘value’, get the first one, and get its text

puts item.get_elements( “value” ).first.text
}
#=> 42
#=> 17

END


foo
42


bar
17

Wow! That worked on the first try! I knew there had to be a way, but
I’m trying to whip this script out and I didn’t want to spend a week
studying XML parsing options in Ruby to do it. You’ve saved me a bunch
of effort. Now I just need to figure out how to gracefully bail out
and continue on with the rest of the script if “foo” isn’t there at all.
Uhm… I seem to be having trouble with that piece too. I’m sure it’s
easier than the first problem, but maybe it’s just 'cause my brain is
cooked right now…

Thanks a bunch!

On Jan 23, 5:08 pm, Patrick C. [email protected] wrote:

Gavin K. wrote:
Wow! That worked on the first try! I knew there had to be a way, but
I’m trying to whip this script out and I didn’t want to spend a week
studying XML parsing options in Ruby to do it. You’ve saved me a bunch
of effort. Now I just need to figure out how to gracefully bail out
and continue on with the rest of the script if “foo” isn’t there at all.
Uhm… I seem to be having trouble with that piece too. I’m sure it’s
easier than the first problem, but maybe it’s just 'cause my brain is
cooked right now…

The first piece of code I wrote has a potential to be cranky.

doc.each_element( “//item[name=‘foo’]” ){ |item|
puts item.get_elements( “value” ).first.text
}

If no …foo elements can be found, the block will
never be called. However, if it finds such an element, but the
doesn’t have a child, then item.get_elements( “value” ) will
return an empty array. Calling .first on an empty array will return
nil, and then you’ll have an error when you try to run the ‘text’
method of nil.

You could modify the above code to guard for this:
doc.each_element( “//item[name=‘foo’]” ){ |item|
first_value = item.get_elements( “value” ).first
if first_value
puts first_value.text
end
}

However, the second (single XPath query) solution I wrote is both
shorter and also more fault tolerant. If you assume that there’s only
one that will match the foo criterion, then you
can change from REXML::XPath.match (which returns an array, possibly
empty) to REXML::XPath.first (which returns either a single value, or
nil).

Something like:
value = REXML::XPath.first( doc, “//item[name=‘foo’]/value/text()” )
if value
# do my stuff here
end

Or, to use your exact case:

Returns the uninstall command run (if found),

or nil if the necessary info couldn’t be found.

def uninstall_product( xml_string, prod_name )
require ‘rexml’
include REXML
uninstall_command = XPath.first(
Document.new( xml_string ),
“//item[product_name=‘#{prod_name}’]/uninstall_string/text()”
)
if uninstall_command
# Do the uninstall; maybe as simple as:
#{uninstall_command}
else
warn "Either product ‘#{prod_name}’ doesn’t exist, " +
“or it doesn’t have an uninstall string.”
end
uninstall_command
end