Ruby newbie & XML - What's the simplest way to get XML data

Sorry for having such a specific question, but is there a simple way to
take the XML document below and put it into some sort of organized hash
value without first knowing anything about it?

This is an easy job using something like XML::Simple in Perl, but for
the life of me, I can’t figure out an easy way to do this in Ruby.

I’m thinking of some sort of data structure like so:

environment=>authentication=>adminGroup=>description=>'NULL'
environment=>authentication=>adminGroup=>dataType=>'NULL'
environment=>authentication=>adminGroup=>value=>'NULL'
etc…
environment=>database=>databaseHost=>description=>'NULL'
environment=>database=>databaseHost=>dataType=>'NULL'
etc…

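XML document:

<environment>
  <authentication>
    <adminGroup>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </adminGroup>
    <bindUserID>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </bindUserID>
    <bindUserPassword>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </bindUserPassword>
    <ldapServer>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </ldapServer>
  </authentication>
  <database>
    <databaseHost>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </databaseHost>
    <databasePassword>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </databasePassword>
    <databasePort>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </databasePort>
    <databaseSchema>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </databaseSchema>
    <databaseUser>
      <description>NULL</description>
      <dataType>NULL</dataType>
      <value>NULL</value>
    </databaseUser>
  </database>
</environment>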

TIA

[email protected] wrote:

Sorry for having such a specific question, but is there a simple way to
take the XML document below and put it into some sort of organized hash
value without first knowing anything about it?

This is an easy job using something like XML::Simple in Perl, but for
the life of me, I can’t figure out an easy way to do this in Ruby.

I think most people would consider a specific question something to be
admired, not something to be apologised for!

Are you aware of the Ruby library xml-simple, which is a translation of
Perl's XML::Simple?

Details at: http://xml-simple.rubyforge.org/
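
For reference, a minimal sketch of how xml-simple might be used for this, assuming the gem is installed (gem install xml-simple). The trimmed sample document and the variable names are illustrative only; KeepRoot and ForceArray are options carried over from the Perl module.

```ruby
# Sketch only: xml-simple is a third-party gem, so guard the require.
begin
  require 'xmlsimple'
rescue LoadError
  warn 'xml-simple not installed (gem install xml-simple)'
end

# Trimmed, illustrative version of the document from the question.
xml = '<environment><authentication><adminGroup>' \
      '<description>NULL</description><dataType>NULL</dataType><value>NULL</value>' \
      '</adminGroup></authentication></environment>'

config = nil
if defined?(XmlSimple)
  # KeepRoot keeps the <environment> level in the result; ForceArray => false
  # avoids wrapping every single child element in an Array.
  config = XmlSimple.xml_in(xml, { 'KeepRoot' => true, 'ForceArray' => false })
  puts config['environment']['authentication']['adminGroup']['value']
end
```

With those two options the result should be plain nested Hashes keyed by tag name, which is close to the structure sketched in the question.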

dj

[email protected] wrote:

Sorry for having such a specific question, but is there a simple way to
take the XML document below and put it into some sort of organized hash
value without first knowing anything about it?

This is an easy job using something like XML::Simple in Perl, but for
the life of me, I can’t figure out an easy way to do this in Ruby.

Try this:


On 16.11.2006 23:06, [email protected] wrote:

environment=>authentication=>adminGroup=>dataType=>'NULL'
environment=>authentication=>adminGroup=>value=>'NULL'
etc…
environment=>database=>databaseHost=>description=>'NULL'
environment=>database=>databaseHost=>dataType=>'NULL'
etc…

You can easily create nested Hashes with the typical idiom:

irb(main):001:0> miss = lambda {|h,k| h[k] = Hash.new(&miss)}
=> #<Proc:0x003cbaa8@(irb):1>
irb(main):002:0> root = Hash.new(&miss)
=> {}
irb(main):003:0> root[:foo][:bar][:baz] = 10
=> 10
irb(main):004:0> root
=> {:foo=>{:bar=>{:baz=>10}}}

Now you just need an XML stream parser (REXML has one) to fill the Hash
as you go.
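
A minimal sketch of that approach with REXML's stream parser. The HashBuilder class and the trimmed sample document are illustrative assumptions, not part of REXML, and this version tracks its position with an explicit stack rather than the auto-vivifying Hash shown above. It ignores attributes and mixed content.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that turns element nesting into nested Hashes.
class HashBuilder
  include REXML::StreamListener

  attr_reader :root

  def initialize
    @root = {}
    @stack = [@root] # the Hash currently being filled is on top
    @text = String.new
  end

  # Opening tag: create a child Hash and descend into it.
  def tag_start(name, _attrs)
    node = {}
    @stack.last[name] = node
    @stack.push(node)
    @text = String.new
  end

  # Character data: collect it until the enclosing tag closes.
  def text(chars)
    @text << chars
  end

  # Closing tag: go back up; an element without children is a leaf,
  # so store its text in place of the empty Hash.
  def tag_end(name)
    node = @stack.pop
    @stack.last[name] = @text.strip if node.empty?
    @text = String.new
  end
end

xml = <<~XML
  <environment>
    <database>
      <databaseHost>
        <description>NULL</description>
        <dataType>NULL</dataType>
        <value>NULL</value>
      </databaseHost>
    </database>
  </environment>
XML

builder = HashBuilder.new
REXML::Document.parse_stream(xml, builder)
p builder.root
```

Repeated sibling tags with the same name would overwrite each other in this sketch; collecting them into Arrays is left out for brevity.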

But my question is: why do you want it in a Hash of Hashes when REXML
gives you an equally accessible DOM that knows more about the data than
nested hashes do (element order, for example)? Also, you can access
nested elements with XPath, which is a powerful query language.
Granted, access is likely less efficient than with the nested Hash
approach, but do you need the speed?
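
A short sketch of the DOM route, assuming a trimmed sample document (the element order here is made up to show that document order is kept). It uses only REXML::Document and REXML::XPath.

```ruby
require 'rexml/document'

# Trimmed, illustrative document; ldapServer deliberately comes first.
xml = <<~XML
  <environment>
    <authentication>
      <ldapServer><value>NULL</value></ldapServer>
      <adminGroup><value>NULL</value></adminGroup>
    </authentication>
  </environment>
XML

doc = REXML::Document.new(xml)

# XPath lookup of a single nested element.
value = REXML::XPath.first(doc, '/environment/authentication/adminGroup/value').text

# Child elements are returned in document order.
names = REXML::XPath.match(doc, '/environment/authentication/*').map(&:name)

puts value
puts names.join(', ')
```

Nothing about the document has to be known in advance here either: the same wildcard query works at any level of the tree.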

Kind regards

robert

Paul L. wrote:



The Zen of XML :)

Paul L. wrote:



Sorry. My Usenet reader went crazy.

[email protected] wrote:

Sorry for having such a specific question, but is there a simple way to
take the XML document below and put it into some sort of organized hash
value without first knowing anything about it?

This is an easy job using something like XML::Simple in Perl, but for
the life of me, I can’t figure out an easy way to do this in Ruby.

Try this:


#!/usr/bin/ruby -w

# Line-based parser: expects one tag, or one <key>value</key> pair, per line.
data = File.read("data.xml")

root = {}
hash = root

stack = []

data.each_line do |line|
  line.strip!
  if line =~ %r{^<\w+>$}
    # Opening tag: descend into a (possibly new) nested hash.
    tag = line.sub(%r{<(.*?)>}, '\1')
    stack.push(hash)
    hash[tag] ||= {}
    hash = hash[tag]
  elsif line =~ %r{^</\w+>$}
    # Closing tag: return to the parent hash.
    hash = stack.pop
  elsif line =~ %r{<(\w+)>.*?</\1>}
    # Leaf element: store its text under the tag name.
    key, value = line.scan(%r{<(\w+)>(.*?)<}).flatten
    hash[key] = value
  else
    puts "Error: undefined case: #{line}"
  end
end

# Print the nested hash, indenting two spaces per level.
def traverse_hash(hash, tab = 0)
  hash.keys.sort.each do |key|
    print "  " * tab
    if hash[key].is_a?(Hash)
      puts key
      traverse_hash(hash[key], tab + 1)
    else
      puts "#{key} = #{hash[key]}"
    end
  end
end

traverse_hash(root)


Output:

environment
  authentication
    adminGroup
      dataType = NULL
      description = NULL
      value = NULL
    bindUserID
      dataType = NULL
      description = NULL
      value = NULL
    bindUserPassword
      dataType = NULL
      description = NULL
      value = NULL
    ldapServer
      dataType = NULL
      description = NULL
      value = NULL
  database
    databaseHost
      dataType = NULL
      description = NULL
      value = NULL
    databasePassword
      dataType = NULL
      description = NULL
      value = NULL
    databasePort
      dataType = NULL
      description = NULL
      value = NULL
    databaseSchema
      dataType = NULL
      description = NULL
      value = NULL
    databaseUser
      dataType = NULL
      description = NULL
      value = NULL