Basic XML parsing question


#1

I am just starting to program and am learning with Ruby, so this
question is probably really basic.

I have a file whose contents I want to put into a SQLite database.
The file is saved locally for now, but eventually remote machines will
send such files to me so I can collect their data. I will not have to
retrieve the documents, but I will have to figure out how to accept them
when they are sent to me. I have read up a bit on REXML and Nokogiri, but
am not sure where or how to get started.

If someone can give me some pointers on where to start, that would be
great.

Below is a partial sample of what I’m trying to parse. The database
tables correspond to the record names.

Thanks for any help.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<calldetail srcmac="001C10F635B6" srcname="20011" srcdesc="App Version 1.7.1"
dstmac="0004AC456B46" dstname="60011" dstdesc="App Version 1.7.1"
connected="57370806" totaltime="57372979" responsetime="2173" alerttime="-1"
answertime="2173" teardowntime="1575" avgtrans="787" netlatency="1874"
totalbytes="140337" mediabytes="136944" totalpackets="1942" mediapackets="1902"
lostpackets="0" signalscore="99.958" mediascore="98.967" srccause="16"
dstcause="16" srclan="10.10.2.2" dstlan="192.168.1.40" />


#2

Robert K. wrote:

Below is a partial sample of what I’m trying to parse. The database
tables correspond to the record names.

What record names? Do you mean XML tags?

Yes, the tables correspond to the XML tags.

Thanks for your help and links!
Geo

Cheers

robert


#3

2009/3/26 Geo C. removed_email_address@domain.invalid:

I have a file whose contents I want to put into a SQLite database.
The file is saved locally for now, but eventually remote machines will
send such files to me so I can collect their data. I will not have to
retrieve the documents, but I will have to figure out how to accept them
when they are sent to me. I have read up a bit on REXML and Nokogiri, but
am not sure where or how to get started.

Well, using REXML you have two options:

  1. use REXML’s DOM parsing, i.e. let it read in the whole file and
    create an XML DOM tree in memory. Then you can traverse that DOM tree,
    for example using XPath expressions and fill the DB while you do the
    traversal.

  2. use REXML’s PUSH or PULL parsing and fill the DB while you receive
    the parsing events. In this scenario you will have to keep some state
    (e.g. information about the parent “callsummary” of a “calldetail”)
    yourself.
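To make option 1 concrete, here is a minimal sketch of DOM parsing with an XPath query in REXML. The sample document is made up for illustration: the original post only shows a single calldetail, so the callsummary wrapper and its id attribute are assumptions, as is the helper name calldetail_rows.

```ruby
require 'rexml/document'

# Hypothetical sample: the callsummary wrapper and its id attribute are assumed.
XML_SAMPLE = <<~XML
  <callsummary id="1">
    <calldetail srcname="20011" dstname="60011" lostpackets="0" />
    <calldetail srcname="20012" dstname="60012" lostpackets="3" />
  </callsummary>
XML

# Parse the whole document into memory, then return one Hash per calldetail.
def calldetail_rows(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//calldetail').map do |element|
    row = {}
    # Attributes#each yields the attribute name and value as Strings.
    element.attributes.each { |name, value| row[name] = value }
    row
  end
end

calldetail_rows(XML_SAMPLE).each { |row| puts row.inspect }
```

Each Hash could then be written to the database inside the same loop, which is the "fill the DB while you do the traversal" step.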

The second approach usually uses less memory because the whole document
never has to be held in memory at once, but it is also a bit more
complex to implement.
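For comparison, a rough sketch of the push (stream) variant using REXML::StreamListener. The listener has to remember which callsummary it is currently inside, which is the extra state mentioned above. The class name CallListener, the summary_id key, and the id attribute on callsummary are all assumptions made for this example.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample: the callsummary wrapper and its id attribute are assumed.
STREAM_SAMPLE = <<~XML
  <callsummary id="1">
    <calldetail srcname="20011" dstname="60011" />
    <calldetail srcname="20012" dstname="60012" />
  </callsummary>
XML

# Collects calldetail attributes as the parser reports each opening tag.
class CallListener
  include REXML::StreamListener

  attr_reader :rows

  def initialize
    @rows = []
    @current_summary = nil # parent state we have to track ourselves
  end

  # Called for every opening tag; attrs is a Hash of attribute name => value.
  def tag_start(name, attrs)
    case name
    when 'callsummary'
      @current_summary = attrs['id'] # 'id' is an assumed attribute
    when 'calldetail'
      @rows << attrs.merge('summary_id' => @current_summary)
    end
  end

  # Forget the parent once its closing tag has been seen.
  def tag_end(name)
    @current_summary = nil if name == 'callsummary'
  end
end

def stream_calldetails(xml)
  listener = CallListener.new
  REXML::Document.parse_stream(xml, listener)
  listener.rows
end

stream_calldetails(STREAM_SAMPLE).each { |row| puts row.inspect }
```

In a real importer you would probably write each row to the database inside tag_start instead of collecting them in an array, so that memory use stays flat for large files.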

If someone can give me some pointers on where to start, that would be
great.

Some links:

http://www.germane-software.com/software/rexml/docs/tutorial.html
http://www.w3schools.com/xpath/
http://www.zvon.org/xxl/XPathTutorial/General/examples.html

Below is a partial sample of what I’m trying to parse. The database
tables correspond to the record names.

What record names? Do you mean XML tags?

Cheers

robert


#4

On Mar 26, 11:02 am, Geo C. removed_email_address@domain.invalid wrote:

Below is a partial sample of what I’m trying to parse. The database
tables correspond to the record names.

Well, you have hierarchical data (calldetails inside of callsummary)
so I’m not sure how you want to map it to your database, but here’s a
simple parsing example to get you started.

Assuming you want all the attributes in each calldetail:

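require 'nokogiri'

# xml is assumed to hold the document as a String
doc = Nokogiri::XML.parse(xml)
doc.search('//calldetail').each do |call|
  # attributes maps each name to a Nokogiri::XML::Attr, so ask it for its value
  call.attributes.each do |key, attr|
    puts "#{key}=#{attr.value}"
  end
end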
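As for getting the attributes into SQLite, one possible mapping (an assumption, based on the statement that the tables correspond to the tag names) is one table per tag and one column per attribute. The sketch below only builds the parameterised INSERT and its bind values. It uses REXML so that it runs without extra gems, and insert_statement is a made-up helper name.

```ruby
require 'rexml/document'

# Build a parameterised INSERT for one element.
# Assumption: the table is named after the tag and the columns after the
# attributes. These names are interpolated into the SQL, so they should be
# checked against a known list if the XML comes from an untrusted sender.
def insert_statement(element)
  columns = []
  values = []
  element.attributes.each do |name, value|
    columns << name
    values << value
  end
  placeholders = (['?'] * columns.size).join(', ')
  sql = "INSERT INTO #{element.name} (#{columns.join(', ')}) VALUES (#{placeholders})"
  [sql, values]
end

doc = REXML::Document.new('<calldetail srcname="20011" dstname="60011" lostpackets="0" />')
sql, values = insert_statement(doc.root)
puts sql
puts values.inspect
```

With the sqlite3 gem installed, the pair could then be handed to something like db.execute(sql, values) on a SQLite3::Database, assuming a matching table already exists.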