Help with Classes and XML

I am still fairly new to Ruby and definitely new to OOP. In the past I
wrote a procedural PHP script that parses an XML file and then downloads
the sources identified in the XML.
Now I am trying to redo this in Ruby, using classes for my own
edification. After some effort I was able to parse the XML with REXML,
though I had to create two classes to represent the XML. Is there a way
to capture the XML in a single class, or should I nest the classes? Also,
generally speaking, is OOP the right approach for this?

Any advice or suggestions would be greatly appreciated.

Brad

Here is what I have so far.

–>manifest/XML file read into ARGV

3/23/2012 5:00:00 PM
[email protected]
875bbfd0-25d6-441d-9772-7e0228785076

ftp://ftp.ftpserver.com/Media/12-0000/ user pass /Media/12-0000/ file1.mov ftp://ftp.archiveserver.com/Media/11-0000/ user1 pass1 /Media/11-0000/ file2.pdf

–>manifest_id_class.rb
# This class gets the unique elements of each manifest
class ManifestID
    attr_reader :expiration, :email, :packageid

    def initialize(expiration, email, packageid)
        @expiration = expiration
        @email = email
        @packageid = packageid
    end
end

–>source_class.rb
# This class gets the recurring "sources" elements of each manifest
class Source
    attr_accessor :source_url, :source_username, :source_password,
                  :source_lpath, :source_filename

    def initialize(source_url, source_username, source_password,
                   source_lpath, source_filename)
        @source_url = source_url
        @source_username = source_username
        @source_password = source_password
        @source_lpath = source_lpath
        @source_filename = source_filename
    end
end

–>XMLReader.rb
require "/path/to/manifest_id_class.rb"
require "/path/to/source_class.rb"
require 'rexml/document'
include REXML

class ManifestReader

def initialize
    @manifest_guid = []
    @manifest_sources = []
end

def parse_manifest(manifest_file)
    sourcepackage = (Document.new File.new(manifest_file)).root

    expiration = sourcepackage.elements["Expiration"].text
    email = sourcepackage.elements["Email"].text
    packageid = sourcepackage.elements["PackageID"].text

    # ManifestID#initialize takes exactly these three values
    @manifest_guid << ManifestID.new(expiration, email, packageid)

    sources = sourcepackage.elements.to_a("Source")
    sources.each do |source|
        source_url = source.elements["URL"].text.strip
        source_username = source.elements["UserName"].text.strip
        source_password = source.elements["Password"].text.strip
        source_lpath = source.elements["LPath"].text.strip
        source_filename = source.elements["FileName"].text.strip

        # Source (capitalized) is the class; lowercase source is the XML element
        @manifest_sources << Source.new(source_url, source_username,
                                        source_password, source_lpath,
                                        source_filename)
    end
end

def manifest_properties
    @manifest_guid.each do |x|
        puts x.email
        puts x.packageid
        puts x.expiration
    end
end

def manifest_sources
    @manifest_sources.each do |x|
        puts x.source_url
    end
end

end

–>process_manifest.rb
require "/path/to/XMLReader.rb"

reader = ManifestReader.new

ARGV.each do |manifest_file|
    STDERR.puts "Processing #{manifest_file}"
    reader.parse_manifest(manifest_file)
end
reader.manifest_properties
reader.manifest_sources
#create ftp routine with reader output
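
On the question of one class versus two, one possible arrangement is a single manifest object that owns its list of sources. The sketch below is only an illustration of that nesting and does no XML parsing; the names Manifest, ManifestSource and add_source are hypothetical and not part of the code above, and the email value is the placeholder exactly as it appears in the manifest listing.

```ruby
# Struct generates the initializer and accessors that a hand-written
# Source class would otherwise spell out (ManifestSource is a hypothetical name).
ManifestSource = Struct.new(:url, :username, :password, :lpath, :filename)

# Hypothetical single entry point: the once-per-manifest fields plus
# the list of sources that belong to this manifest.
class Manifest
  attr_reader :expiration, :email, :packageid, :sources

  def initialize(expiration, email, packageid)
    @expiration = expiration
    @email = email
    @packageid = packageid
    @sources = []
  end

  # Append one source; returns self so calls can be chained.
  def add_source(url, username, password, lpath, filename)
    @sources << ManifestSource.new(url, username, password, lpath, filename)
    self
  end
end

manifest = Manifest.new("3/23/2012 5:00:00 PM", "[email protected]",
                        "875bbfd0-25d6-441d-9772-7e0228785076")
manifest.add_source("ftp://ftp.ftpserver.com/Media/12-0000/", "user", "pass",
                    "/Media/12-0000/", "file1.mov")
manifest.sources.each { |s| puts "#{s.filename} from #{s.url}" }
```

With this shape a reader class would hand back one Manifest per file instead of filling two separate arrays, which keeps the sources of different manifests from being mixed together.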

There are XML parsers that are easier to use than REXML, notably
Nokogiri (the nokogiri gem on RubyGems.org).

Personally, if I had to parse this, I’d use xml-simple (the xml-simple
gem on RubyGems.org). There may be cases where a “heavy” class-based
structure is better, but IMO usually it isn’t :)

This script:

require 'xmlsimple'
require 'pp'
pp XmlSimple.xml_in File.read 'xml.xml' # xml.xml is your input file

Outputs:

{"Expiration"=>["3/23/2012 5:00:00 PM"],
 "Email"=>["[email protected]"],
 "PackageID"=>["875bbfd0-25d6-441d-9772-7e0228785076"],
 "Source"=>
  [{"URL"=>["ftp://ftp.ftpserver.com/Media/12-0000/"],
    "UserName"=>["user"],
    "Password"=>["pass"],
    "LPath"=>["/Media/12-0000/"],
    "FileName"=>["file1.mov"]},
   {"URL"=>["ftp://ftp.archiveserver.com/Media/11-0000/"],
    "UserName"=>["user1"],
    "Password"=>["pass1"],
    "LPath"=>["/Media/11-0000/"],
    "FileName"=>["file2.pdf"]}]}
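
A hash of that shape can be read with ordinary Hash and Array calls. The sketch below works on a literal copy of the output above, so it runs without the gem installed; MANIFEST and the unwrap helper are hypothetical names introduced only for illustration.

```ruby
# Literal copy of the xml-simple output shown above (no gem required).
MANIFEST = {
  "Expiration" => ["3/23/2012 5:00:00 PM"],
  "Email"      => ["[email protected]"],
  "PackageID"  => ["875bbfd0-25d6-441d-9772-7e0228785076"],
  "Source"     => [
    { "URL" => ["ftp://ftp.ftpserver.com/Media/12-0000/"],
      "UserName" => ["user"], "Password" => ["pass"],
      "LPath" => ["/Media/12-0000/"], "FileName" => ["file1.mov"] },
    { "URL" => ["ftp://ftp.archiveserver.com/Media/11-0000/"],
      "UserName" => ["user1"], "Password" => ["pass1"],
      "LPath" => ["/Media/11-0000/"], "FileName" => ["file2.pdf"] }
  ]
}

# Hypothetical helper: each text value arrives wrapped in a one-element
# array, so keep only the first item of every value in a source hash.
def unwrap(source)
  source.transform_values(&:first)
end

package_id = MANIFEST["PackageID"].first
sources = MANIFEST["Source"].map { |source| unwrap(source) }

puts package_id
sources.each do |s|
  puts "#{s['FileName']} from #{s['URL']} as #{s['UserName']}"
end
```

Each unwrapped source is a plain hash of strings, which is probably all an FTP download loop needs.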

– Matma R.