Trying to pull data out of XML doc using REXML

I am trying to get some info from an XML doc that kinda looks like:

... ... ... ... ... ...

You get the picture…

I’m trying to write the publication, author list and keyword list to a
database where the authors and keywords are associated with the
publication using a has_many :through relationship.

The code below works to put the publication in the database, but will
only write the first author or keyword. I assume I am using REXML
incorrectly, but can’t figure out how to fix it.

Thanks for any help you can offer…

oh, and this is running as a rake task

task :toDatabase => :environment do

require "rexml/document"
file = File.new("public/xml_files/pubFile2009-09-14.xml")
doc = REXML::Document.new file
include REXML

#root = doc.root

doc.elements.each('//PUB') do |pub|

  thispub = Publication.new(
    :title => pub.elements['TITLE'].nil? ? nil : pub.elements['TITLE'].text,
    :abstract => pub.elements['ABSTRACT'].nil? ? nil : pub.elements['ABSTRACT'].text,
    :volume => pub.elements['VOLUME'].nil? ? nil : pub.elements['VOLUME'].text
  )
  thispub.save

  authors = pub.elements['AUTHORS'] rescue nil
  if authors
    xml_author = authors.elements['AUTHOR']
    if xml_author
      xml_author.each do |author|

        if author
          @name = author.to_s
          if @name then
            @nameArr = @name.split(', ')
            thisauth = Author.new(
              :first_name => @nameArr[1],
              :last_name => @nameArr[0],
              :ours => true
          )
          end
        end

        thisauth.save
        thispub.authors << thisauth
      end
    end
  end

  thispub.save


  pub.elements.each('KEYWORDS') do |tag|

    if tag.elements['KEYWORD']
      thiskeyword = Keyword.find_or_create_by_keyword(tag.elements['KEYWORD'].text)
      thispub.keywords << thiskeyword
    end

  end

  thispub.save
end

end

Quick tip. Look at Nokogiri before investing much time in parsing with
REXML.

http://nokogiri.rubyforge.org/nokogiri/

Besides being faster, the parsing syntax is nicer. Here’s some
example code for pulling apart Excel exported XML:

starting around line 26

You probably want to look for examples of use of the “at” and “xpath”
methods, too.

Cheers,
Walter
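
On the REXML question itself: elements['AUTHOR'] returns only the first
matching element, and calling each on that element walks its child nodes
(the text), not the sibling AUTHOR elements, which would explain why only
one author gets saved. The keyword loop has the same shape: it iterates
the KEYWORDS container and then takes the first KEYWORD inside it. Below
is a minimal sketch of visiting every match with a relative XPath
instead. The sample XML and the extract_pubs helper are made up for
illustration; the element names are taken from the posted code, and the
real file may be laid out differently.

```ruby
require "rexml/document"

# Hypothetical sample. The real file's layout is not shown in the post,
# so this only mirrors the element names used in the rake task.
SAMPLE_XML = <<~XML
  <PUBS>
    <PUB>
      <TITLE>Example title</TITLE>
      <AUTHORS>
        <AUTHOR>Smith, Jane</AUTHOR>
        <AUTHOR>Doe, John</AUTHOR>
      </AUTHORS>
      <KEYWORDS>
        <KEYWORD>ruby</KEYWORD>
        <KEYWORD>xml</KEYWORD>
      </KEYWORDS>
    </PUB>
  </PUBS>
XML

# Hypothetical helper: returns one plain hash per PUB element.
def extract_pubs(xml)
  doc = REXML::Document.new(xml)
  pubs = []

  doc.elements.each("//PUB") do |pub|
    authors = []
    # The relative XPath yields every AUTHOR under AUTHORS, not just the first.
    pub.elements.each("AUTHORS/AUTHOR") do |author|
      last, first = author.text.to_s.strip.split(", ", 2)
      authors << { :first_name => first, :last_name => last }
    end

    keywords = []
    # Same idea for keywords: iterate the KEYWORD children directly.
    pub.elements.each("KEYWORDS/KEYWORD") do |kw|
      keywords << kw.text.to_s.strip
    end

    title = pub.elements["TITLE"]
    pubs << {
      :title => title ? title.text : nil,
      :authors => authors,
      :keywords => keywords
    }
  end

  pubs
end

pubs = extract_pubs(SAMPLE_XML)
pubs.each do |pub|
  puts "#{pub[:title]}: #{pub[:authors].length} authors, #{pub[:keywords].length} keywords"
end
```

In the rake task, the bodies of the two inner loops are where the
Author.new and Keyword.find_or_create_by_keyword calls from the original
code would go, so that each author and keyword is saved and appended to
thispub in turn.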