Trying to pull data out of XML doc using REXML

I am trying to get some info from an XML doc that kinda looks like:

... ... ... ... ... ...

You get the picture…

I’m trying to write the publication, author list and keyword list to a
database where the authors and keywords are associated with the
publication using a has_many :through relationship.

The code below works to put the publication in the database, but will
only write the first author or keyword. I assume I am using REXML
incorrectly, but can’t figure out how to fix it.

Thanks for any help you can offer…

oh, and this is running as a rake task

task :toDatabase => :environment do

require "rexml/document"
file = File.new("public/xml_files/pubFile2009-09-14.xml")
doc = REXML::Document.new file
include REXML

#root = doc.root

doc.elements.each('//PUB' ) do |pub|

  thispub = Publication.new(
    :title => pub.elements['TITLE'].nil? ? nil : pub.elements['TITLE'].text,
    :abstract => pub.elements['ABSTRACT'].nil? ? nil : pub.elements['ABSTRACT'].text,
    :volume => pub.elements['VOLUME'].nil? ? nil : pub.elements['VOLUME'].text
  )
  thispub.save

  authors = pub.elements['AUTHORS'] rescue nil
  if authors
    xml_author = authors.elements['AUTHOR']
    if xml_author
      xml_author.each do |author|

        if author
          @name = author.to_s
          if @name then
            @nameArr = @name.split(', ')
            thisauth = Author.new(
              :first_name => @nameArr[1],
              :last_name => @nameArr[0],
              :ours => true
          )
          end
        end

        thisauth.save
        thispub.authors << thisauth
      end
    end
  end

  thispub.save


  pub.elements.each('KEYWORDS') do |tag|

    if tag.elements['KEYWORD']
      thiskeyword = Keyword.find_or_create_by_keyword(tag.elements['KEYWORD'].text)
      thispub.keywords << thiskeyword
    end

  end

  thispub.save
end

end
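
A note on the likely cause, for anyone staying with REXML: elements['AUTHOR'] returns only the first matching child, and calling each on that single element walks its child nodes (the text inside it) rather than its sibling AUTHOR elements. The same applies to elements['KEYWORD']. Passing an XPath to elements.each visits every match instead. Below is a minimal sketch of that idea. The sample XML is hypothetical (the element names are only inferred from the rake task above), and the helper names child_text and extract_publications are made up for illustration. It builds plain hashes rather than touching the database.

```ruby
require "rexml/document"

# Hypothetical sample data; the real file's structure is assumed from the
# element names used in the rake task (PUB, TITLE, AUTHORS/AUTHOR, ...).
SAMPLE_XML = <<~XML
  <PUBS>
    <PUB>
      <TITLE>First paper</TITLE>
      <VOLUME>12</VOLUME>
      <AUTHORS>
        <AUTHOR>Smith, Jane</AUTHOR>
        <AUTHOR>Jones, Bob</AUTHOR>
      </AUTHORS>
      <KEYWORDS>
        <KEYWORD>ruby</KEYWORD>
        <KEYWORD>xml</KEYWORD>
      </KEYWORDS>
    </PUB>
    <PUB>
      <TITLE>Second paper</TITLE>
    </PUB>
  </PUBS>
XML

# Illustrative helper: text of a direct child element, or nil if it is missing.
def child_text(node, name)
  child = node.elements[name]
  child && child.text
end

# Illustrative helper: collects every publication with all of its authors
# and keywords into an array of hashes.
def extract_publications(xml)
  doc = REXML::Document.new(xml)
  pubs = []

  doc.elements.each("//PUB") do |pub|
    authors = []
    # elements.each with an XPath yields every AUTHOR, not just the first.
    pub.elements.each("AUTHORS/AUTHOR") do |author|
      # Use .text (the element's content); .to_s would include the tags.
      last_name, first_name = author.text.to_s.strip.split(/,\s*/, 2)
      authors << { :last_name => last_name, :first_name => first_name }
    end

    keywords = []
    pub.elements.each("KEYWORDS/KEYWORD") do |keyword|
      keywords << keyword.text.to_s.strip
    end

    pubs << {
      :title    => child_text(pub, "TITLE"),
      :abstract => child_text(pub, "ABSTRACT"),
      :volume   => child_text(pub, "VOLUME"),
      :authors  => authors,
      :keywords => keywords
    }
  end

  pubs
end

extract_publications(SAMPLE_XML).each do |pub|
  puts "#{pub[:title]}: #{pub[:authors].length} author(s), #{pub[:keywords].length} keyword(s)"
end
```

In the rake task itself, the bodies of the two inner loops would presumably become the existing Author.new / thispub.authors << thisauth and Keyword.find_or_create_by_keyword calls, with the save moved inside the loop so each author is stored as it is built.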

Quick tip. Look at Nokogiri before investing much time in parsing with
REXML.

http://nokogiri.rubyforge.org/nokogiri/

Besides being faster, the parsing syntax is nicer. Here’s some
example code for pulling apart Excel exported XML:

http://github.com/kete/kete/blob/master/lib/workers/excel_based_importer_worker.rb

starting around line 26

You probably want to look for examples of use of the “at” and “xpath”
methods, too.

Cheers,
Walter
