REXML Speed Question

Hello, I have been using REXML to extract information from an XML file
and I am having an issue with the amount of time it takes. If I
point directly to what I want, it is pretty fast. The issue arises when
I have to grab a reference id, then search again for that id to get another
id, and so on until I finally get to the piece of information I want. This is
what a snippet of my code currently looks like:


result = []
wall_refs1 = REXML::XPath.match($doc,
  "doc:iso_10303_28/uos/IfcWallStandardCase//*[@pos='1']")

wall_refs1 = grab_id(wall_refs1, 'ref')
# grab_id simply collects each element's ref id into an array
# output from this would be [["i1741"]]

wall_ref2 = []
wall_refs1.each do |ref|
  x = REXML::XPath.first($doc, "//*[@id='#{ref}']//IfcExtrudedAreaSolid").attribute("ref").value
  wall_ref2 << x
end
# Output [["i1738"]]

wall_depth = []
wall_ref2.each do |ref|
  x = REXML::XPath.match($doc, "//*[@id='#{ref}']//Depth").map { |element| element.text }
  wall_depth << x
end
# Output [["120."]]

wall_depth_final = wall_depth.map do |arr|
  arr.map do |arr2|
    # this is simply converting to float and rounding to 2 decimals
    arr2.to_f.round_to(2)
  end
end

wall_depth_final
# Output [["120.00"]]

The problem with doing this is that it takes substantial time to run:
doing this for, say, 200 elements can take 25 minutes. (I am guessing
the reason it takes so long is that some of the XML files are 10,000+
lines, and I imagine it takes a while to comb through that.) I have to
start from the first location and work my way to the final one, and
unfortunately cannot simply run a search to grab //Depth.

Is there a quicker way of accomplishing the same thing, or is time
always going to be a burden?

Thank you for your time.

This would be the xml I am reading:

120.

On Apr 7, 2011, at 21:00, Kyle X. wrote:

Hello, I have been using REXML to extract information from an XML file
and I am having an issue with the amount of time it takes. If I
point directly to what I want, it is pretty fast. The issue arises when
I have to grab a reference id, then search again for that id to get another
id, and so on until I finally get to the piece of information I want. This is
what a snippet of my code currently looks like:

Switch to nokogiri and you’ll be much much happier.

For larger XML documents, SAX parsing can really improve performance
(specifically because SAX parsing doesn't create an entire DOM
structure; it only extracts the bits you are interested in). Programming
with a SAX parser is very different, though :)
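To give a feel for how different it is, here is a minimal sketch of event-based parsing using REXML's own stream API (REXML::StreamListener and REXML::Document.parse_stream), so no extra library is needed. The sample XML, the DepthListener class and the idea of keying each depth by the nearest enclosing id are assumptions made for illustration; the real file layout is not shown in this thread.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample: element names follow the question,
# but the real file's structure is an assumption.
SAMPLE_XML = <<~DOC
  <uos>
    <IfcExtrudedAreaSolid id="i1738">
      <Depth>120.</Depth>
    </IfcExtrudedAreaSolid>
    <IfcExtrudedAreaSolid id="i2000">
      <Depth>87.5</Depth>
    </IfcExtrudedAreaSolid>
  </uos>
DOC

# Hypothetical listener: records each Depth value under the id of the
# nearest enclosing element that carries an id attribute.
class DepthListener
  include REXML::StreamListener

  attr_reader :depths

  def initialize
    @depths = {}
    @id_stack = []   # one entry per open element (nil when it has no id)
    @in_depth = false
  end

  def tag_start(name, attrs)
    @id_stack.push(attrs['id'])
    @in_depth = (name == 'Depth')
  end

  def text(data)
    return unless @in_depth
    owner = @id_stack.compact.last
    @depths[owner] = data.strip.to_f if owner
  end

  def tag_end(_name)
    @id_stack.pop
    @in_depth = false
  end
end

listener = DepthListener.new
REXML::Document.parse_stream(SAMPLE_XML, listener)
p listener.depths
```

The file is read once and no tree is built, but the listener has to keep its own state (here, the stack of ids), which is the main cost of this style.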

You can also switch to another library for handling your XML. The most
popular library (at least to my knowledge) is Nokogiri
(http://nokogiri.org/), and it is a great deal faster than REXML.
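Whichever library is used, a fair share of the time in the original snippet probably goes to the repeated "//*[@id='...']" searches, each of which walks the whole document again. A rough sketch of one way around that, staying with REXML: scan the tree once, keep a Hash from id to element, and follow the references through the Hash. The sample XML (without namespaces), the IfcShapeRepresentation wrapper and the by_id name are made up for illustration, and Float#round(2) stands in for the poster's round_to helper.

```ruby
require 'rexml/document'

# Hypothetical, namespace-free sample; the real document layout is an assumption.
SAMPLE = <<~DOC
  <uos>
    <IfcWallStandardCase id="i1700">
      <Representation pos="1" ref="i1741"/>
    </IfcWallStandardCase>
    <IfcShapeRepresentation id="i1741">
      <Items><IfcExtrudedAreaSolid ref="i1738"/></Items>
    </IfcShapeRepresentation>
    <IfcExtrudedAreaSolid id="i1738">
      <Depth>120.</Depth>
    </IfcExtrudedAreaSolid>
  </uos>
DOC

doc = REXML::Document.new(SAMPLE)

# One full scan: remember every element that has an id.
by_id = {}
REXML::XPath.each(doc, '//*[@id]') do |el|
  by_id[el.attributes['id']] = el
end

# Follow wall -> shape -> solid -> Depth using hash lookups for the id hops.
depths = REXML::XPath.match(doc, "//IfcWallStandardCase//*[@pos='1']").map do |start|
  shape = by_id[start.attributes['ref']]
  solid_ref = REXML::XPath.first(shape, './/IfcExtrudedAreaSolid').attributes['ref']
  solid = by_id[solid_ref]
  # Relative search inside the small subtree only, not the whole document.
  REXML::XPath.match(solid, './/Depth').map { |d| d.text.to_f.round(2) }
end

p depths
```

The remaining XPath calls start from a small subtree ('.//...') rather than from the document root, so the cost per wall no longer grows with the size of the file.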

Thanks for the info. I am going to try Nokogiri, if I can only figure
out how to get it to work in SketchUp… There is a surprising
dearth of information on the topic; I have spent a few hours trying to
find out online. Any chance anyone knows how?
