REXML, each_element and XPath


#1

Hi!

While playing with REXML I tried to use each_element with an XPath
expression that selects an attribute (doc/element/@attribute), but I
can’t get it to work this way. The same expression works perfectly fine
with XPath.each, and each_element works on doc/element (without the
@attribute step).

Here’s my test suite to illustrate this. Is there some obvious mistake
that I cannot see? I’ve read in older posts that each_element does not
recurse; could that be the explanation?

Any insight is most welcome!

Kind regards

Thibaut

============

require "test/unit"
require "rexml/document"
include REXML

class XmlTests < Test::Unit::TestCase

  # fail
  def test_each_element_with_attribute
    count = 0
    @doc.each_element("doc/element/@attribute") { |n| count += 1 }
    assert_equal 1, count
  end

  # pass
  def test_each_element_without_attribute
    count = 0
    @doc.each_element("doc/element") { |n| count += 1 }
    assert_equal 1, count
  end

  # pass
  def test_xpath
    count = 0
    XPath.each(@doc, "doc/element/@attribute") { |n| count += 1 }
    assert_equal 1, count
  end

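  def setup
    # NOTE: the XML from the original post did not survive; this minimal
    # document is an assumption inferred from the XPath expressions above.
    @doc = Document.new(<<EOF
<doc>
  <element attribute="value"/>
</doc>
EOF
    )
  end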

end


#2

It seems that when the XPath expression selects an attribute,
@doc.each_element returns an array containing the attributes:

irb(main):009:0> @doc = REXML::Document.new(<<EOF
irb(main):010:1" </doc>
irb(main):011:1" EOF
irb(main):012:1> )
irb(main):016:0> @doc.each_element("doc/element/@attribute")
=> [attribute='value', attribute='value']

Not sure if this is a bug. I’m using the version of REXML that comes
with Ruby 1.8.2 on Windows.

Farrel


#3

I’ve done some experimentation and I think the problem is as follows.
/doc/element/@attribute selects the attribute node itself, not the
element, so using it with doc.each_element isn’t semantically correct.
To find the elements that have a given attribute, use the XPath
expression /doc/element[@attribute] instead.

irb(main):002:0> doc = REXML::Document.new(<<EOF
irb(main):003:1" "
irb(main):004:1" EOF
irb(main):005:1> )
irb(main):006:0> doc.each_element("/doc/element[@attribute]"){|n| puts n.attributes["attribute"]}
value
=> []
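
To make the difference concrete, here is a minimal sketch of the predicate form. The sample document, the SAMPLE_XML constant and the elements_matching helper are made up for illustration; only each_element and attributes come from REXML itself.

```ruby
require "rexml/document"

# Hypothetical helper: collect the elements that each_element yields for an XPath.
def elements_matching(doc, xpath)
  found = []
  doc.each_element(xpath) { |element| found << element }
  found
end

# Hypothetical sample document: only the first <element> carries the attribute.
SAMPLE_XML = <<~XML
  <doc>
    <element attribute="value"/>
    <element/>
  </doc>
XML

doc = REXML::Document.new(SAMPLE_XML)

all_elements   = elements_matching(doc, "/doc/element")
with_attribute = elements_matching(doc, "/doc/element[@attribute]")

puts all_elements.size    # every <element> child of <doc>
puts with_attribute.size  # only the elements that have @attribute

# The predicate selects elements, so the attribute is read from each one.
with_attribute.each { |element| puts element.attributes["attribute"] }
```

The predicate [@attribute] only filters; what comes back is still an Element, which is why each_element is happy with it.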


#4

Hi Farrel and thanks for your feedback!

I searched some more and finally found a note here:
http://www.ruby-doc.org/stdlib/libdoc/rexml/rdoc/classes/REXML/Elements.html#M002906

“Note that XPaths are automatically filtered for Elements, so that
non-Element children will not be yielded”

So basically that’s what you said: Attribute instances (non-Element)
cannot be retrieved directly through each_element.
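
A quick way to see that filter at work, as a rough sketch: the document string and the two helpers (matched_classes, yielded_count) are my own names for illustration, and this assumes a reasonably recent REXML.

```ruby
require "rexml/document"

# Hypothetical one-element document for the check.
NODE_TYPE_XML = '<doc><element attribute="value"/></doc>'

# Hypothetical helper: class names of the nodes XPath.match returns.
def matched_classes(doc, xpath)
  REXML::XPath.match(doc, xpath).map { |node| node.class.name }
end

# Hypothetical helper: number of nodes each_element actually yields.
def yielded_count(doc, xpath)
  count = 0
  doc.each_element(xpath) { count += 1 }
  count
end

doc = REXML::Document.new(NODE_TYPE_XML)

p matched_classes(doc, "doc/element/@attribute")  # Attribute nodes
p matched_classes(doc, "doc/element")             # Element nodes

puts yielded_count(doc, "doc/element/@attribute") # attributes are filtered out
puts yielded_count(doc, "doc/element")            # elements pass through
```

So the XPath itself matches fine; it is each_element that drops anything that is not an Element.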

I either have to look up the elements and then grab the attribute like
you did, or stick with XPath.each, which does the job pretty well.
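
For the record, the two options could look roughly like this. The helper names, the VALUES_XML constant and the sample values are invented for the example; each_element, attributes, XPath.each and Attribute#value are the REXML calls being relied on.

```ruby
require "rexml/document"

# Hypothetical sample document with two attribute values to collect.
VALUES_XML = '<doc><element attribute="first"/><element attribute="second"/></doc>'

# Option 1: select the elements with each_element, then read the attribute.
def values_via_each_element(doc, element_xpath, attribute_name)
  values = []
  doc.each_element(element_xpath) do |element|
    value = element.attributes[attribute_name]
    values << value unless value.nil?
  end
  values
end

# Option 2: let XPath.each yield the Attribute nodes directly.
def values_via_xpath(doc, attribute_xpath)
  values = []
  REXML::XPath.each(doc, attribute_xpath) { |attribute| values << attribute.value }
  values
end

doc = REXML::Document.new(VALUES_XML)

p values_via_each_element(doc, "doc/element", "attribute")
p values_via_xpath(doc, "doc/element/@attribute")
```

Both should give the same set of values; the second is shorter when only the attribute is of interest, the first is handier when the element is needed as well.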

Thanks!

Thibaut