Ruby / rexml / xpath bug?

I’m doing something I would think is very straightforward, but I’m
getting the strangest result. I’m wondering if anyone could help me
figure out whether this is a bug or whether I’m doing something wrong
that I can’t see…

I copied a Google Maps script from Google Maps with Rails and Ajax
straight from the code. It requests geocoding from Google and parses
the XML using REXML. When it gets to the address with 2 results, the
“each” iterator loops through both, but on the second element it
gives me info from the first. Below is the script exactly.

require 'open-uri'
require 'rexml/document'

# Retrieve geocode information for all records in the Stores table

task :google_geocode => :environment do
  api_key = GOOGLE_MAPS_KEY

  (Store.find :all).each do |store|

    puts "\nStore: #{store.name}"
    puts "Source Address: #{store.full_address}"

    xml = open("http://maps.google.com/maps/geo?q=#{CGI.escape(store.full_address)}&output=xml&key=#{api_key}").read
    doc = REXML::Document.new(xml)

    puts "Status: " + doc.elements['//kml/Response/Status/code'].text

    if doc.elements['//kml/Response/Status/code'].text != '200'
      puts "Unable to parse Google response for #{store.name}"
    else
      doc.root.each_element('//Response') do |response|
        response.each_element('//Placemark') do |place|
          # there are 2 "Placemarks" in one particular "Response"
          lng, lat = place.elements['Point/coordinates'].text.split(',')
          puts "  Result Address: " << place.elements['//address'].text
          # the address, lat and lng come from the 1st element twice
          puts "  ID: " << place.attributes['id']
          # ... but the 'id' is correct.
          puts "  Latitude: #{lat}"
          puts "  Longitude: #{lng}"
        end # end each place
      end # end each response
    end # end if result == 200
  end # end each store
end # end rake task

What I don’t understand is that the XML for each “place” is correct,
but place.elements['//address'] keeps finding the previous
iteration’s information. How can this be?

Thank you.

On 15.09.2008 21:38, altinsel wrote:

require 'open-uri'

      response.each_element('//Placemark') do |place|
        #there are 2 "Placemarks" in one particular "Response"
        lng,lat=place.elements['Point/coordinates'].text.split(',')

This seems weird: you search for potential multiple elements but use
method “text” on the result. Not sure what this is going to yield but
I’d rather iterate.

        puts "  Result Address: " << place.elements['//address'].text

see above

What I don’t understand is that the XML for each “place” is correct,
but place.elements['//address'] keeps finding the previous
iteration’s information. How can this be?

Hm, weird. Are you sure the XML is as expected?

Btw, you can combine searching for //Response and //Placemark in one
since you do not do anything with “Response”.

Kind regards

robert
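
A minimal sketch of the "combine the two searches" suggestion, assuming a hand-written, trimmed two-Placemark document rather than the real Google response (the sample XML and the variable names are made up for illustration, and the real response's namespaces are left out). It also shows that elements['...'] hands back only the first matching element, which is why calling .text on it does not raise:

```ruby
require 'rexml/document'

# Hypothetical, trimmed stand-in for a geocoder response with two results.
sample = <<~KML
  <kml>
    <Response>
      <name>20 City Blvd., Orange, CA, 92868</name>
      <Placemark id="p1">
        <address>20 City Blvd W, Orange, CA 92868, USA</address>
        <Point><coordinates>-117.892764,33.782893,0</coordinates></Point>
      </Placemark>
      <Placemark id="p2">
        <address>20 City Blvd E, Orange, CA 92868, USA</address>
        <Point><coordinates>-117.891102,33.784368,0</coordinates></Point>
      </Placemark>
    </Response>
  </kml>
KML

doc = REXML::Document.new(sample)

# One XPath instead of nested each_element calls for Response and Placemark.
results = []
doc.elements.each('//Response/Placemark') do |place|
  # elements['...'] returns the first matching element (or nil), so .text is
  # called on a single element here, not on a collection.
  lng, lat = place.elements['Point/coordinates'].text.split(',')
  results << [place.attributes['id'], lat, lng]
end

results.each { |id, lat, lng| puts "#{id}: #{lat}, #{lng}" }
```

The paths inside the block are relative ('Point/coordinates'), so each lookup stays within the current Placemark.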

The funny result is the Ron Jon store in Orange, CA (second from
last)…

Thank you for your reply, Robert. Here is the code with output and
puts of the XML (the Stores are coming from a local database). Does
this help?

##CODE##

require 'open-uri'
require 'rexml/document'

# Retrieve geocode information for all records in the Stores table

task :google_geocode => :environment do
  api_key = GOOGLE_MAPS_KEY

  (Store.find :all).each do |store|

    puts "\nStore: #{store.name}"
    puts "Source Address: #{store.full_address}"

    xml = open("http://maps.google.com/maps/geo?q=#{CGI.escape(store.full_address)}&output=xml&key=#{api_key}").read
    doc = REXML::Document.new(xml)

    puts "Status: " + doc.elements['//kml/Response/Status/code'].text

    if doc.elements['//kml/Response/Status/code'].text != '200'
      puts "Unable to parse Google response for #{store.name}"
    else
      puts doc
      doc.root.each_element('//Response') do |response|
        puts "****"
        puts response
        puts "****"
        response.each_element('//Placemark') do |place|
          # puts "****"
          # puts place.elements['//Point'];
          # puts "****"
          lng, lat = place.elements['Point/coordinates'].text.split(',')
          puts "  Result Address: " << place.elements['//address'].text
          puts "  ID: " << place.attributes['id']
          puts "  Latitude: #{lat}"
          puts "  Longitude: #{lng}"
          puts "^^^^^^^^^^^^^^^^^^^^^^^^^"
        end # end each place
      end # end each response
    end # end if result == 200
  end # end each store
end # end rake task

##OUTPUT##

rake google_geocode
(in /Users/acar/Documents/Design/Code/magicmaponrails)

Store: The Original Ron Jon Surf Shop
Source Address: 901 Central Avenue, Long Beach Island, NJ, 08008
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
901 Central Ave, Beach Haven, NJ 08008, USA</address>
US</CountryNameCode>
NJ</AdministrativeAreaName>
Beach Haven</LocalityName>
901 Central Ave</ThoroughfareName>
08008</PostalCodeNumber>
</Country>
-74.107299,39.756819,0</coordinates>

901 Central Avenue, Long Beach Island, NJ, 08008</name>
200geocode</Status>
901 Central Ave, Beach Haven, NJ 08008, USA
US</CountryNameCode>
NJ</AdministrativeAreaName>
Beach Haven</LocalityName>
901 Central Ave</ThoroughfareName>
08008</PostalCodeNumber>
</Country>
-74.107299,39.756819,0</coordinates>


Result Address: 901 Central Ave, Beach Haven, NJ 08008, USA
ID: p1
Latitude: 39.756819
Longitude: -74.107299
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: One of aKind Ron Jon Surf Shop
Source Address: 4151 North Atlantic Avenue, Cocoa Beach, FL, 32931
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
4151 N Atlantic Ave, Cocoa Beach, FL 32931, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Cocoa Beach</LocalityName>
4151 N Atlantic Ave</ThoroughfareName>
32931</PostalCodeNumber>
</Country>
-80.608106,28.356747,0</coordinates>

4151 North Atlantic Avenue, Cocoa Beach, FL, 32931</name>
200geocode</Status>
4151 N Atlantic Ave, Cocoa Beach, FL 32931, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Cocoa Beach</LocalityName>
4151 N Atlantic Ave</ThoroughfareName>
32931</PostalCodeNumber>
</Country>
-80.608106,28.356747,0</coordinates>


Result Address: 4151 N Atlantic Ave, Cocoa Beach, FL 32931, USA
ID: p1
Latitude: 28.356747
Longitude: -80.608106
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: Ron Jon Surf Shop - Sunrise
Source Address: 2610 Sawgrass Mills Circle, Sunrise, FL, 33323
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
2610 Sawgrass Mills Cir, Fort Lauderdale, FL 33323, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Fort Lauderdale</LocalityName>
2610 Sawgrass Mills Cir</ThoroughfareName>
33323</PostalCodeNumber>
</Country>
-80.315710,26.154207,0</coordinates>

2610 Sawgrass Mills Circle, Sunrise, FL, 33323</name>
200geocode</Status>
2610 Sawgrass Mills Cir, Fort Lauderdale, FL 33323, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Fort Lauderdale</LocalityName>
2610 Sawgrass Mills Cir</ThoroughfareName>
33323</PostalCodeNumber>
</Country>
-80.315710,26.154207,0</coordinates>


Result Address: 2610 Sawgrass Mills Cir, Fort Lauderdale, FL 33323, USA
ID: p1
Latitude: 26.154207
Longitude: -80.315710
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: Ron Jon Surf Shop - Orlando
Source Address: 5160 International Drive, Orlando, FL, 32819
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
5160 International Dr, Orlando, FL 32819, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Orlando</LocalityName>
5160 International Dr</ThoroughfareName>
32819</PostalCodeNumber>
</Country>
-81.450308,28.469598,0</coordinates>

5160 International Drive, Orlando, FL, 32819</name>
200geocode</Status>
5160 International Dr, Orlando, FL 32819, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Orlando</LocalityName>
5160 International Dr</ThoroughfareName>
32819</PostalCodeNumber>
</Country>
-81.450308,28.469598,0</coordinates>


Result Address: 5160 International Dr, Orlando, FL 32819, USA
ID: p1
Latitude: 28.469598
Longitude: -81.450308
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: Ron Jon Surf Shop - Key West
Source Address: 503 Front Street`, Key West, FL, 33040
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
503 Front St, Key West, FL 33040, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Key West</LocalityName>
503 Front St</ThoroughfareName>
33040</PostalCodeNumber>
</Country>
-81.805852,24.560319,0</coordinates>

503 Front Street`, Key West, FL, 33040</name>
200geocode</Status>
503 Front St, Key West, FL 33040, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Key West</LocalityName>
503 Front St</ThoroughfareName>
33040</PostalCodeNumber>
</Country>
-81.805852,24.560319,0</coordinates>


Result Address: 503 Front St, Key West, FL 33040, USA
ID: p1
Latitude: 24.560319
Longitude: -81.805852
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: Ron Jon Surf Shop - California
Source Address: 20 City Blvd., Orange, CA, 92868
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
20 City Blvd W, Orange, CA 92868, USA
US</CountryNameCode>
CA</AdministrativeAreaName>
Orange</LocalityName>
20 City Blvd W</ThoroughfareName>
92868</PostalCodeNumber>
</Country>
-117.892764,33.782893,0</coordinates>
20 City Blvd E, Orange, CA 92868, USA
US</CountryNameCode>
CA</AdministrativeAreaName>
Orange</LocalityName>
20 City Blvd E</ThoroughfareName>
92868</PostalCodeNumber>
</Country>
-117.891102,33.784368,0</coordinates>

20 City Blvd., Orange, CA, 92868</name>
200geocode</Status>
20 City Blvd W, Orange, CA 92868, USA
US</CountryNameCode>
CA</AdministrativeAreaName>
Orange</LocalityName>
20 City Blvd W</ThoroughfareName>
92868</PostalCodeNumber>
</Country>
-117.892764,33.782893,0</coordinates>
20 City Blvd E, Orange, CA 92868, USA
US</CountryNameCode>
CA</AdministrativeAreaName>
Orange</LocalityName>
20 City Blvd E</ThoroughfareName>
92868</PostalCodeNumber>
</Country>
-117.891102,33.784368,0</coordinates>


Result Address: 20 City Blvd W, Orange, CA 92868, USA
ID: p1
Latitude: 33.782893
Longitude: -117.892764
^^^^^^^^^^^^^^^^^^^^^^^^^
Result Address: 20 City Blvd W, Orange, CA 92868, USA
ID: p2
Latitude: 33.784368
Longitude: -117.891102
^^^^^^^^^^^^^^^^^^^^^^^^^

Store: Ron Jon Cape Caribe Resort
Source Address: 1000 Shorewood Drive, Cape Canaveral, FL, 32920
Status: 200

<?xml version='1.0' encoding='UTF-8'?>
1000 Shorewood Dr, Cape Canaveral, FL 32920, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Cape Canaveral</LocalityName>
1000 Shorewood Dr</ThoroughfareName>
32920</PostalCodeNumber>
</Country>
-80.598021,28.405217,0</coordinates>

1000 Shorewood Drive, Cape Canaveral, FL, 32920</name>
200geocode</Status>
1000 Shorewood Dr, Cape Canaveral, FL 32920, USA
US</CountryNameCode>
FL</AdministrativeAreaName>
Cape Canaveral</LocalityName>
1000 Shorewood Dr</ThoroughfareName>
32920</PostalCodeNumber>
</Country>
-80.598021,28.405217,0</coordinates>


Result Address: 1000 Shorewood Dr, Cape Canaveral, FL 32920, USA
ID: p1
Latitude: 28.405217
Longitude: -80.598021
^^^^^^^^^^^^^^^^^^^^^^^^^

I’m surprised it works at all with the XPaths all being absolute like
that. An XPath expression starting with ‘//’ means start at the
document root. My guess is that you are getting both addresses each
time, and the call to text() just prints the first one.

Try ‘.//address’ or just ‘address’ instead.

– Mark.
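
A small sketch of the difference described above, assuming a hand-written two-Placemark document (trimmed, without the namespaces of the real Google response; the sample data and variable names are only for illustration). Inside the loop, '//address' is evaluated from the document root, while './/address' is evaluated from the current Placemark:

```ruby
require 'rexml/document'

# Hypothetical, trimmed stand-in for a geocoder response with two results.
sample = <<~KML
  <kml>
    <Response>
      <Placemark id="p1">
        <address>20 City Blvd W, Orange, CA 92868, USA</address>
      </Placemark>
      <Placemark id="p2">
        <address>20 City Blvd E, Orange, CA 92868, USA</address>
      </Placemark>
    </Response>
  </kml>
KML

doc = REXML::Document.new(sample)

absolute = []  # results of the '//address' lookup
relative = []  # results of the './/address' lookup

doc.root.each_element('//Placemark') do |place|
  # '//' restarts at the document root, so the first <address> in the whole
  # document is found no matter which Placemark we are in.
  absolute << place.elements['//address'].text
  # './/' searches only below the current Placemark.
  relative << place.elements['.//address'].text
end

puts "absolute: #{absolute.inspect}"
puts "relative: #{relative.inspect}"
```

Since address is a direct child of Placemark in this sample, place.elements['address'] would give the same result as the './/address' form.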

On 15.09.2008 23:48, altinsel wrote:

Mark,

Thank you. The ‘.//’ syntax seems to work. (I’m very happy about
that).

I’d still rather combine the two into a single iteration.

Now if you don’t mind I’d like to ask: as far as I understand it, the
“each_element” method should only be plucking a specific branch
(“place”) off the tree, and that branch is a new, stand-alone tree,
and shouldn’t even have the first element’s data, let alone the rest
of the tree. I even So how was that information even getting in
there?

http://www.w3schools.com/xpath/xpath_syntax.asp

Basically a node knows about the whole document (it knows its parent as
well). So you cannot only look down but up as well. This is not too
unusual (just think about filesystems).

Cheers

robert
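
To illustrate the point that a node is not a detached copy, here is a short sketch using a made-up, minimal document (the XML and variable names are assumptions for the example). An element handed to the block is still attached to the tree, so it can reach its parent, its document, and, through an absolute XPath, any other node:

```ruby
require 'rexml/document'

# Hypothetical minimal document with two sibling Placemarks.
doc = REXML::Document.new(
  '<kml><Response><Placemark id="p1"/><Placemark id="p2"/></Response></kml>'
)

second = doc.elements['//Placemark[@id="p2"]']

# The element still knows where it lives in the tree.
parent_name   = second.parent.name            # name of the enclosing element
same_document = second.document.equal?(doc)   # same Document object as above

# Going up to the parent and back down reaches the sibling ...
first_sibling = second.parent.elements['Placemark']

# ... and an absolute path from the second Placemark starts at the root again.
from_root = second.elements['//Placemark']

puts parent_name
puts same_document
puts first_sibling.attributes['id']
puts from_root.attributes['id']
```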

On Sep 15, 5:48 pm, altinsel wrote:

there?

In XPath, the prefix ‘//’ means start from the top of the tree, not
the ‘current’ node or “branch”.

And I agree with Robert, the code is far from ideal; there is far too
much iteration. The code could be much simpler and more efficient. See
my diatribe on this here:
http://markthomas.org/2008/08/22/eschew-xpathic-iteration/

– Mark.
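
One possible way to cut down on the iteration, sketched under assumptions: this is not taken from the linked post, and the sample document and variable names are invented for the example (namespaces of the real response are omitted). The idea is to let a single XPath expression select the nodes of interest and collect their text, instead of looping level by level:

```ruby
require 'rexml/document'

# Hypothetical, trimmed stand-in for a geocoder response with two results.
sample = <<~KML
  <kml>
    <Response>
      <Placemark id="p1">
        <address>20 City Blvd W, Orange, CA 92868, USA</address>
        <Point><coordinates>-117.892764,33.782893,0</coordinates></Point>
      </Placemark>
      <Placemark id="p2">
        <address>20 City Blvd E, Orange, CA 92868, USA</address>
        <Point><coordinates>-117.891102,33.784368,0</coordinates></Point>
      </Placemark>
    </Response>
  </kml>
KML

doc = REXML::Document.new(sample)

# Let XPath select the target nodes directly; no nested loops.
addresses = REXML::XPath.match(doc, '//Placemark/address').map(&:text)

points = REXML::XPath.match(doc, '//Placemark/Point/coordinates').map do |node|
  lng, lat = node.text.split(',')   # the value is "longitude,latitude,altitude"
  { lat: lat.to_f, lng: lng.to_f }
end

addresses.zip(points).each do |address, point|
  puts "#{address}: #{point[:lat]}, #{point[:lng]}"
end
```

Pairing the two arrays with zip relies on every Placemark having exactly one address and one coordinates element, which holds for this sample but is an assumption about real responses.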
