Geography/Map Classes?


#1

Hello,

I’m still very new to Ruby, but I am growing to like the language more
and more as I use it.

Presently, I’m looking for some classes (I’m used to Java, sorry, not
sure what you’d call them in Ruby) that can measure geographic distances
based on information like zip, city, state, etc.

An example of the sort of functionality I’m looking for would be the
“find locations within X miles of zip code 01234” searches you can do
with Google Maps or MapQuest.


#2

Below are my zipcode notes. None of this originates from me, so I
really can’t help much if you should have questions. Caveat emptor.
But there’s probably a gem or two hidden somewhere in all the cruft.


You can get zip code data and a MySQL schema here:

http://www.sanisoft.com/ziploc/

There is some PHP code there for searching that can easily be ported to
Ruby:

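class ZipCode < ActiveRecord::Base

  # Returns the zip codes within roughly `radius` miles of this one.
  # 69.1 is miles per degree; 57.3 converts degrees to radians.
  def search(radius = 30)
    ZipCode.find_by_sql("SELECT * FROM zip_codes WHERE
      (POW((69.1 * (lon - #{self.lon}) * COS(#{self.lat} / 57.3)), 2) +
       POW((69.1 * (lat - #{self.lat})), 2)) < (#{radius} * #{radius})")
  end
end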
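
The arithmetic inside that query can be tried out in plain Ruby before touching a database. This is only a sketch of the same flat approximation, using the 69.1 and 57.3 constants from the SQL; the method name, constant names and the sample coordinates are made up for illustration.

```ruby
# Same constants as the SQL above.
FLAT_MILES_PER_DEGREE = 69.1   # rough miles per degree of latitude
FLAT_DEGREES_PER_RADIAN = 57.3 # rough degrees-to-radians divisor

# Approximate distance in miles between two points given in degrees.
# Treats the area as flat, so it is only reasonable for short distances.
def approx_miles(lat1, lon1, lat2, lon2)
  # A degree of longitude shrinks by cos(latitude) away from the equator.
  x = FLAT_MILES_PER_DEGREE * (lon2 - lon1) *
      Math.cos(lat1 / FLAT_DEGREES_PER_RADIAN)
  y = FLAT_MILES_PER_DEGREE * (lat2 - lat1)
  Math.sqrt(x**2 + y**2)
end

# Hypothetical coordinates, chosen only to exercise the method.
puts approx_miles(38.83, -77.21, 38.88, -77.17)
```

The SQL compares squared distances against radius * radius, which avoids the square root; the sketch takes the root so the result reads directly in miles.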

OR

This can be done through the use of free Geocoding websites and a little
math.

One such site is GeoCoder.us (web based, or an API for a fee, I believe),
where you can input addresses (not just zip codes, which is what
you asked for, but it’s close). “1 main street,
ZIP CODE” is usually a fair assumption.

http://geocoder.us/

Once you have both longitude/latitude coordinates for the addresses, you
can then calculate the distance between the two. Here are two links
which explain how:

http://jan.ucc.nau.edu/~cvm/latlongdist.html
http://www.mathforum.com/library/drmath/view/51711.html

As for Rails/Ruby solutions… I don’t know of any off hand, but there is
a Perl interface to Geocoder.us which may be easily ported or referenced
to help you on your way:

http://search.cpan.org/~sderle/Geo-Coder-US-1.00/US.pm

OR

http://maps.huge.info/

OR

http://www.zipcodedownload.com/

OR

http://www.zipinfo.com

OR

http://geocoder.us/
[http://www.hyperionreactor.net/node/87]


There is a 40,000+ zip code database in CivicSpace labs that has lat and
long by zip code:

http://civicspacelabs.org/home/developers/download

http://civicspacelabs.org/releases/zipcodes/zipcodes-csv-10-Aug-2004.zip

Distance between two points (lat, long) is calculated using the
Haversine formula:

dlon = lon2 - lon1
dlat = lat2 - lat1
a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
c = 2 * atan2(sqrt(a), sqrt(1-a))
d = R * c

See the following link or google for more:

http://mathforum.org/library/drmath/view/51879.html
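
For what it's worth, here is how that formula might look in Ruby. Treat it as a sketch: the method names are made up, the inputs are assumed to be in degrees (so they get converted to radians first), and R is assumed to be a mean Earth radius of about 3958.8 miles.

```ruby
# Assumed value for R: mean Earth radius in miles.
EARTH_RADIUS_MI = 3958.8

# Degrees to radians; Ruby's Math functions expect radians.
def to_rad(degrees)
  degrees * Math::PI / 180.0
end

# Great-circle distance in miles between two lat/long points in degrees.
def haversine_miles(lat1, lon1, lat2, lon2)
  dlat = to_rad(lat2 - lat1)
  dlon = to_rad(lon2 - lon1)
  a = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad(lat1)) * Math.cos(to_rad(lat2)) * Math.sin(dlon / 2)**2
  # Guard against rounding pushing a just outside 0..1 before the sqrt.
  a = a.clamp(0.0, 1.0)
  c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
  EARTH_RADIUS_MI * c
end

# Hypothetical coordinates, chosen only to exercise the method.
puts haversine_miles(38.83, -77.21, 38.88, -77.17)
```

Swapping in a radius in kilometers (about 6371) gives the distance in kilometers instead.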

Excellent! The only thing remaining is an efficient algorithm for a
search for all zipcodes within a given radius.

Using Ruby and SQLite3:

pabs@halcyon:~/proj/zip> ./import.rb zipcode.{csv,db}
pabs@halcyon:~/proj/zip> ./find.rb zipcode.db 22003 3
"city","state","zip","distance (mi)"
"Annandale","VA","22003","0.0"
"Springfield","VA","22161","1.62363604423677"
"Springfield","VA","22151","1.87190097838136"
"Falls Church","VA","22042","2.97362028549975"

Here’s the code for each piece (also available at the URL
http://pablotron.org/files/zipfind.tar.gz):

---- import.rb ----
#!/usr/bin/env ruby

# load libraries
require 'rubygems' rescue nil
require 'sqlite3'

# constants
SCAN_RE =
  /"(\d{5})","([^"]+)","(..)","([\d.-]+)","([\d.-]+)","([\d-]+)","(\d)"/
SQL = "INSERT INTO zips(zip, city, state, lat, long, timezone, dst)
       VALUES (?, ?, ?, ?, ?, ?, ?)"
TABLE_SCHEMA = "CREATE TABLE zips (
  id        INTEGER     NOT NULL PRIMARY KEY,

  zip       VARCHAR(5)  NOT NULL,
  city      TEXT        NOT NULL,
  state     VARCHAR(2)  NOT NULL,
  lat       FLOAT       NOT NULL,
  long      FLOAT       NOT NULL,
  timezone  INTEGER     NOT NULL,
  dst       BOOLEAN     NOT NULL
);"

# handle command-line arguments
unless ARGV.size == 2
  $stderr.puts "Usage: #$0 <csv_path> <db_path>"
  exit(-1)
end
csv_path, db_path = ARGV

# load database, create zip table and prepared statement
db = SQLite3::Database.new(db_path)
db.execute(TABLE_SCHEMA)
st = db.prepare(SQL)

# parse CSV and add each line to the database
db.transaction {
  File.read(csv_path).scan(SCAN_RE).each { |row| st.execute(*row) }
}

---- find.rb ----
#!/usr/bin/env ruby

require 'rubygems'
require 'sqlite3'

MI_R = 1.15 # not used below

# grab base zip code
unless ARGV.size > 1
  $stderr.puts "Usage: #$0 <db_path> <zip> [radius]"
  exit(-1)
end
db_path, src_zip, radius = ARGV
radius = (radius || 50).to_i

# open database
db = SQLite3::Database.new(db_path)

# get lat/long for specified zip code
# (fall back to an empty row so an unknown zip reaches the check below)
sql = "SELECT lat, long FROM zips WHERE zip = ?"
src_lat, src_long = (db.get_first_row(sql, src_zip) || []).map { |v| v.to_f }

unless src_lat && src_long
  $stderr.puts "Unknown zip code '#{src_zip}'"
  exit(-1)
end

# calculate min/max lat/long (roughly 69 miles per degree)
ret, range = [], radius / 69.0

# get all codes within given rectangle
sql = "SELECT lat, long, city, state, zip
       FROM zips
       WHERE lat > ? AND lat < ?
         AND long > ? AND long < ?"
args = [src_lat - range, src_lat + range,
        src_long - range, src_long + range]

db.prepare(sql).execute(*args).each do |row|
  # get row values (same order as the SELECT), convert lat/long to floats
  dst_lat, dst_long, dst_city, dst_state, dst_zip = row
  dst_lat, dst_long = dst_lat.to_f, dst_long.to_f

  # calculate distance between zip codes.  if dst_zip is within the
  # specified radius, then add it to the list of results
  d = Math.sqrt((dst_lat - src_lat) ** 2 + (dst_long - src_long) ** 2)
  ret << [dst_city, dst_state, dst_zip, d * 69.0] if d <= range
end

# sort results by distance
ret = ret.sort { |a, b| a[-1] <=> b[-1] }

# print out results as a CSV
puts '"city","state","zip","distance (mi)"',
     ret.map { |row| '"' + row.join('","') + '"' }

I suppose one technique might be to first narrow the database search
to a given square latitude/longitude range and then filter those
results by testing that they are within the given circle radius.

That’s all the code above does. There’s some room for optimization
there; for example, you could create a region field, then calculate the
list of regions that intersect with the search radius. If you index on
the region field, then the query becomes essentially an index lookup
instead of a lat/long comparison (you still have to do the second
distance calculation, of course).
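
To make the region idea a bit more concrete, here is one way it could be sketched in Ruby. None of this comes from the scripts above: the one-degree grid, the "lat:long" key format and the method names are all assumptions made for illustration, and the 69 miles per degree figure is the same rough conversion find.rb uses.

```ruby
# Hypothetical region scheme: one region per 1 x 1 degree grid cell.
CELL_DEGREES = 1.0
GRID_MILES_PER_DEGREE = 69.0 # same rough conversion as find.rb

# Region key for a point: the integer cell indices joined into a string.
# This is the value you would store in the indexed region field.
def region_for(lat, long)
  "#{(lat / CELL_DEGREES).floor}:#{(long / CELL_DEGREES).floor}"
end

# Keys of every cell that overlaps the square around the search circle.
# The square is a superset of the circle, so the distance check on each
# candidate row is still needed afterwards.
def regions_near(lat, long, radius_miles)
  range = radius_miles / GRID_MILES_PER_DEGREE
  lat_cells = ((lat - range) / CELL_DEGREES).floor..
              ((lat + range) / CELL_DEGREES).floor
  long_cells = ((long - range) / CELL_DEGREES).floor..
               ((long + range) / CELL_DEGREES).floor
  lat_cells.flat_map { |la| long_cells.map { |lo| "#{la}:#{lo}" } }
end

# Hypothetical coordinates, chosen only to exercise the methods.
puts region_for(38.83, -77.21)
puts regions_near(38.99, -77.5, 3).inspect
```

The list from regions_near would then feed something like a WHERE region IN (...) clause. Smaller cells mean fewer candidate rows per lookup but more keys per query, so the cell size is a tuning choice.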