Current Temperature (#68)


Obviously, this quiz is about fetching data from the Internet. People
went about this in two different ways. The most popular solution was to
“scrape” the answer from popular Internet weather sites. Here is one
such solution by David T.:

require 'uri'
require 'open-uri'
require 'rexml/document'

class Weather
  attr_reader :location, :temperature, :unit

  def initialize(zip_or_city, unit='f')
    raise "Error: Unit must be 'C' or 'F'." unless unit =~ /^[cf]$/i
    id = get_id(zip_or_city)
    url = "http://xml.weather.yahoo.com/forecastrss?p=#{id}&u=#{unit.downcase}"
    xml = open(url) { |f| f.read }
    doc = REXML::Document.new(xml)
    @temperature = doc.elements["//yweather:condition"].attributes["temp"]
    @unit = unit.upcase
  end

  private

  def get_id(location)
    location = URI.escape(location)
    url = "http://xml.weather.yahoo.com/search/weather2?p=#{location}"
    xml = open(url) { |f| f.read }
    doc = REXML::Document.new(xml)
    locations = doc.elements.to_a("/search/loc")
    raise "Cannot find the location." if locations.size <= 0
    @location = locations[0].text.sub(/\s*\(\d+\)\s*$/, '')
    locations[0].attributes["id"]
  end
end

if __FILE__ == $0
  if ARGV.size <= 0 || (ARGV[1] && ARGV[1] !~ /^[cf]$/i)
    puts "Usage:  #$0  city_or_zip_code  [c|f]"
  else
    begin
      w = Weather.new(ARGV[0], ARGV[1] || 'f')
      puts "The temperature in #{w.location} is " +
           "#{w.temperature} degrees #{w.unit}."
    rescue
      puts "Information for #{ARGV[0]} was not found or unavailable."
    end
  end
end

Start at the bottom if statement. You can see in here that arguments
are checked and a usage statement is printed, if needed. Next we can
see that a Weather object is constructed and the temperature
information is pulled from its methods. If anything goes wrong in this
process, the rescue clause prints a suitable error message.

Now we need to check out the Weather object.

In initialize(), the zip_or_city is somehow turned into an “id”. Using
that id, a url is constructed for an RSS feed of weather information
for the location. Next the terrific open-uri library is used to slurp
the feed, just as we would a normal file. The xml is then dropped into
Ruby’s standard REXML library and an XPath statement is used to extract
the temperature from the feed.
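That REXML-plus-XPath step can be sketched in isolation. The fragment
below is made up for illustration (the element and attribute names are
not the real feed's schema), but the parsing pattern is the same:

```ruby
require 'rexml/document'

# A tiny stand-in for the weather feed; the element and attribute
# names here are invented, not the real feed's schema.
xml = <<XML
<rss>
  <channel>
    <item>
      <condition temp="72" text="Sunny"/>
    </item>
  </channel>
</rss>
XML

doc = REXML::Document.new(xml)
# The XPath "//condition" finds the first matching element
# anywhere in the document, however deeply it is nested.
temperature = doc.elements["//condition"].attributes["temp"]
puts temperature  # → 72
```

Once the feed is slurped into a String, the rest is ordinary tree
navigation, which is what makes the solution so short.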

The other method is the magic “id” conversion I mentioned earlier. It
works much like the method we just examined. Again a url is
constructed, the page slurped, the xml handed to REXML, and XPath used
to find a matching location that was listed on the page. This time a
Regexp is needed to clean up the location name.
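The cleanup itself is a one-line substitution. Assuming the site lists
each location as a name followed by a numeric id in parentheses (the
sample string below is invented), sub() strips that trailing id:

```ruby
# Strip a trailing parenthesized numeric id from a location name.
# "Portland, OR (97201)" is a made-up example of the format.
raw   = "Portland, OR (97201)"
clean = raw.sub(/\s*\(\d+\)\s*$/, '')
puts clean  # → Portland, OR
```

Anchoring the pattern with $ keeps the substitution from touching
parentheses that might legitimately appear inside a name.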

Unfortunately, scraping has some down sides. First, this use of REXML
and Regexps to grab the information is pretty fragile. As soon as these
pages change in some way the author didn’t expect, the application is
broken and will need maintenance.

The other big problem with scraping is legal issues. Many sites do not
allow access to their content like this (because it bypasses their
ads). Google is a famous example of a company that does not allow
scraping content. Be sure you check the usage policies of a site before
you construct or use software like this.
If you want to get around these issues, you can use a web service. Web
services represent a predefined communication protocol. You pass in the
information the service expects, and it will return a promised
response. Here’s a temperature solution using a web service by
Rudolfs O.:

require 'soap/wsdlDriver'
require 'rexml/document'

URL = 'http://www.webservicex.net/globalweather.asmx?WSDL'

# process the command-line arguments
if ARGV[0] == nil
  abort("Usage: weather.rb city")
else
  city = ARGV.join(' ')
end

soap = SOAP::WSDLDriverFactory.new(URL).create_rpc_driver
begin
  weather = soap.GetWeather(:CityName => city, :CountryName => "")

  # strip the first line with <? ?> stuff, else REXML won't parse
  xml = weather.getWeatherResult.gsub(/(<\?.*?>\n)/, '')
  data = REXML::Document.new(xml)

  # celsius degrees are in parentheses
  data.elements["//Temperature"].text[/\((.*)\)/]; temp = $1
  data.elements["//Location"].text[/^(.*),/]; loc = $1

  # show the gathered data
  puts "The temperature in " + loc + " is " + temp
rescue
  puts "Could not find data for your supplied city: " + city
end

Here again, we see that a URL is built, but this time the URL points to
a document describing the web service (a Web Service Description
Language, or WSDL, document). Another standard library, SOAP, is used
to read and parse that document. In doing so, it will build a custom
object that has the methods provided by the service, accepting the
arguments they expect.

The rest of the solution will look pretty familiar, since this service
returns an xml answer. Again REXML is used, with XPath, to find the
data we are interested in and again Regexps are used to clean it up for
display. In this case we didn’t get too far away from the fragile
scraping technique, though web services typically see less change than
web pages.
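Those two Regexp steps can be tried in isolation. The payload below is
invented, but it mirrors the shape the solution handles: an XML
declaration that gets stripped before parsing, a celsius reading in
parentheses, and a comma-delimited location name:

```ruby
require 'rexml/document'

# An invented response in the spirit of the service's answer.
raw = <<XML
<?xml version="1.0" encoding="utf-16"?>
<CurrentWeather>
  <Location>Riga, Latvia</Location>
  <Temperature>50 F (10 C)</Temperature>
</CurrentWeather>
XML

# Drop the <? ?> declaration line before parsing, as the solution does.
xml  = raw.gsub(/(<\?.*?>\n)/, '')
data = REXML::Document.new(xml)

# The celsius reading sits in parentheses; the comma ends the city name.
data.elements["//Temperature"].text[/\((.*)\)/]; temp = $1
data.elements["//Location"].text[/^(.*),/];      loc  = $1

puts "The temperature in " + loc + " is " + temp
# → The temperature in Riga is 10 C
```

Matching against a String leaves the capture in $1, which is what the
semicolon-separated assignments in the solution rely on.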

Note that the above solution doesn’t work with zip codes, because the
service expects a city name. Supporting a zip code might be possible
using another service to look up the city name, if one could be found.

My thanks to all the submitters for another great quiz. Pulling data
from the web is pretty common and these guys did a great job of showing
off how painless it can be.

Tomorrow’s quiz is an easy, but unique, problem. If you have yet to
complete a
quiz solution, this one is for you…


Rudolfs O. solution is very sound and using a web-service is a good
decision for the reasons James mentioned. Bravo.


“Uncle D” removed_email_address@domain.invalid writes:

> Rudolfs O. solution is very sound and using a web-service is a good
> decision for the reasons James mentioned. Bravo.

Yet the web-service used is total crap. You should not be forced
to hack out XML declarations in real life.