1st program -- how would you improve this?

Hi,

I just wrote my first Ruby program and I would be very interested in
some feedback as to style and how to write things more in the
Ruby-way.

I know I could use more methods and/or possibly classes, and that
will happen, but for now I would like to hear what betrays my
programming style coming from other languages.

Thanks,
Esmail

Briefly, this tiny program grabs a HTML for a random date within the
last 10 years from the Astro Picture of the Day (APOD) site
(Astronomy Picture of the Day Archive 2015).

It then parses the HTML file to extract the name of the larger .jpg or
.gif file (there is usually a smaller version too), generates the
correct URL for the image, and then fires up eog (a Linux program to
display images) to fetch and display the image.

Works … but it’s not pretty, at least in terms of lack of methods
etc. Other ways to improve this?


#!/usr/bin/ruby
require ‘net/http’

method found on-line, slightly modified

def random_date(years_back=5)
year = Time.now.year - rand(years_back) - 1
month = rand(12) + 1
day = rand(31) + 1
year = year.to_s.gsub(/20|19/,‘’)
sprintf(“%02s%02d%02d”, year,month,day)
end

r_date=random_date(10) # go back 10 years

puts r_date

get the html file from the apod site

text=Net::HTTP.start( ‘antwrp.gsfc.nasa.gov’, 80 ) do |http|
http.get( “/apod/ap#{r_date}.html” ).body
end

get the HTML and process contents line by line

found_title = false
found_image = false

text.each { |line|

find the title of the current pic

if !found_title && line =~ //i
title = $’

puts “----------------------------------------------”
printf(“%s\n”, title.strip)
puts “----------------------------------------------\n\n”

found_title = true
end

now start searching for the image name

if !found_image && line =~ /.(jpg|gif) *">/i

if line.include? “jpg” # is there a way to case do insensitve
comparison here?
image = $+".jpg" else image = $+“.gif”
end

url=“Astronomy Picture of the Day
image=image.gsub(/<a +href="/i,‘’)
fetch_url=url+image

printf(“%s\n”, fetch_url)

found_image = true
system(“eog " + fetch_url + " >& /dev/null &”)
end

}

On 10/5/06, Esmail B. [email protected] wrote:

Thanks,
Esmail

Works … but it’s not pretty, at least in terms of lack of methods
etc. Other ways to improve this?

Hi Esmail,

Here’s my shot at rewriting your program.
Note that this is simply how I would write it, not everything changed
is necessarily an ‘improvement’.

require ‘open-uri’
def random_date(years_back=5)
Time.at(Time.now - rand(years_back * 365 * 24 *
3600)).strftime(‘%y%m%d’)
end

r_date=random_date(10)
puts r_date

open(“http://antwrp.gsfc.nasa.gov/apod/ap#{r_date}.html”) {|f|
text = f.read
if text =~ /([^<]*)/i
puts “----------------------------------------------”
puts $1.strip
puts “----------------------------------------------\n\n”
end

if text =~ /href="([^"]+(jpg|gif))">/i
image = $1
fetch_url = “Astronomy Picture of the Day” + image
puts fetch_url
# system(…)
end
}

Comments:

  • your random_date generated invalid dates sometimes, 01/30/02 and
    such. Using timestamps for everything is a bit more stable (Although
    slightly off if lots of leap years are in the range).
  • puts does to_s automatically, so printf(“%s\n”,…) is never needed
  • reading the text per line was meant as an optimization? This is
    probably not necessary.
  • $’ / $` and such are rarely used in Ruby, try to avoid them when the
    more readable $1 or String#scan will do.
  • more methods / classes are not really needed for such a small program.

I’d probably have used OpenURI and then

text = open('http://www.example.com/').read

which I find a little more readable.

Oh, and probably also a HTML parser like Hpricot to make getting the
title tag more foolproof.

I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that’s a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

David V.

Esmail B. wrote:

Thanks,
correct URL for the image, and then fires up eog (a Linux program to
display images) to fetch and display the image.

Works … but it’s not pretty, at least in terms of lack of methods
etc. Other ways to improve this?

<>

You could use Hpricot to do this much more easily and reliably. I’ve
whipped up an example of using it for this. Note also that your random
dat function could produce a date in the future up to the end of the
current year. Error checking when using any of the network libraries is
also a must.

For a first program that’s pretty good though! You obviously have the
hang of using Ruby, you just need to pick up on a few of the libraries
out there and how best to use them.

Here’s how I would do it:

Remove the rubygems require if you manually installed hpricot

require ‘rubygems’
require ‘hpricot’
require ‘net/http’

class Time
def self.random(years_back = 10)
# Set start and end times
end_time = Time.now
start_time = Time.mktime(end_time.year - 10, end_time.month,
end_time.day, 0, 0,0,0)

Time.at(start_time + rand(end_time - start_time))

end
end

Select a date

r_date = Time.random.strftime(“%y%m%d”)
puts r_date

Retrieve the page

response = Net::HTTP.get_response(“antwrp.gsfc.nasa.gov”,
“/apod/ap#{r_date}.html”)
unless (200…299) === response.code.to_i
puts “There was an error retrieving the document”
puts “Response code: #{response.code}”
exit
end

Parse the document

doc = Hpricot(response.body)

Extract title

title = (doc / “title”).first.inner_html
puts “Title: #{title}”

Extact all images

images = (doc/“img”)
if images.length == 0
puts “Could not extract the image from the document.”
exit
end

Assume the first image is the one we want.

image = images.first[“src”]

puts “Running eog (whatever that is) to load image #{image}.”

Execute the program. Note that it depends on the image path being

relative.
system(“eog Astronomy Picture of the Day”)

On Oct 5, 2006, at 5:49 PM, David V. wrote:

I also strongly disapprove of the Perl-ish use of global variables
that
get set as a side-effect of a regex match. Notably because that’s a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Just because they start with a $ doesn’t mean they are global. I
believe
$1, $2 and so on are actually per thread local variables.

Gary W.

David V. wrote:

I’d probably have used OpenURI and then

text = open(‘http://www.example.com/’).read
which I find a little more readable.

ah … cool … I got what I had before from some sample
code I found either in a book or on line. Lots of ways
to do things, hence my query, and goal to find out :slight_smile:

Oh, and probably also a HTML parser like Hpricot to make getting the
title tag more foolproof.

Some one else had mentioned Hpricot … but that’s an add-on, right?
Ie not part of the standard Ruby distribution? My goal is to write
code that will run pretty much everywhere w/o special requirements,
but I will look into Hpricot since it seems very useful.

I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that’s a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Oh, I’m all for explicitly coding too … (ie don’t rely on default
initial values for variables, always explicitly initialize them myself

  • good habit in case you work with a language that doesn’t do that for
    you, like C. or use parens etc.)

So, by globals, are you referring to the $` and $’ ?

David V. wrote:

I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that’s a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Hi David,

Ruby embraces the side-effect. This doesn’t mean unpredictability as
many functional programmers assume; it means that the conditions which
always cause the same side-effect should be incorporated into the
control flow, leading to the same output from the same input, always.

s = ‘tree’
if s =~ /(.ree)/
p $1
end

=> “tree”

if s =~ /(free)/
p $1
end
p $1 # => nil

$1 will always be non-nil after a grouped match, and nil after a
non-match or an ungrouped match. So the side-effect is fully
predictable.

Regarding global variables; I don’t see anything wrong with them in
concept
; granted you don’t always need them, and you can explicitly
pass around a local variable (or use an object attribute like an
instance variable) to get the same effect in most cases, but that is
the same thing conceptually as using the global scope to implicity
pass the value. Also, granted that you should not clutter up the global
namespace with a bunch of junk, especially not vast quantities of
large, short-lived data; but the result of a regexp match which is
released after the next match attempt doesn’t seem to be in that
category.

But those issues aside, I also favor being explicit for the sake of
clarity.

m = /(.ree)/.match(‘tree’)
if m and m[1]
p m[1]
end

But even then, $1 still gets set. :wink:

Regards,
Jordan

[email protected] wrote:

Just because they start with a $ doesn’t mean they are global. I believe
$1, $2 and so on are actually per thread local variables.

There’s still the readability problem with that. You’re communicating
information in the program via an implicitly assumed point instead of an
explicitly declared one.

David V.

Sander L. wrote:

Here’s my shot at rewriting your program.
Note that this is simply how I would write it, not everything changed
is necessarily an ‘improvement’.

Hi

I do appreciate the feedback!

require ‘open-uri’
def random_date(years_back=5)
Time.at(Time.now - rand(years_back * 365 * 24 * 3600)).strftime(‘%y%m%d’)
end

ah yes, I thought about that after I posted this. Didn’t know about
strftime, hence my C sprintf standby :slight_smile: I may slightly adjust this
to start with a date in 1995 and work forward to the present.

if text =~ /href="([^"]+(jpg|gif))">/i
image = $1
fetch_url = “Astronomy Picture of the Day” + image
puts fetch_url

system(…)

end
}

nice …

Comments:

  • your random_date generated invalid dates sometimes, 01/30/02 and
    such. Using timestamps for everything is a bit more stable (Although
    slightly off if lots of leap years are in the range).
  • puts does to_s automatically, so printf(“%s\n”,…) is never needed
  • reading the text per line was meant as an optimization? This is
    probably not necessary.

no, not really, just thought the string matching would work
better with line-by-line.

  • $’ / $` and such are rarely used in Ruby, try to avoid them when the
    more readable $1 or String#scan will do.
  • more methods / classes are not really needed for such a small program.

Thanks for taking time to make these suggestions/comments, they are
very useful and just exactly what I was looking for.

Esmail

Timothy G. wrote:

You could use Hpricot to do this much more easily and reliably. I’ve
whipped up an example of using it for this. Note also that your random
dat function could produce a date in the future up to the end of the
current year. Error checking when using any of the network libraries is
also a must.

Agreed … more error checking needs to be done … as you mention
below, I need to browse the libraries and find out what is available
and use it :slight_smile:

For a first program that’s pretty good though! You obviously have the
hang of using Ruby, you just need to pick up on a few of the libraries
out there and how best to use them.

I’ve been programming for quite a while, but each language has
its own idiomatic ways for doing things … which is why this
feedback is so valuable.

end_time = Time.now

doc = Hpricot(response.body)
end

Assume the first image is the one we want.

image = images.first[“src”]

puts “Running eog (whatever that is) to load image #{image}.”

Execute the program. Note that it depends on the image path being

relative.
system(“eog Astronomy Picture of the Day”)

Thank you!

Esmail