Please consider the following code:
require 'net/http'
Net::HTTP.start('weather.gmdss.org') do |http|
  response = http.get('/III.html')
  response.body.scan(/<a.*a>/) { |link| puts "#{link}\n\n" }
end
I expected each HTML link to be printed separately, but this is true only
for the first three. The others are grouped together in two groups.
This is what I get.
![WMO]()
![MF]()
![JCOMM]()
HOME PAGE
METAREA I
METAREA
II
METAREA III
[cut]
[cut]
Any help will be really appreciated.
Bruno
response.body.scan(/<a.*?a>/)
(Note the question mark.)
Read up on greedy versus non-greedy matching in regular expressions.
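To make the difference concrete, here is a minimal sketch run against a made-up one-line fragment (the `html` string is hypothetical, not the real page body). It only illustrates how `.*` and `.*?` behave when two links share a line:

```ruby
# Hypothetical one-line HTML fragment, standing in for the real page body.
html = '<a href="/one">One</a> <a href="/two">Two</a>'

# Greedy: .* grabs as much as it can, so the match runs from the first
# "<a" to the last "a>" on the line.
greedy = html.scan(/<a.*a>/)

# Non-greedy: .*? stops at the first "a>" it can reach.
lazy = html.scan(/<a.*?a>/)

puts greedy.length
puts lazy.length
```

The greedy pattern returns both links as a single match, while the non-greedy one returns them separately.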
Thanks!
Needless to say, I’m a newbie. I’m coding with the Pickaxe manual at my
side and yes … I missed the point: sorry.
But why did it work for the first three occurrences?
Bruno
On Dec 2, 2005, at 0:10, Bruno Bazzani wrote:
But why did it work for the first three occurrences ?
Because they’re on their own line to begin with, and in a regular
expression the dot (by default) does not match a newline, so even a
greedy match has to stop at the end of the line.
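A small sketch of that behaviour, assuming a hypothetical two-line fragment in which the first link sits alone on its line and the next two share one:

```ruby
# Hypothetical fragment: one link alone on a line, then two sharing a line.
html = [
  '<a href="/a">A</a>',
  '<a href="/b">B</a> <a href="/c">C</a>'
].join("\n")

# Without the /m option, "." does not match "\n", so even the greedy .*
# cannot run past the end of a line.
matches = html.scan(/<a.*a>/)
matches.each { |link| puts "#{link}\n\n" }
```

The link on its own line comes out separately; the two that share a line are returned as one match.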
You will probably also want to include the multiline option (/m) in your
expression; otherwise the match will fail in situations like this as well:
un-necessary whitespace
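As a rough illustration of the multiline option, the sketch below uses a hypothetical link whose text is broken across two lines (the `html` string is invented for the example). In Ruby, the `/m` flag lets the dot match a newline as well:

```ruby
# Hypothetical link whose text is broken across two lines.
html = "<a href=\"/x\">un-necessary\n   whitespace</a>"

# Without /m the dot cannot cross the newline, so nothing matches.
single_line = html.scan(/<a.*?a>/)

# With /m the dot also matches "\n", so the whole link is found.
multi_line = html.scan(/<a.*?a>/m)

puts single_line.length
puts multi_line.length
```

Keeping the `?` matters here: with `/m` a greedy `.*` could run across every line of the page in a single match.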
matthew smillie.