Copying $1.. A loop with two regular expressions


#1

Hi, sorry if this is a stupid question… I’ve only been programming Ruby
for about six hours.

I’m trying to write a loop to parse through a webpage and get all the
links to other pages. This loop depends on a regular expression to find
all the <a href tags… but inside the loop there is another regular
expression which looks to see if the link is relative or static. The
problem is that the inner regular expression changes the $1 variable, so
the loop just fails on the first iteration. I’ve tried making a copy of
the $1 variable but the result just ends up containing nil.

Any help you could offer would be greatly appreciated.

Here’s my code so far:

	loop do
	 	url = $1
		puts $1    #A url
		puts $url  #Always nil ?

		if $1 =~ /^http/  # Inner regular expression
			new_url = host + path
		else
			new_url = path
		end

		newPage = WebPage.new(new_url, link_depth + 1)

	break unless url =~ @@ahref_filter
	end

#2

Oky wrote:

Hi, sorry if this is a stupid question… I’ve only been programming Ruby
for about six hours.

I’m trying to write a loop to parse through a webpage and get all the
links to other pages. This loop depends on a regular expression to find
all the <a href tags… but inside the loop there is another regular
expression which looks to see if the link is relative or static. The
problem is that the inner regular expression changes the $1 variable, so
the loop just fails on the first iteration. I’ve tried making a copy of
the $1 variable but the result just ends up containing nil.

Are you trying to use a command line argument? If so, try ARGV[0]
instead of $1 (which in Ruby is a global variable storing the text of
the first subexpression in the most recent match).
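
To make the "most recent match" point concrete, here is a minimal sketch. The HTML string and the pattern are made up for illustration and are not from the original code. It shows that a second match replaces $1, so the capture has to be copied into a local variable first:

```ruby
# Hypothetical sample input, not taken from the original post.
html = '<a href="http://example.com/page">link</a>'

html =~ /<a href="([^"]*)"/   # outer match: $1 is the href value
saved = $1                    # copy the capture into a local variable

saved =~ /^http/              # inner match has no capture group
after_inner = $1              # so $1 is now nil

p saved         # the URL captured by the outer match
p after_inner   # nil, because $1 reflects the inner match
```

The local copy keeps its value because it refers to the String object itself, while $1 is recomputed from whichever match ran last.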

Hal


#3

Hi Hal,

Thank you for your reply

No, I’m not trying to use a command line argument, just a loop which
uses two regular expressions. After having a fresh look at the code I
spotted my mistake (url = $1 should have been url = $') and it seems to
work now. I still don’t understand why I can’t copy the $1 variable
(doller_one = $1 equals nil?), but it doesn’t matter too much now as
Ruby lets you copy the $' variable.

Thank you very much for taking the time to reply
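
For anyone reading later, the loop described here might look roughly like the sketch below. The AHREF pattern, the extract_links method, and the sample page are assumptions for illustration; the real @@ahref_filter is not shown in this thread. $' holds the text after the last match, so assigning it back to the string being searched moves the loop forward:

```ruby
# Hypothetical pattern; the original @@ahref_filter is not shown in the thread.
AHREF = /<a\s+href="([^"]*)"/i

def extract_links(html)
  links = []
  rest = html
  while rest =~ AHREF
    link = $1   # copy the capture before any other match can replace it
    rest = $'   # post-match: everything after the tag just found
    links << link
  end
  links
end

# Hypothetical sample page.
page = '<a href="http://example.com/a">A</a> <a href="/b">B</a>'
links = extract_links(page)

# String#scan collects the same captures without touching $1 or $' by hand.
scanned = page.scan(AHREF).flatten

p links
p scanned
```

If the loop is only collecting links, String#scan is probably the simpler choice; the $' version is closer to the code posted in #1.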

Oky


#4

Matthew D. wrote:

So try changing the “$url” to “url” and see what happens.

I hope it’s something good.

Matthew

Huh.

Try “You’ve assigned the value of $1 to the local variable url.”
That’ll make more sense (maybe).

Sheesh. Sorry. ;)


#5

Hi there,

Oky wrote:

Any help you could offer would be greatly appreciated.

Here’s my code so far:

	loop do
	 	url = $1
		puts $1    #A url
		puts $url  #Always nil ?

Here’s a problem for you to start with. You’ve assigned the value of
$1 to the local variable url. When you puts $url, you are examining and
printing the global variable $url. Since you haven’t assigned anything
to $url yet, it is nil.

In Ruby an unadorned name like “url” is either a local variable or
method call. Putting a “$” on the front of something tells Ruby that
you want to refer to a global.
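
A small sketch of the difference, using a made-up string and pattern rather than the original page:

```ruby
# Hypothetical input; the match sets $1 to "/about".
'<a href="/about">' =~ /href="([^"]*)"/

url = $1   # local variable url now holds the capture

p url      # the captured string
p $url     # nil: the global $url is a different variable, never assigned
```

Running this with warnings enabled (ruby -w) should also flag the read of the unassigned global.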

So try changing the “$url” to “url” and see what happens.

I hope it’s something good.

Matthew


#6

Oky wrote:

Thanks Matthew,

You’re absolutely right. A schoolboy mistake from me :S I come from a
C++/Asm background and haven’t got the hang of all these undefined
variables yet.

Thanks again for your reply

No problem! I’m glad that I could help.

Have fun!


#7

Thanks Matthew,

You’re absolutely right. A schoolboy mistake from me :S I come from a
C++/Asm background and haven’t got the hang of all these undefined
variables yet.

Thanks again for your reply