Forum: Ruby Ruby Mechanize and do-if loops

Posted by Sybren Kooistra (sybrenkooistra)
on 2012-10-02 15:19
Hi all,

This is probably pretty basic stuff, but I can't seem to manage to get
my head 'round it:

I'm going through a number of webpages using mechanize. It is a
database, with many pages and 10 links on each page. I save the first 10
links into a txt file, continue to the next page, and save those 10.
Etcetera.

However, how do I get it to stop automatically when there is no
'next page' link (link_with(:text => 'Next >'))?

Code:
require 'nokogiri'
require 'open-uri'
require 'mechanize'
agent = Mechanize.new
agent.get("http://www.miljoenhuizen.nl/houses/nl/1000/0/99999...")

# get all links on the page and save to txt file
pagina = agent.page
links = pagina.search("//td[@class = 'upper']/a")
open('pc1000.txt', 'a') { |f| f.puts links }

# open next page and save to text file
agent.page.link_with(:text => 'Next >').click
links = agent.page.search("//td[@class = 'upper']/a")
open('pc1000.txt', 'a') { |f| f.puts links }

# Etcetera, but how do I make it loop and stop when there is no longer a
# 'Next >' link to click?
Posted by unknown (Guest)
on 2012-10-02 15:55
(Received via mailing list)
On 02.10.2012 15:19, Sybren Kooistra wrote:
> However, how do I get it to stop automatically when there is no
> pagina = agent.page
> links = pagina.search("//td[@class = 'upper']/a")
> open('pc1000.txt', 'a') { |f| f.puts links }
>
> #open next page and save to textfile
> agent.page.link_with(:text => 'Next >').click
> open('pc1000.txt', 'a') { |f| f.puts links }
>
> # Etcetera, but how do I make it loop and stop when there is no longer a
> # 'Next >' link to click?
>

link_with(:text => 'some text') returns nil when there is no such link.
You could use that in a loop:

loop do
   # do what you want to do with each page

   next_link = agent.page.link_with(:text => 'Next >')
   break  unless next_link
   next_link.click
end
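
To make the pattern above a bit more concrete, here is a minimal sketch that wraps the same loop in a helper. The name each_page and the next_text parameter are made up for illustration; the only things it relies on are that link_with returns nil when no link matches and that click returns the page it navigated to. The Mechanize usage is left in comments because it needs the gem and network access, and start_url is a placeholder for your own listing URL.

```ruby
# Visit every page of a paginated listing by following the 'Next >' link
# until link_with returns nil.
# `first_page` is expected to behave like a Mechanize::Page: it responds to
# link_with(:text => ...) and the returned link responds to click.
# (each_page and next_text are hypothetical names, not part of Mechanize.)
def each_page(first_page, next_text = 'Next >')
  return enum_for(:each_page, first_page, next_text) unless block_given?

  page = first_page
  loop do
    yield page
    next_link = page.link_with(:text => next_text)
    break unless next_link   # nil means there is no further page
    page = next_link.click   # click returns the page it navigated to
  end
end

# Possible usage with Mechanize (assumes the mechanize gem is installed):
#
#   require 'mechanize'
#   agent = Mechanize.new
#   first = agent.get(start_url)   # start_url: placeholder for your URL
#   File.open('pc1000.txt', 'a') do |f|
#     each_page(first) do |page|
#       f.puts page.search("//td[@class = 'upper']/a")
#     end
#   end
```

Because the search runs inside the block, the links are looked up again on every page instead of being written once and repeated.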
Posted by Sybren Kooistra (sybrenkooistra)
on 2012-10-02 16:26
Thanks! Should work..

So why is it not working for me?

agent = Mechanize.new
agent.get("http://www.miljoenhuizen.nl/houses/nl/1000/0/99999...")
loop do
  pagina = agent.page
  links = pagina.search("//td[@class = 'upper']/a")
  open('postcode.txt', 'a') { |f| f.puts links }
  next_link = agent.page.link_with(:text => 'Next >')
  break unless next_link
  next_link.click
end

??
Posted by Sybren Kooistra (sybrenkooistra)
on 2012-10-02 17:03
Never mind, it had to do with writing Loop (capital L) instead of loop... sigh.

Thanks again! :)
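
For anyone who runs into the same thing: Ruby names are case-sensitive, and loop is an ordinary method (Kernel#loop), so a capitalised Loop does not refer to it. A small sketch to illustrate; the counter is only there so the loop has something to stop on.

```ruby
# `loop` is a method defined on Kernel; there is no built-in `Loop`.
puts Kernel.respond_to?(:loop)   # the lowercase name is defined
puts Kernel.respond_to?(:Loop)   # the capitalised name is not

# A lowercase loop runs until break is called.
count = 0
loop do
  count += 1
  break if count == 3
end
puts count
```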