Hi all,
This is probably pretty basic stuff, but I can't seem to manage to get
my head 'round it:
I'm going through a number of webpages using mechanize. It is a
database, with many pages and 10 links on each page. I save the first 10
links into a txt file, continue to the next page, and save those 10.
Etcetera.
However, how do I get it to stop automatically when there is no
'next page' link, i.e. when link_with(:text => 'Next >') finds nothing?
Code:
require 'nokogiri'
require 'open-uri'
require 'mechanize'
agent = Mechanize.new
agent.get("http://www.miljoenhuizen.nl/houses/nl/1000/0/99999...)
# get all links on the page and save to txt file
pagina = agent.page
links = pagina.search("//td[@class = 'upper']/a")
open('pc1000.txt', 'a') { |f| f.puts links }
# open the next page and save its links to the text file
agent.page.link_with(:text => 'Next >').click
links = agent.page.search("//td[@class = 'upper']/a")
open('pc1000.txt', 'a') { |f| f.puts links }
# etcetera, but how do I make it loop and stop once there is no longer a
# 'Next >' link to click?
on 2012-10-02 15:19
on 2012-10-02 15:55
On 02.10.2012 15:19, Sybren Kooistra wrote:
> However, how do I get it to stop automatically when there is no
> pagina = agent.page
> links = pagina.search("//td[@class = 'upper']/a")
> open('pc1000.txt', 'a') { |f| f.puts links }
>
> #open next page and save to textfile
> agent.page.link_with(:text => 'Next >').click
> open('pc1000.txt', 'a') { |f| f.puts links }
>
> #etcetera, but how do I make it loop and end until there is no longer a
> 'Next >' link to click?
>
link_with(:text => 'some text') returns nil when there is no such link.
You could use that in a loop:

loop do
  # do what you want to do with each page
  next_link = agent.page.link_with(:text => 'Next >')
  break unless next_link
  next_link.click
end
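For reference, the same idea can be wrapped in a small helper. This is only a sketch: the method name each_result_page and the max_pages safety cap are made up for illustration and are not part of Mechanize. It assumes an agent that has already fetched the first page, and relies only on agent.page, link_with returning nil when nothing matches, and click moving the agent to the next page.

```ruby
# Hypothetical helper (not a Mechanize API): yields every result page,
# following the 'Next >' link until link_with returns nil.
# max_pages is an assumed safety cap so a site whose last page links
# back to itself cannot keep the loop running forever.
def each_result_page(agent, next_text = 'Next >', max_pages = 1_000)
  visited = 0
  while visited < max_pages
    page = agent.page
    yield page                       # do what you want with each page
    visited += 1
    next_link = page.link_with(:text => next_text)
    break unless next_link           # nil means this was the last page
    next_link.click                  # the agent now holds the next page
  end
  visited                            # number of pages processed
end

# Possible usage with Mechanize (start_url is a placeholder):
#   agent = Mechanize.new
#   agent.get(start_url)
#   File.open('postcode.txt', 'a') do |f|
#     each_result_page(agent) do |page|
#       f.puts page.search("//td[@class = 'upper']/a")
#     end
#   end
```

Note that the links are searched again on every page inside the block, so each page's own links end up in the file rather than the first page's links repeated.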
on 2012-10-02 16:26
Thanks! Should work..
So why is it not working for me?
agent = Mechanize.new
agent.get("http://www.miljoenhuizen.nl/houses/nl/1000/0/99999...)
loop do
  pagina = agent.page
  links = pagina.search("//td[@class = 'upper']/a")
  open('postcode.txt', 'a') { |f| f.puts links }
  next_link = agent.page.link_with(:text => 'Next >')
  break unless next_link
  next_link.click
end
??
on 2012-10-02 17:03
Never mind, it had to do with writing Loop (capital L) instead of loop. Sigh. Thanks again! :)
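For anyone hitting the same thing: Ruby is case-sensitive, and loop is an ordinary lowercase method from Kernel, so the capitalized spelling does not resolve to it. A small sketch that shows the failure without crashing the script (the method name capital_loop_error is made up for this example):

```ruby
# Returns the exception raised by the capitalized spelling, or nil if
# nothing was raised. NoMethodError is a subclass of NameError, so
# rescuing NameError covers either way Ruby may report it.
def capital_loop_error
  Loop do
    break
  end
  nil
rescue NameError => e
  e
end

error = capital_loop_error
puts "#{error.class}: #{error.message}" if error
```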