I am new to Ruby and Nokogiri, so please excuse my lack of knowledge.
Is it possible to scrape more than one URL with Nokogiri and output the
results line by line to Excel?
My code for scraping a single URL is attached.
I haven’t tested this code; it’s just an example of the flow. You might
need to do some array flattening or something similar.
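require 'nokogiri'
require 'open-uri'
require 'spreadsheet'

my_excel_output = []

# Each entry pairs a URL with the CSS selector to use on that page.
# my_url1, css1, etc. are placeholders for your own values.
my_array = [
  [my_url1, css1],
  [my_url2, css2]
]

my_array.each do |my_url, my_css|
  # On Ruby 2.5+ prefer URI.open(my_url); bare open(url) was removed in Ruby 3.0.
  doc = Nokogiri::HTML(open(my_url))
  lines = doc.css(my_css).map(&:text)
  my_excel_output << lines
end

# Drop all the data into your spreadsheet.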
Thank you for your reply. However, I am getting the following error:
C:/Users/barry/Desktop/tester4.rb:16:in `block in <main>': can't convert Array into String (TypeError)
    from C:/Users/barry/Desktop/tester4.rb:11:in `each'
    from C:/Users/barry/Desktop/tester4.rb:11:in `<main>'
[Finished in 2.3s with exit code 1]
require 'nokogiri'
require 'open-uri'
require 'spreadsheet'
my_excel_output = []
my_array = [
  ["http://www.asus.com/Notebooks_Ultrabooks/ASUS_TAICHI_21/#specifications",
   'div#specifications div#spec-area ul.product-spec li'],
  ["http://www.asus.com/Notebooks_Ultrabooks/S56CA/ #specifications",
   'div#specifications div#spec-area ul.product-spec li']
]
my_array.each do |my_url, my_css|
  doc = Nokogiri::HTML(open(my_url))
  lines = doc.css(my_css).map(&:text)
  'C:/Users/Barry/Desktop/output2.xls' << lines
end
You’re trying to append the data you’ve scraped (an array) onto a string
(your filename).
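To see the problem in isolation, here is a minimal sketch that needs no gems. The `lines` array is made-up stand-in data for what `doc.css(my_css).map(&:text)` returns, and the exact wording of the error message varies between Ruby versions.

```ruby
# A String, like the filename in the failing line (dup keeps it mutable).
filename = 'C:/Users/Barry/Desktop/output2.xls'.dup
# Stand-in for the scraped data (hypothetical values).
lines = ['spec line 1', 'spec line 2']

error = nil
begin
  # String#<< expects a String (or an Integer codepoint), not an Array.
  filename << lines
rescue TypeError => e
  error = e
end

puts "#{error.class}: #{error.message}"

# Appending to an Array works, because Array#<< accepts any object.
collected = []
collected << lines
```

The string is only a path; it has no connection to the file on disk, so nothing is written even if the append succeeded.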
The idea behind my (untested) example is that the array
"my_excel_output" will contain all the lines from all of your scraping.
Then, after the loop is complete, you can put that data into your Excel
worksheet all at once. You might need to call "my_excel_output.flatten!"
first, because each iteration appends a whole array of lines, so the
result is an array of arrays.
Your output code still needs to be run; I just didn’t include it in my
answer because it would be repeating what you’ve already written.
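As a rough sketch of that "collect first, write once" flow (not the original poster's output code, which isn't shown here): the helper names `flatten_scrapes` and `write_xls`, the sheet name, and the sample data are all hypothetical. Stripping whitespace and dropping empty lines is an extra cleanup step I'm assuming you might want. The `spreadsheet` gem is only required inside the writer, so the flattening part can be tried without it.

```ruby
# Turn the nested per-page results into one flat list of lines.
# The strip / reject steps are optional cleanup (an assumption, not required).
def flatten_scrapes(per_page_lines)
  per_page_lines.flatten.map(&:strip).reject(&:empty?)
end

# Write one line per spreadsheet row, in the first column.
# Needs the spreadsheet gem (gem install spreadsheet).
def write_xls(lines, path)
  require 'spreadsheet'
  book  = Spreadsheet::Workbook.new
  sheet = book.create_worksheet(name: 'Specs') # hypothetical sheet name
  lines.each_with_index { |line, i| sheet[i, 0] = line }
  book.write(path)
end

# Stand-in for my_excel_output after two loop iterations (made-up data).
my_excel_output = [
  ['  Processor: example  ', 'Memory: example'],
  ['Display: example', '']
]

rows = flatten_scrapes(my_excel_output)
puts rows

# Uncomment once the spreadsheet gem is installed:
# write_xls(rows, 'C:/Users/Barry/Desktop/output2.xls')
```

Writing after the loop means the workbook is created and saved once, rather than once per URL.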