Nokogiri - Data output not as expected

Sorry for bothering the forum about this, but it is extremely
important.

My Nokogiri output is being clobbered when it is written to
Excel. The output should go row by row, but it is getting
overwritten.

Does anybody know why?

On Mon, Feb 4, 2013 at 4:39 PM, Barry K. wrote:

http://www.ruby-forum.com/attachment/8106/asus_nokogiri.rb



require 'nokogiri'
require 'open-uri'
require 'spreadsheet'

# Create the spreadsheet
Spreadsheet.client_encoding = 'UTF-8'
book = Spreadsheet::Workbook.new

sheet1 = book.create_worksheet
sheet1.name = 'My First Worksheet'

# Specify our URIs
%w[
  http://www.asus.com/Notebooks_Ultrabooks/S56CA/#specifications
  http://www.asus.com/Notebooks_Ultrabooks/ASUS_TAICHI_21/#specifications
].each do |url|
  doc = Nokogiri::HTML(open(url))

  # Grab our product specifications
  data = doc.css('div#specifications div#spec-area ul.product-spec li')

  # Modify our data
  lines = data.map(&:text)

  # Output our data to the spreadsheet
  lines.each.with_index do |line, i|
    a = 0
    a += 1
    sheet1[i, a] = line
  end
end

book.write 'C:/Users/Barry/Desktop/output.xls'

I don’t know the spreadsheet API, but when you call this:

sheet1[i, a] = line

a is always 1, because the two previous lines:

a = 0
a += 1

always set a to 1. Maybe this means the first worksheet or whatever,
and it’s fine, but in that case you should just write a 1:

sheet1[i, 1] = line

or at least use a more descriptive variable (or constant).

Also, for every URL, lines.each is executed again, with i
starting at 0. So every URL starts at row 0 and overwrites the previous
lines.
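
To see the overwrite without touching the network or Excel, here is a minimal sketch. The sheet is replaced by a plain Hash keyed by [row, column] and the page contents are made up, so both are assumptions for illustration, not the spreadsheet gem or the real ASUS data:

```ruby
# Stand-in for the worksheet: a Hash keyed by [row, column].
# This is NOT the spreadsheet gem, only enough to show the indexing.
sheet = {}

# Hypothetical scraped data: two pages with two spec lines each.
pages = [
  ["CPU: page one", "RAM: page one"],
  ["CPU: page two", "RAM: page two"]
]

pages.each do |lines|
  lines.each.with_index do |line, i|
    a = 0
    a += 1                # a is 1 on every iteration
    sheet[[i, a]] = line  # i restarts at 0 for each page
  end
end

p sheet
```

Four lines go in, but only two cells survive, and both hold the second page's text.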

Jesus.

Nice explanations!

Thank you for your explanation, Jesus, but I am just not sure where to
reposition the code to fix this. Sorry, I am new to Ruby.

On Mon, Feb 4, 2013 at 5:01 PM, Barry K. wrote:

Thank you for your explanation, Jesus, but I am just not sure where to
reposition the code to fix this. Sorry, I am new to Ruby.

I don’t know the spreadsheet API, but if you need to pass the row
number, you will need to count how many lines you filled in during the
previous steps. So create a variable added_lines = 0, outside of all
loops, and increment it by the number of lines after adding them to
the spreadsheet.
Also, change sheet1[i, a] to sheet1[added_lines + i, a].
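
The running offset can be sketched on its own like this. As an assumption for illustration, a plain Hash keyed by [row, column] stands in for the worksheet and the page contents are invented, so this only shows the counting, not the real spreadsheet calls:

```ruby
# Stand-in for the worksheet: a Hash keyed by [row, column].
sheet = {}

# Hypothetical scraped data: two pages with two spec lines each.
pages = [
  ["CPU: page one", "RAM: page one"],
  ["CPU: page two", "RAM: page two"]
]

column = 1       # fixed column index
added_lines = 0  # rows already written by earlier pages

pages.each do |lines|
  lines.each.with_index do |line, i|
    # Offset the per-page index by the rows written so far.
    sheet[[added_lines + i, column]] = line
  end
  # Advance the offset only after the whole page has been written.
  added_lines += lines.size
end

p sheet
```

The second page now starts at row 2 instead of row 0, so nothing from the first page is overwritten.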

Jesus.

Ok, I made those changes but I am getting no output to Excel, only a
blank sheet with no error message.

You’re a saviour, Jesus. I have been working on this all week with
limited success.

Thank you!

On Mon, Feb 4, 2013 at 5:27 PM, Barry K. wrote:

Ok, I made those changes but I am getting no output to Excel, only a
blank sheet with no error message.

require 'nokogiri'
require 'open-uri'
require 'spreadsheet'

# Create the spreadsheet
Spreadsheet.client_encoding = 'UTF-8'
book = Spreadsheet::Workbook.new

sheet1 = book.create_worksheet
sheet1.name = 'My First Worksheet'

COLUMN = 1       # column index every line is written to
added_lines = 0  # rows already written by earlier URLs

# Specify our URIs
%w[
  http://www.asus.com/Notebooks_Ultrabooks/S56CA/#specifications
].each do |url|
  # URI.open comes from open-uri; on Ruby 3.0+ a bare open(url) no longer fetches URLs
  doc = Nokogiri::HTML(URI.open(url))

  # Grab our product specifications
  data = doc.css('div#specifications div#spec-area ul.product-spec li')

  # Modify our data
  lines = data.map(&:text)

  # Output our data to the spreadsheet, below the rows already written
  lines.each.with_index do |line, i|
    sheet1[added_lines + i, COLUMN] = line
  end
  added_lines += lines.size
end

book.write 'C:/Users/Barry/Desktop/output.xls'

Hope this helps,

Jesus.

Alternative:

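require 'nokogiri'
require 'open-uri'
require 'spreadsheet'

# Create the spreadsheet
Spreadsheet.client_encoding = 'UTF-8'
book = Spreadsheet::Workbook.new

sheet1 = book.create_worksheet
sheet1.name = 'My First Worksheet'

# Collects the spec lines of every URL before anything is written
lines = []

# Specify our URIs
%w[
  http://www.asus.com/Notebooks_Ultrabooks/S56CA/#sp
].each do |url|
  # URI.open comes from open-uri; on Ruby 3.0+ a bare open(url) no longer fetches URLs
  doc = Nokogiri::HTML(URI.open(url))

  # Each URL adds one nested array of strings
  lines << doc.css('div#specifications div#spec-area ul.product-spec li').map(&:text)
end

# Turn the array of arrays into one flat list of lines
lines.flatten!

# One line per row; push appends to the first free cell of that row
lines.each_with_index do |line, idx|
  sheet1.row(idx).push line
end

book.write('C:/Users/Barry/Desktop/output.xls')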
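
As a side note on the collect-then-write idea, the `<<` followed by `flatten!` can probably be folded into a single `flat_map`. This is only a sketch: `fetch_specs` is a hypothetical stand-in for the Nokogiri scraping step and the URL names are invented, so it runs without network access:

```ruby
# Hypothetical stand-in for the scraping step: returns the spec lines of one page.
fetch_specs = lambda do |url|
  ["#{url}: CPU", "#{url}: RAM"]
end

# Invented page names in place of real URLs.
urls = %w[page-one page-two]

# flat_map maps each URL to its array of lines and flattens one level,
# which gives the same flat list as << followed by flatten! here.
lines = urls.flat_map { |url| fetch_specs.call(url) }

# Pair every line with the row index it would be written to.
rows = lines.each_with_index.map { |line, idx| [idx, line] }

rows.each { |idx, line| puts "row #{idx}: #{line}" }
```

Because all lines sit in one array before writing, the row index is simply the position in that array and no separate counter is needed.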
