How to use an array to load HTML data with Ruby

There is an HTML file:

report_data liu_asset cash finance_asset note
2009-12-31 0 1,693,048,000,000 20,147,000,000 500
2009-09-30 0 1,777,512,000,000 24,977,000,000 700

How can I use an array to load the data with Ruby? Here is what I want:
array[0,0]=report_data
array[0,1]=liu_asset
array[0,2]=cash
array[0,3]=finance_asset
array[0,4]=note
array[1,0]=2009-12-31
array[1,1]=0
array[1,2]=1,693,048,000,000
array[1,3]=20,147,000,000
array[1,4]=500
array[2,0]=2009-09-30
array[2,1]=0
array[2,2]=1,777,512,000,000
array[2,3]=24,977,000,000
array[2,4]=700
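
A note on the notation above: on a plain Ruby Array, `array[0,0]` is a (start, length) slice rather than a row/column lookup, so a table like this is usually held as an array of row arrays and read with `array[row][col]`. A minimal sketch, with the values typed in by hand rather than parsed from the HTML (the name `table` is only illustrative):

```ruby
# Rows typed in by hand for illustration; the real data would come from the HTML.
table = [
  %w[report_data liu_asset cash finance_asset note],
  ['2009-12-31', '0', '1,693,048,000,000', '20,147,000,000', '500'],
  ['2009-09-30', '0', '1,777,512,000,000', '24,977,000,000', '700']
]

header_cell = table[0][0]   # first row, first column
cash_cell   = table[1][2]   # second row, third column
slice       = table[0, 0]   # (start, length) slice: zero rows starting at index 0

puts header_cell
puts cash_cell
p slice
```

The replies below build exactly this kind of nested array, which is why their results are read as `array[1][2]` and not `array[1,2]`.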

On Mon, Apr 5, 2010 at 10:33 PM, Pen T. wrote:


You’re missing a tag after the first row.

I added it in for you, and gave an example of how you can use Hpricot to load the data. Here is basically everything I know about Hpricot: you can get the text inside an Hpricot element by calling #innerHTML on it; you can get a list of elements with a specific tag by using division (the / operator), which is how I got all the td’s from the row; and you can get the first instance of a specific tag by using %, which is how I pull out the strong. That is all I know about Hpricot, but it’s all a problem like this requires.

require 'rubygems'
require 'hpricot'

# Build one inner array per table row.
rows = Array.new
for row in Hpricot(DATA) % 'table' / 'tr'
  rows.push Array.new
  for data in row / 'td'
    # Prefer the text inside <strong> when the cell has one.
    rows.last.push((data % 'strong' || data).innerHTML)
  end
end

require 'pp'
pp rows

__END__

report_data liu_asset cash finance_asset note
2009-12-31 0 1,693,048,000,000 20,147,000,000 500
2009-09-30 0 1,777,512,000,000 24,977,000,000 700

Here’s a similar way to do what Josh does, only using Nokogiri:

#!/usr/bin/ruby

require 'rubygems'
require 'nokogiri'

HTML = <<EOT

report_data liu_asset cash finance_asset note
2009-12-31 0 1,693,048,000,000 20,147,000,000 500
2009-09-30 0 1,777,512,000,000 24,977,000,000 700
EOT

doc = Nokogiri::HTML.parse(HTML)

array = []
doc.css('tr').each_with_index do |tr, tr_i|
  array[tr_i] = tr.css('td').map { |td| td.text }
end

array[0][0] # => "report_data"
array[0][1] # => "liu_asset"
array[0][2] # => " cash"
array[0][3] # => "finance_asset"
array[0][4] # => " note"
array[1][0] # => "2009-12-31"
array[1][1] # => "0 "
array[1][2] # => "1,693,048,000,000"
array[1][3] # => "20,147,000,000"
array[1][4] # => "500"
array[2][0] # => "2009-09-30"
array[2][1] # => "0 "
array[2][2] # => "1,777,512,000,000"
array[2][3] # => "24,977,000,000"
array[2][4] # => "700"
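
Some of the cells above come back with stray whitespace (" cash", "0 "), and the amounts are still strings with thousands separators. One possible cleanup pass is sketched here on a hand-typed copy of the first two rows so that it runs without Nokogiri. The helper name `clean_cell` and the rule that only digit-and-comma strings become integers are assumptions for illustration, not part of the original posts:

```ruby
# Hand-typed copy of two rows from the output above (whitespace included).
array = [
  ['report_data', 'liu_asset', ' cash', 'finance_asset', ' note'],
  ['2009-12-31', '0 ', '1,693,048,000,000', '20,147,000,000', '500']
]

# Hypothetical helper: strip whitespace, then turn strings made only of
# digits and commas into Integers; everything else stays a String.
def clean_cell(text)
  stripped = text.strip
  if stripped.match?(/\A\d[\d,]*\z/)
    stripped.delete(',').to_i
  else
    stripped
  end
end

cleaned = array.map { |row| row.map { |cell| clean_cell(cell) } }

p cleaned[0]
p cleaned[1]
```

Dates such as "2009-12-31" contain hyphens, so they fail the digit-and-comma check and stay as strings; parsing them is left out of this sketch.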
