Parsing CSV file into a database


#1

How would i go about parsing a csv from the web eg yahoo finance stock
data into my database table called stocks?

I would like this process to happen each time a user logs into there
account so it will update the table with the latest data possible from
the csv file.


#2

On Mon, Feb 16, 2009 at 3:25 PM, Nick H. removed_email_address@domain.invalid
wrote:

How would i go about parsing a csv from the web eg yahoo finance stock
data into my database table called stocks?

If you have a url to download the csv, you can read it with something
like open-uri.
Then use a CSV library like FasterCSV for csv parsing, and then I
would use a ORM
(Sequel, ActiveRecord or Datamapper come to mind) to create the rows
in your table.

Hope this helps,

Jesus.


#3

ok thanks thats great, thanks for the quick response. Have you any
sample code or examples that you could possible provide me?

Thanks for your time

Nick

Jesús Gabriel y Galán wrote:

On Mon, Feb 16, 2009 at 3:25 PM, Nick H. removed_email_address@domain.invalid
wrote:

How would i go about parsing a csv from the web eg yahoo finance stock
data into my database table called stocks?

If you have a url to download the csv, you can read it with something
like open-uri.
Then use a CSV library like FasterCSV for csv parsing, and then I
would use a ORM
(Sequel, ActiveRecord or Datamapper come to mind) to create the rows
in your table.

Hope this helps,

Jesus.


#4

On Feb 16, 2009, at 9:57 AM, Nick H. wrote:

ok thanks thats great, thanks for the quick response. Have you any
sample code or examples that you could possible provide me?

It’s probably as simple as:

require “open-uri”
require “rubygems”
require “faster_csv”
open(“url goes here”) do |csv|
FCSV.new(csv).each do |row|
# load row into database here…
end
end

Give it a shot and come back with specific questions when you get stuck.

James Edward G. II


#5

Ok great! will do

Nick

James G. wrote:

On Feb 16, 2009, at 9:57 AM, Nick H. wrote:

ok thanks thats great, thanks for the quick response. Have you any
sample code or examples that you could possible provide me?

It’s probably as simple as:

require “open-uri”
require “rubygems”
require “faster_csv”
open(“url goes here”) do |csv|
FCSV.new(csv).each do |row|
# load row into database here…
end
end

Give it a shot and come back with specific questions when you get stuck.

James Edward G. II


#6

On Feb 16, 5:39 pm, Nick H. removed_email_address@domain.invalid wrote:

sample code or examples that you could possible provide me?
end

Give it a shot and come back with specific questions when you get stuck.

James Edward G. II


Posted viahttp://www.ruby-forum.com/.

Here is a sample for downloading historical data from Yahoo:

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = “/table.csv?s=#{symbol}&g=d” +
“&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}” +
“&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}”
Net::HTTP.start(“itable.finance.yahoo.com”, 80) { |http|
res = http.get(query)
res.body
}
end


#7

Bosko I. wrote:

On Feb 16, 5:39�pm, Nick H. removed_email_address@domain.invalid wrote:

sample code or examples that you could possible provide me?
� �end

Give it a shot and come back with specific questions when you get stuck.

James Edward G. II


Posted viahttp://www.ruby-forum.com/.

Here is a sample for downloading historical data from Yahoo:

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = “/table.csv?s=#{symbol}&g=d” +
“&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}” +
“&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}”
Net::HTTP.start(“itable.finance.yahoo.com”, 80) { |http|
res = http.get(query)
res.body
}
end

require ‘date’
require ‘net/http’

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = “/table.csv?s=#{symbol}&g=d” +
“&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}” +
“&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}”
Net::HTTP.start(“itable.finance.yahoo.com”, 80) { |http|
res = http.get(query)
res.body
}
end

p get_historical_data(“AAPL”, “12/28/2008”, “1/7/2009”)

–output:–
/usr/lib/ruby/1.8/net/http.rb:1556:in read_status_line': wrong status line: "Date,Open,High,Low,Close,Volume,Adj Close" (Net::HTTPBadResponse) from /usr/lib/ruby/1.8/net/http.rb:1538:inread_new’
from /usr/lib/ruby/1.8/net/http.rb:833:in request' from /usr/lib/ruby/1.8/net/http.rb:615:inget’
from r1test.rb:13:in get_historical_data' from r1test.rb:12:instart’
from /usr/lib/ruby/1.8/net/http.rb:324:in start' from r1test.rb:12:inget_historical_data’
from r1test.rb:18


#8

Bosko I. wrote:

Probably cut-paste caused error. Here is my slightly rewritten script
and output:

I get the same error:

require ‘net/http’
require ‘date’

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = “/table.csv?s=#{symbol}&g=d”
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")
query.concat("&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}")
Net::HTTP.start(“itable.finance.yahoo.com”, 80) { |http|
res = http.get(query)
res.body
}
end

puts get_historical_data(“MSFT”, “12/28/2008”, “1/7/2009”)

–output:–

$ ruby r1test.rb
/usr/lib/ruby/1.8/net/http.rb:1556:in read_status_line': wrong status line: "Date,Open,High,Low,Close,Volume,Adj Close" (Net::HTTPBadResponse) from /usr/lib/ruby/1.8/net/http.rb:1538:inread_new’
from /usr/lib/ruby/1.8/net/http.rb:833:in request' from /usr/lib/ruby/1.8/net/http.rb:615:inget’
from r1test.rb:13:in get_historical_data' from r1test.rb:12:instart’
from /usr/lib/ruby/1.8/net/http.rb:324:in start' from r1test.rb:12:inget_historical_data’
from r1test.rb:18


#9

7stud – wrote:

Bosko I. wrote:

Probably cut-paste caused error. Here is my slightly rewritten script
and output:

I get the same error:

I hope this resolves the issue:
These 2 lines should be on one line in your editor:
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")

And the same is true for the next “query.concat etcetera” lines.

hth,

Siep


#10

On Feb 16, 9:34 pm, 7stud – removed_email_address@domain.invalid wrote:

Posted viahttp://www.ruby-forum.com/.
{endDate.year.to_s}"
startDate = Date.parse(startDateString)
end
from r1test.rb:12:in start' from /usr/lib/ruby/1.8/net/http.rb:324:instart’
from r1test.rb:12:in `get_historical_data’
from r1test.rb:18


Posted viahttp://www.ruby-forum.com/.

Probably cut-paste caused error. Here is my slightly rewritten script
and output:

require ‘net/http’
require ‘date’

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = “/table.csv?s=#{symbol}&g=d”
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")
query.concat("&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}")
Net::HTTP.start(“itable.finance.yahoo.com”, 80) { |http|
res = http.get(query)
res.body
}
end

puts get_historical_data(“MSFT”, “12/28/2008”, “1/7/2009”)

Date,Open,High,Low,Close,Volume,Adj Close
2009-01-07,20.19,20.29,19.48,19.51,72709900,19.51
2009-01-06,20.75,21.00,20.61,20.76,58083400,20.76
2009-01-05,20.20,20.67,20.06,20.52,61475200,20.52
2009-01-02,19.53,20.40,19.37,20.33,50084000,20.33
2008-12-31,19.31,19.68,19.27,19.44,46419000,19.44
2008-12-30,19.01,19.49,19.00,19.34,43224100,19.34
2008-12-29,19.15,19.21,18.64,18.96,58512800,18.96