First let me say that I am an absolute Newbie to Ruby. So please be
tolerant of my newbie question.
My situation is this. I am gathering financial data, and am about to
change data suppliers. I want to “merge” the files from both suppliers
to have as much data history as possible. I have the data in ASCII
format in a comma delimited file.
I have the data in the following structure:
c:\data\1Original\abc.csv - a new data file
c:\data\2Processed\abc.csv - the historical file and my processing
reference
Each file has the same file structure of:
Symbol, Date, Open, High, Low, Close, Volume
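To make the layout concrete, here is a minimal sketch of how I imagine one row could be split into its seven fields. The names Row and parse_row are made up for illustration, and it assumes every row is well formed and comma separated. The date stays a string because YYYYMMDD values already sort correctly as text.

```ruby
# Hypothetical container for one row: Symbol, Date, Open, High, Low, Close, Volume.
Row = Struct.new(:symbol, :date, :open, :high, :low, :close, :volume)

# Hypothetical helper: turn one CSV line into a Row, or nil if the
# line does not have exactly seven comma-separated fields.
def parse_row(line)
  fields = line.strip.split(/\s*,\s*/)
  return nil unless fields.size == 7
  symbol, date, o, h, l, c, v = fields
  # Base 10 is given explicitly so a leading zero is not read as octal.
  Row.new(symbol, date, Float(o), Float(h), Float(l), Float(c), Integer(v, 10))
end

row = parse_row("abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456")
puts row.date    # prints 20060901
puts row.close   # prints 1.9
```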
I already have a process that references the files in the
c:\data\2Processed directory structure.
Currently I have figured out how to walk the directory tree and copy any
NEW files into the Processed directory. I am hung up on the merging of
the files into the processed directory.
Sample files to demonstrate:
c:\data\1Original\abc.csv (new data)
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
abc, 20060902, 1.9, 2.3, 1.8, 2.3, 147454
c:\data\2Processed\abc.csv (historical)
abc, 20010101, 2.1, 2.5, 2.0, 2.45, 254677
abc, 20010102, 2.4, 2.6, 2.4, 2.5, 333444
…
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
I need to create
c:\data\2Processed\abc.csv (historical)
abc, 20010101, 2.1, 2.5, 2.0, 2.45, 254677
abc, 20010102, 2.4, 2.6, 2.4, 2.5, 333444
…
abc, 20060901, 1.5, 2.1, 1.4, 1.9, 123456
abc, 20060902, 1.9, 2.3, 1.8, 2.3, 147454
So, I am stuck on how to read the files in and merge them.
Here is my thought process:
- Read the files into arrays (of rows)
- Check the dates of the rows
- Output the early dates from the historical file
- Output the common data from either file (probably historical as
  already in it)
- Output new data from new file
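The steps above could be sketched roughly like this. It is only a minimal sketch under some assumptions: row_date, merge_rows and merge_files are hypothetical names, each row is assumed to be well formed with the YYYYMMDD date in the second comma-separated field, and each date is assumed to appear at most once per file. Keying the rows by date handles the "common data" case, with the historical row winning when both files have the same date.

```ruby
# Hypothetical helper: the date is the second comma-separated field (YYYYMMDD).
def row_date(line)
  line.split(',')[1].to_s.strip
end

# Merge two arrays of CSV lines. When both arrays have a row for the
# same date the historical row is kept. Returns the rows sorted by date.
def merge_rows(historical, fresh)
  by_date = {}
  # New rows go in first so historical rows overwrite them on a date clash.
  (fresh + historical).each do |line|
    line = line.strip
    next if line.empty?
    by_date[row_date(line)] = line
  end
  # YYYYMMDD strings sort chronologically as plain text.
  by_date.keys.sort.map { |date| by_date[date] }
end

# Read both files, merge them, and write the result over the historical file.
# Returns the number of rows written.
def merge_files(historical_path, new_path)
  historical = File.readlines(historical_path)
  fresh = File.readlines(new_path)
  merged = merge_rows(historical, fresh)
  File.open(historical_path, 'w') { |f| merged.each { |line| f.puts line } }
  merged.size
end
```

In my script this would presumably be called as merge_files(second, path) at the point where the two files are found to differ in size.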
So, the code I have so far is this…
puts 'start'
require 'find'
require 'fileutils'  # for FileUtils.cp (and FileUtils.cmp)

dir1original = 'c:/Data/1Original/'
dir2processed = 'c:/Data/2Processed/'
puts 'Here'
Find.find(dir1original) { |path| puts path }
Find.find(dir1original) do |path|
  puts 'The current item is ' + path
  if File.file? path
    puts path + ' is a file'
  end
end

puts 'create log files'
# Set up Log files and Specific output files
runlogfile = 'c:/Data/runlog.txt'
File.open(runlogfile, "w") { |f| f << "Runlog of StepOneIncrement\n" }
puts 'Created runlog file'
File.open('c:/Data/Exist1not2.txt', "w") { |f| f << "List of files from Original not in Processed\n" }
puts 'Created Exist1not2'
File.open('c:/Data/Exist2not1.txt', "w") { |f| f << "List of files from Processed not in Original\n" }
puts 'Created Exist2not1'

# Walk the Original Directory Tree and check for files and matches
Find.find(dir1original) do |path|
  if File.file? path
    # Same relative path, but under the Processed directory
    second = path.gsub(dir1original, dir2processed)
    if File.file? second
      puts 'Found'
      if File.size(path) != File.size(second)
        puts 'Not same size'
        # Now we will have to look at the data
        puts File.open(path) { |f| f.read(20) }
        puts File.open(second) { |f| f.read(20) }
        # search out parsedate for possibly parsing the date data
        # need help here on
        #   read files into an array
        #   date based calculations
        #   merging the files
      else
        puts 'Complete Match'
        # if FileUtils.cmp(path, second)
      end
    else
      filename = path.gsub(dir1original, '')
      puts filename + ' Not Found'
      # an alternate method to get the file name
      puts File.basename(path) + ' Not Found'
      puts File.basename(path, ".csv") + ' Not Found'
      File.open('c:/Data/Exist1not2.txt', "a") { |f| f << filename + "\n" }
      FileUtils.cp(path, second)
    end
  end
end
So, some help on the arrays would be GREATLY Appreciated.
Snoopy