I'm looking for some suggestions here. Each day I need to process 200
gzip-compressed files that contain comma-delimited information in
the following format:
HostName, FileName, DirName, Modified_On
Each file contains upwards of 200K rows. I need to compare this
information to the information received the day before to look for
files that have changed.
My current plan was to:
1. Read each file and uncompress it line by line.
2. Import each line into MySQL.
3. Run several comparison queries to find changes, and save the changes to a table.
4. Automatically review the changes based on rules; whatever is left over is unauthorized.
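For reference, step 1 of the plan above can be sketched like this in Python (a minimal sketch using the stdlib `gzip` and `csv` modules; the function name and the assumption that every row has exactly four fields are mine):

```python
import csv
import gzip


def read_rows(path):
    """Stream (HostName, FileName, DirName, Modified_On) tuples
    from one gzip-compressed, comma-delimited file, skipping
    malformed rows. Hypothetical helper, not production code."""
    with gzip.open(path, "rt", newline="") as fh:
        for row in csv.reader(fh):
            if len(row) == 4:
                yield tuple(field.strip() for field in row)
```

Streaming this way keeps memory flat even at 200K rows per file, since each row is handed off (to the importer, or anything else) as soon as it is decompressed.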
The MySQL import is SLOOOOW. It's taking 10 minutes per file.
Extrapolating, that's 2000 minutes, or about 33 hours, each day just
to do the import. Unfortunately, Earth days only have 24 hours.
So, I need some way to compare today's file to yesterday's and see the
changes. Is there a good way to do this using the text files directly,
skipping the import process? I'm worried that this will slow down the
comparison process, but I'd like to try it...
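The text-file comparison I have in mind would look roughly like this (a sketch, assuming Python, that rows key uniquely on (HostName, DirName, FileName), and that a different Modified_On for the same key means "changed"; both function names are mine):

```python
import csv
import gzip


def load_index(path):
    """Map (HostName, DirName, FileName) -> Modified_On
    for one day's gzip-compressed file."""
    index = {}
    with gzip.open(path, "rt", newline="") as fh:
        for host, fname, dname, modified in csv.reader(fh):
            index[(host.strip(), dname.strip(), fname.strip())] = modified.strip()
    return index


def diff_days(yesterday_path, today_path):
    """Return files that are new today or whose Modified_On
    changed since yesterday, as (host, dir, file, modified) tuples."""
    old = load_index(yesterday_path)
    changed = []
    for key, modified in load_index(today_path).items():
        if old.get(key) != modified:
            changed.append(key + (modified,))
    return changed
```

At 200K rows per file, one day's index is a dict of 200K small tuples, which fits comfortably in memory, so the whole comparison is a single pass over each file with no database involved.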