I have a text file with a couple hundred thousand rows that I massage with Ruby
and then insert into the DB row by row using ActiveRecord. Now… inserting a
couple hundred thousand rows using ActiveRecord and Ruby is crazy. I know.
It was beautiful to write a rake task that does everything for me, but now
the problem is time. It is taking me about one minute on average to process
1,000 records. I have an old PowerBook 867.
Now, I noticed when looking at the log file that the COMMIT operation
is what is taking so much time. I was wondering if I can delay the
commit so it only happens at the end, or after every X updates?
You might want to consider exporting to a file which can be bulk
loaded into the DB. We are dealing with data sources which have tens
of millions of rows, and bulk loading is the only way to load them
efficiently.
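For illustration, here is a minimal sketch of that approach, assuming MySQL and a tab-delimited dump file; the people table, its name/email columns, and the each_massaged_row loop are just placeholders for your own data:

  # Write the massaged rows to a tab-delimited file instead of INSERTing them one by one.
  File.open("people.tsv", "w") do |out|
    each_massaged_row do |name, email|   # placeholder for your own massaging loop
      out.puts [name, email].join("\t")
    end
  end

  # Then load the whole file in a single statement (LOCAL may need to be enabled on the server).
  ActiveRecord::Base.connection.execute(<<-SQL)
    LOAD DATA LOCAL INFILE 'people.tsv'
    INTO TABLE people
    FIELDS TERMINATED BY '\t'
    (name, email)
  SQL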
If you absolutely want to manage the commits manually, you can, via the
ActiveRecord::Base.connection methods:
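Something along these lines, as a rough sketch; begin_db_transaction and commit_db_transaction are the connection methods I mean, and the batch size of 1,000 plus the Person/massage names are just placeholders:

  connection = ActiveRecord::Base.connection
  connection.begin_db_transaction
  rows.each_with_index do |row, i|
    Person.create(massage(row))          # your per-row massaging + insert
    if (i + 1) % 1000 == 0               # commit every 1,000 rows instead of every row
      connection.commit_db_transaction
      connection.begin_db_transaction
    end
  end
  connection.commit_db_transaction

You can get much the same effect with less fiddling by wrapping each batch of creates in an ActiveRecord::Base.transaction do ... end block.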
Or any other suggestions on how to speed this up?
ActiveRecord::Extensions (for MySQL) handles bulk inserts using
multi-value INSERT statements. It is up to 40x faster than the current
ActiveRecord behavior.
API looks like:
MyModel.import( columns, array_of_value_sets, options_hash )
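A usage sketch, with made-up model, columns, and values; I believe :validate => false is among the supported options, but check the ar-extensions docs for the full list:

  columns = [:name, :email]
  values  = [
    ["Alice", "alice@example.com"],
    ["Bob",   "bob@example.com"]
  ]
  # One multi-value INSERT instead of one INSERT (and one COMMIT) per row.
  Person.import(columns, values, :validate => false)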