We are importing some data into the lookup tables in a MySQL database in
the migration and it’s insanely slow. Even more than I’d expect from
Ruby. I extracted the import of a single table into a script and trimmed
the CSV down to some 5000 rows … it still takes some 157 s, while a raw
DBI+DBD::mysql import in Perl takes some 72 s with autocommit and 1-2 s
when committing only once every 100 rows.
I don’t think I could get those numbers, as ActiveRecord does have some
overhead, especially since it doesn’t prepare the INSERT statement using
placeholders, but I do think even Ruby can do better.
I tried the commit trick, but it seems Ruby doesn’t like it the least
bit more than I like Ruby, and it doesn’t work. What am I doing wrong?
require 'rubygems'
require 'active_record'
require 'yaml'
require 'logger'
require 'fastercsv'

dbconfig = YAML::load(File.open('M:\ecom\config\database.yml'))
ActiveRecord::Base.establish_connection(dbconfig['development'])
#ActiveRecord::Base.logger = Logger.new('zkSqlBatch2.log',
#  File::WRONLY | File::APPEND | File::CREAT)
#ActiveRecord::Base.logger.level = Logger::DEBUG
#ActiveRecord::Base.colorize_logging = false

class ZipCode < ActiveRecord::Base
end

ZipCode.delete_all
started = Time.new
i = 0
ZipCode.connection.begin_db_transaction
FasterCSV.foreach("M:/ecom/db/csv_lookup_data/postal_codes_some.csv",
                  :col_sep => "|") do |row|
  ZipCode.create(
    :postal_code => row[1], :country_id => 159, :city => row[3],
    :latitude => row[4], :longitude => row[5], :state_id => row[6]
  )
  i += 1
  if i > 100 then
    i = 0
    ZipCode.connection.commit_db_transaction
    ZipCode.connection.begin_db_transaction
  end
end
ZipCode.connection.commit_db_transaction
puts "Taken #{Time.new - started} s"
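(For comparison: the change that usually helps most with MySQL, independent
of the transaction handling, is batching rows into a single multi-row
INSERT sent through connection.execute rather than one ZipCode.create per
row. A rough sketch of building such a statement; the quote helper below
is a deliberately minimal stand-in for proper escaping, where ActiveRecord's
connection.quote would be the real thing:)

```ruby
# Minimal stand-in for real quoting; in practice use
# ZipCode.connection.quote(val) instead.
def quote(val)
  val.nil? ? "NULL" : "'#{val.to_s.gsub("'", "''")}'"
end

# Build one multi-row INSERT for a batch of rows, so the server
# parses a single statement per batch instead of one per row.
def build_batch_insert(table, columns, rows)
  values = rows.map { |row| "(" + row.map { |v| quote(v) }.join(", ") + ")" }
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{values.join(', ')}"
end

# Example rows in the same shape as the CSV columns used above.
rows = [
  ["12345", 159, "Springfield", "39.8", "-89.6", 13],
  ["67890", 159, "Shelbyville", "39.4", "-89.2", 13],
]
sql = build_batch_insert("zip_codes",
  %w[postal_code country_id city latitude longitude state_id], rows)
# Then: ZipCode.connection.execute(sql) once per batch of a few hundred rows.
puts sql
```

Each execute then costs one statement per batch instead of one per row,
which sidesteps most of the per-create overhead.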
The Logger was there so that I could see what AR sends to the database.
BTW, I’d expect to be told that I have to set the logger BEFORE AR
connects to the database, since otherwise it’ll be silently ignored. Or
maybe I would not. In either case I really wonder how I’m supposed to do
that, e.g. in a migration.
Jenda