Newb: load lots of data into database


I’m just starting to venture into Ruby on Rails, and I was wondering
what the best way is to populate my databases with existing data in the
form of CSV files that don’t map exactly to the columns (i.e. they
require some kind of manipulation/data cleansing).

When I used PHP I would just write a little PHP or Perl script to load
directly into a MySQL table. Does RoR have a way of abstracting that,
and what might the corresponding Ruby code look like?


Rails doesn’t have a built-in method for dealing with this, but you
can write Ruby scripts that manipulate the data and insert the
cleaned-up result into the DB. Depending on your data, you may be able
to use ActiveRecord in your scripts as well.
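
As a rough illustration of the "manipulate the data" step, here is a minimal sketch using Ruby's standard csv library. The input columns ("Full Name", "Signup", "Amount"), the destination keys and the `clean_row` helper are all made up for the example; your own files will need their own rules.

```ruby
require "csv"

# Hypothetical input whose columns do not match the target table
# (first_name, last_name, signed_up_on, amount_cents).
RAW = <<~DATA
  Full Name,Signup,Amount
  " Ada Lovelace ",12/10/2007,"$1,200.50"
  Alan Turing,06/23/2007,$15
DATA

# Turn one CSV row into a hash keyed by the destination column names.
def clean_row(row)
  first, last = row["Full Name"].strip.split(" ", 2)
  month, day, year = row["Signup"].split("/").map(&:to_i)
  {
    first_name:   first,
    last_name:    last,
    signed_up_on: format("%04d-%02d-%02d", year, month, day), # MM/DD/YYYY -> ISO date
    amount_cents: (row["Amount"].delete("$,").to_f * 100).round
  }
end

rows = CSV.parse(RAW, headers: true).map { |row| clean_row(row) }
rows.each { |attrs| p attrs }
```

Each resulting hash can then be handed to whatever does the inserting, whether that is an ActiveRecord model or a plain SQL statement.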


On Aug 20, 9:13 pm, Aaron S. wrote:
> [...] what the corresponding Ruby code might look like…


LOAD DATA INFILE is pretty flexible. ISTR that it can take CSV files
directly, except that Excel would put in random double quotes around
certain cells, and they needed to be cleaned out, if that’s where the
data’s from.
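
For what it's worth, MySQL's LOAD DATA can be told that fields are optionally enclosed in double quotes, which may cover the Excel case without a separate clean-up pass. Below is a small Ruby sketch that only builds the statement as a string. The helper name `load_data_sql`, the file path, the table and the column names are placeholders, and the values are interpolated without escaping, so this is only suitable for trusted input.

```ruby
# Build a MySQL LOAD DATA statement for a comma-separated file whose cells
# may or may not be wrapped in double quotes.
# path, table and columns are hypothetical and are NOT escaped here.
def load_data_sql(path, table, columns, skip_header: true)
  sql = +"LOAD DATA LOCAL INFILE '#{path}'\n"
  sql << "INTO TABLE #{table}\n"
  sql << %q(FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"') << "\n"
  sql << %q(LINES TERMINATED BY '\n') << "\n"   # files saved on Windows may need '\r\n'
  sql << "IGNORE 1 LINES\n" if skip_header      # skip the header row
  sql << "(#{columns.join(', ')})"
  sql
end

puts load_data_sql("/tmp/people.csv", "people", %w[first_name last_name])
```

The resulting string would be passed to whatever MySQL client or driver you already use. Running it is left out here because it depends on the server's settings for local file loading.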

If you can clean up the data, perhaps this CSV importer plugin might be



This is similar to a problem I had normalizing data for neural
network training. Ruby is very good at handling CSV data, and you can
use ActiveRecord outside of Rails. My recommendation would be to write
a script that reads in the data (File and IO are the primary classes)
and uses ActiveRecord. Using ActiveRecord would allow you to use your
object models to insert the data into the database.
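
A minimal sketch of that kind of script follows. The importer takes any IO object and any class that responds to `create!`, so an ActiveRecord model can be passed in. The column names, the `import_people` helper and the `FakePerson` stand-in (used only so the example runs without a database) are assumptions for illustration.

```ruby
require "csv"
require "stringio"

# With the activerecord gem installed, a model can be set up outside Rails
# roughly like this (connection settings and Person are placeholders):
#
#   require "active_record"
#   ActiveRecord::Base.establish_connection(adapter: "mysql2", database: "mydb")
#   class Person < ActiveRecord::Base; end

# Read CSV from any IO object and hand each cleaned row to model.create!.
# Returns the number of rows imported.
def import_people(io, model)
  count = 0
  CSV.new(io, headers: true).each do |row|
    model.create!(name: row["name"].strip, email: row["email"].strip.downcase)
    count += 1
  end
  count
end

# Stand-in for an ActiveRecord model; it just remembers what it was given.
class FakePerson
  def self.records
    @records ||= []
  end

  def self.create!(attrs)
    records << attrs
  end
end

io = StringIO.new("name,email\n Ada ,ADA@Example.com\n")
puts import_people(io, FakePerson)
```

With a real model the call would look something like `File.open("people.csv") { |f| import_people(f, Person) }`.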

If you have a lot of data and going model by model would be too
slow, you will want to look at the DBI interface and talk to the
database directly. I don’t know of a good Ruby SQL loader.
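
One common way to cut the per-row overhead, whichever driver you end up with, is to batch rows into multi-row INSERT statements. The sketch below only builds the SQL and the bind values. The helper name `bulk_insert_statements`, the table and the columns are hypothetical, and it assumes your driver accepts `?` placeholders.

```ruby
# Group rows into batches and build one parameterised multi-row INSERT per
# batch. Returns an array of [sql, binds] pairs; executing them is left to
# the database driver in use.
def bulk_insert_statements(table, columns, rows, batch_size: 500)
  tuple = "(" + (["?"] * columns.size).join(", ") + ")"
  rows.each_slice(batch_size).map do |batch|
    sql = "INSERT INTO #{table} (#{columns.join(', ')}) VALUES " +
          ([tuple] * batch.size).join(", ")
    [sql, batch.flatten(1)]
  end
end

rows = [["Ada", 36], ["Alan", 41], ["Grace", 85]]
bulk_insert_statements("people", %w[name age], rows, batch_size: 2).each do |sql, binds|
  puts sql
  p binds
end
```

Wrapping the batches in a single transaction usually helps as well, though how much depends on the database and storage engine.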