How can I import an Excel file with a migration?

I have an Excel file and I want to do a migration with it. How can I do
it?

Switch it to a CSV format, and then get something like fastercsv to read
it.

On Fri, Mar 28, 2008 at 11:09 PM, John S. <
[email protected]> wrote:

I have an Excel file and I want to do a migration with it. How can I do
it?

Posted via http://www.ruby-forum.com/.


Ryan B.

Feel free to add me to MSN and/or GTalk as this email.

Converting it to a more Rails-friendly format is indeed the way to go.
However, if Excel import is a feature you want to support in your
application, you could have a look at http://roo.rubyforge.org/

But for your migration import, please go for a csv solution (or even
go a step further if it’s a one time operation and convert it to
something even more ruby like using regex or some text processing
script such as yaml or even Rails commands).

On 28 Mar 2008, at 13:47, Ryan B. (Radar) wrote:

Switch it to a CSV format, and then get something like fastercsv to
read it.

On Fri, Mar 28, 2008 at 11:09 PM, John S. <[email protected]

wrote:

I have an Excel file and I want to do a migration with it. How can I
do
it?

Best regards

Peter De Berdt

I’d add that this should not be part of a migration. Migrations are for
defining your database, not loading data. If you need to load data,
consider
a Rake task

lib/tasks/import.rake

namespace :db do

desc “load data from excel”
task :load_from_csv => :environment do
# your code goes here
end

end

Then rake db:load_from_csv

On Fri, Mar 28, 2008 at 9:30 AM, Peter De Berdt
[email protected]

On 28 Mar 2008, at 15:39, Brian H. wrote:

I’d add that this should not be part of a migration. Migrations are
for defining your database, not loading data. If you need to load
data, consider a Rake task

I don’t agree.

Migrations can be used for anything that has to do with setting up
your application:

  • define database structure, indices, …
  • alter the database structure
  • convert data in later migrations (e.g. convert a one-to-many to a
    habtm) if the need arises
  • add default data

Best regards

Peter De Berdt

On 28 Mar 2008, at 16:10, mridelmann wrote:

Hi Peter,

You should use FasterCSV and create your migration like so:

In your CSV file you would add data like so: “some data”,“some more
data”

The run rake db:migrate

Sure, and you’re loading data in your migration when you to this.

Best regards

Peter De Berdt

Hi Peter,

You should use FasterCSV and create your migration like so:

require ‘fastercsv’

class LoadContacts < ActiveRecord::Migration

def self.up
FasterCSV.foreach(‘db/data/contacts.csv’) do |row|

Assignment.create(:name=>row[0], :address=>row[1], :phone=>row[2],
:mobile=>row[3], :city=>row[4])
end

end

def self.down
Contact.delete_all
end
end

In your CSV file you would add data like so: “some data”,“some more
data”

The run rake db:migrate

Hope this helps,

Mike

@Peter De Berdt

You can disagree all you want, but imagine this scenario:

Let’s say you have a migration that creates a user table
Then you have another migration that adds three users to your table.

User.create :login =>“homer”, :email=>“[email protected]”,
:password=>“1234”, :password_confirmation=>“1234”
User.create :login =>“BART”, :email=>“[email protected]”,
:password=>“1234”,
:password_confirmation=>“1234”
User.create :login =>“lisa”, :email=>“[email protected]”,
:password=>“1234”,
:password_confirmation=>“1234”

Life is good…

Then you add a new field called “gender” to the user model

Then you take that new field and add validates_presence_of :gender

Now, let’s say it’s time to deploy.

You’ve been told by DHH and others that you’re SUPPOSED to use rake
db:schema:load to setup your database. Of course, that doesn’t contain
your
“add users” step, so you just run your migrations on the server to build
and
populate the database.

Guess what happens? Massive and epic FAIL. Your migration that adds
records
breaks because your model is validating a field that isn’t added until a
later migration. You could argue that moving them to a rake task
wouldn’t
fix this problem… but I argue it does. You are not supposed to go
back
and edit a previous migration.

Using Rake to boostrap your database allows you to

  1. separate loading of data from schema
  2. choose when you want to boostrap the database with fake data.

I used to use migrations for this all the time. I know better now after
working on projects with multiple people that this practice is bad. Just
because you can add records in a migration doesn’t mean you should.

So, do you still disagree? :slight_smile:

(Just my .02 on this…)

-Brian

On Fri, Mar 28, 2008 at 9:50 AM, Peter De Berdt
[email protected]

On 31 Mar 2008, at 14:11, Brian H. wrote:

  1. separate loading of data from schema
  2. choose when you want to boostrap the database with fake data.

I used to use migrations for this all the time. I know better now
after working on projects with multiple people that this practice is
bad. Just because you can add records in a migration doesn’t mean
you should.

So, do you still disagree? :slight_smile:

You definitely need to be careful when using models in migrations.
Stick Class User < ActiveRecord::Base; end; in your migration and
you’re good to go (assuming you were using the model purely as a
facilitator for inserting the data in the database.
It also depends to an extent on what type of app you’re working on.
There is only one ‘live’ deployment of one of the apps I work on, it’s
a custom built app that runs our systems. No one is ever going to
create an empty database and run rake db:migrate. Having things like
this in migrations means that when we rollout a new version to our
servers it’s basically impossible for us to forget to run data
insertion/massage tasks, and any that makes deployment less risky is a
win for me.

Fred

On 31 Mar 2008, at 15:11, Brian H. wrote:

=
[email protected]”, :password=>“1234”, :password_confirmation=>“1234”

Guess what happens? Massive and epic FAIL. Your migration that adds
records breaks because your model is validating a field that isn’t
added until a later migration. You could argue that moving them to a
rake task wouldn’t fix this problem… but I argue it does. You are
not supposed to go back and edit a previous migration.

… unless the application has never been deployed before.

Using Rake to boostrap your database allows you to

  1. separate loading of data from schema
  2. choose when you want to boostrap the database with fake data.

I mentioned adding default data, not fake data. I agree fake data
should be loaded through a rake task.

I used to use migrations for this all the time. I know better now
after working on projects with multiple people that this practice is
bad. Just because you can add records in a migration doesn’t mean
you should.

So, do you still disagree? :slight_smile:

respond_to? as a condition on the validation can help you out here.

That said, I do understand your gripes with populating the database
from migrations, but having had to take over an existing rails project
with “data import from a bunch of yaml files” (or any other importable
file for that matter), let me plot out what happened to me there:
it used an import of several linked tables, forcing the original
developer to fill in id and foreign keys by himself to support the
relationships. That led to numerous problems on existing deploys as
default data was being added along the way. The same problems you
mention started popping up, but effectively made things worse. Some of
the yaml files were imported and linked to wrong records while others
failed because certain records already existed.

The key rule to follow here is to make sure your data is pretty stable
after import to be sure. Default data to me is for example: belgian
postal code tables (which is just a simple lookup table), default
preferences, where I’m using a custom Ruby class to handle preference
saving in certain fields, basically there’s no way it can go wrong
there.

To ensure migrations don’t break, we have some tests in place that
basically migrate from a number of different migrations along the way,
to see if they still run and return the same result.

Best regards

Peter De Berdt

That’s why you should re-define your classes that may be affected from
inside the migration.

For instance,

in the model

class User < ActiveRecord::Base
validates_presence_of :password
end

class User < ActiveRecord::Base; end
class AddFieldToSomething < ActiveRecord::Migration
def self.up

do something

end
def self.down

undo something

end

That way, the migration won’t pick up the validations at all, and
you’ll be able to do whatever you need to within them. Works well
within data migrations, too.

The reason this is the case is that ActiveRecord uses lazy loading and
also dynamically finds your class files, so what normally happens when
one of your AR classes is referenced, is that it lazy loads it at the
time it’s needed. However, if you’ve already defined the class in the
migration file, it won’t bother to lazy load it, because you’ve
essentially already “loaded” it (ie defined the class).

Julian.

Learn Ruby on Rails! CHECK OUT THE FREE VIDS (LIMITED TIME) NEW VIDEO
OUT 3rd APRIL
http://sensei.zenunit.com/

Yes, that works… today. Will it work later?
It’s been mentioned twice on this thread already… data in migrations
breaks down during deployment and migrations should not be used for
deployment. if you are using migrations to build a states table and
populate it, for example, you’re going against the grain. Migrations are
developer tools, not deployment tools.

I recommend this well-written and thought-provoking discussion -
http://quotedprintable.com/2007/11/16/seed-data-in-rails

On Tue, Apr 1, 2008 at 9:03 PM, Julian L. [email protected]

@Fred:

On Mon, Mar 31, 2008 at 9:56 AM, Frederick C. <
[email protected]> wrote:

Having things like
this in migrations means that when we rollout a new version to our
servers it’s basically impossible for us to forget to run data
insertion/massage tasks, and any that makes deployment less risky is a
win for me.

That’s what scripted deployments are for. You have your recipe for your
app, and you just make sure the cold deploy calls the rake task to
bootstrap
the database. You’re not going to forget anything that way. I’ve been
bitten
enough by this issue that I feel very strongly about it.

Again, you’re not supposed to be using migrations to deploy your app.
(
http://dev.rubyonrails.org/changeset/8124) so that’s why I keep them
separate.

Sigh… the discussion was good. I don’t necessarily agree with the use
of
fixtures… but the fact that he used a rake task to do the loading was
my
point.

I’m just saying that “The Rails Way” (at least according to the docs) is
to
not use migrations for rebuilding your database when you deploy. That
infers
that loading data in a migration is not an optimal practice.

Take a look at Mephisto. There’s a bootstrap rake task that you use to
set
up your database.

But there’s nothing preventing people from going against the opinions of
the
Rails team. So you do what you want, and I’ll do what I want, and we can
agree to disagree.

Have a good day. If you’re going to RailsConf, let’s have a drink on me!

On Wed, Apr 2, 2008 at 2:32 AM, Peter De Berdt
[email protected]

On 02 Apr 2008, at 05:43, Brian H. wrote:

Yes, that works… today. Will it work later?
It’s been mentioned twice on this thread already… data in
migrations breaks down during deployment and migrations should not
be used for deployment. if you are using migrations to build a
states table and populate it, for example, you’re going against the
grain. Migrations are developer tools, not deployment tools.

I recommend this well-written and thought-provoking discussion - http://quotedprintable.com/2007/11/16/seed-data-in-rails

Oh my, the fixtures loading solution.

How will you load default data when two of the tables are interlinked
in some way using a foreign key? Manually enter them from your
fixture? Use the new fixture system where a number of totally random
ids are filled in, but you can use the yaml key as the foreign key?

Best regards

Peter De Berdt