Establish_connection not honored by db:migrate

Short form: How can I convince migrate to create or modify tables in a
database other than the “current” one? In other words, how can I get
migrate to honor establish_connection :external in a model definition?

Details: I have a large “almost static” dataset that never changes as a
result of my Rails app, so it’s residing in a separate database. (It
gets refreshed nightly with external data.) I’d like to maintain its
schema using the Rails migration mechanism, but despite my efforts,
migrate only updates tables for the current environment (e.g.
_development or _test).

Here’s a stripped down example:

=== MODELS (in app/models):
class Hat < ActiveRecord::Base
establish_connection :external # use external database
end

=== SCHEMA (in db/schema.rb, lightly edited):
ActiveRecord::Schema.define(:version => 20100812225348) do
create_table “hats”, :force => true do |t|
t.string “style”
end
end

=== TABLE DEFINITIONS (in config/database.yml, only showing first two
entries):
development:
adapter: mysql
encoding: utf8
reconnect: false
database: dbtest_development
pool: 5
username: root
password: XYZZY
socket: /tmp/mysql.sock

external:
adapter: mysql
encoding: utf8
reconnect: false
database: dbtest_external
pool: 5
username: root
password: XYZZY
socket: /tmp/mysql.sock

=== SYNOPSIS:
$ rails new dbtest
$ cd dbtest
$ <edit config/database.yml to add external table spec>
$ rake db:create:all
$ rails generate model Hat style:string
$ <edit models/hat.rb to include establish_connection as shown above>
$ rake db:migrate

At this point, I peeked at the db using mysql, and noticed that the
dbtest_development db had a table defined for ‘hats’, but
dbtest_external did not. Since Hat was defined to establish_connection
to dbtest_external, I was already in trouble. I tried adding an
explicit establish_connection for the migration process like this:

$ rake db:rollback
$ cat > db/migration/xxxx_create_hats.rb
class CreateHats < ActiveRecord::Migration
def self.up
Hat.establish_connection :external # added this line
create_table :hats do |t|
t.string :style
end
end
def self.down
Hat.establish_connection :external # added this line
drop_table :hats
end
end
^D
$ rake db:migrate

But after the rollback and migration, the dbtest_development db STILL
had a tables defined for ‘hats’ and dbtest_external had no ‘hats’ table.

I’ve pored over this forum, googled the web, and glanced at the sources,
but I still don’t see how to get migrate to manage tables in a database
other than the “current” one.

Ideas? Pointers?

Thanks as always…

  • ff

On 13 August 2010 00:08, Fearless F. [email protected] wrote:

t.string “style”
pool: 5
username: root
$ rake db:migrate
Am I misunderstanding your question, or is it just a matter of doing
rake db_migrate RAILS_ENV=external
to migrate that db?

Colin

Colin L. wrote:

Am I misunderstanding your question, or is it just a matter of doing
rake db_migrate RAILS_ENV=external
to migrate that db?
Colin

Colin: your answer was perfect for the question I asked.
Unfortunately, I’d over-simplified the question.

I have two databases: the ‘external’ one holds static data – it never
changes as a result of my application. The ‘regular’ database
(dbtest_development, dbtest_test, or dbtest_production) holds data that
does change as a result of running the application.

I need to split my data between the external and the regular databasen.

I’ve thought about how rake db:migrate works – it modifies a database,
then asks the database to reveal its new structure to build a new
schema.rb. Given that, it doesn’t have enough information to know which
tables belong in the regular database and which belong in the external
database. In short, db:migrate can only work on one database at a time.
(Am I right?)

BUt your reply gave me an idea. I could define a rake task that
essentially calls:
rake db:migrate
rake db:migrate RAILS_ENV=external
thus always keeps a parallel structure between the two databases. It’s
a little clunky, because I would only be using one of the tables in the
external database, but that shouldn’t really matter.

Does this make sense? How would you approach this?

Best,

  • ff

Fearless F. wrote:
[…]

I’ve thought about how rake db:migrate works – it modifies a database,
then asks the database to reveal its new structure to build a new
schema.rb. Given that, it doesn’t have enough information to know which
tables belong in the regular database and which belong in the external
database. In short, db:migrate can only work on one database at a time.
(Am I right?)

Not to my knowledge. I’ve never worked with multiple databases and
Rails, but others have, and I don’t know of an issue with migrations.

BUt your reply gave me an idea. I could define a rake task that
essentially calls:
rake db:migrate
rake db:migrate RAILS_ENV=external
thus always keeps a parallel structure between the two databases. It’s
a little clunky, because I would only be using one of the tables in the
external database, but that shouldn’t really matter.

That seems like a terrible idea. It would put a lot of unnecessary crap
in both databases.

Does this make sense? How would you approach this?

Best,

  • ff

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Colin L. wrote:

On 13 August 2010 18:05, Fearless F. [email protected] wrote:

schema.rb. �Given that, it doesn’t have enough information to know which
external database, but that shouldn’t really matter.
I suppose you could test RAILS_ENV in the migration rb file and only
do the migration if it is appropriate for the env specified. Then
rake db:migrate RAILS_ENV=development
would apply relevant migrations to the development db
rake db:migrate RAILS_ENV=external
would apply relevant migrations to the external db.

I seem also to recall some talk of using establish_connection in the
migration files themselves. Will that work, or am I misremembering?

Colin

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

On 13 August 2010 18:05, Fearless F. [email protected] wrote:

schema.rb. Given that, it doesn’t have enough information to know which
external database, but that shouldn’t really matter.
I suppose you could test RAILS_ENV in the migration rb file and only
do the migration if it is appropriate for the env specified. Then
rake db:migrate RAILS_ENV=development
would apply relevant migrations to the development db
rake db:migrate RAILS_ENV=external
would apply relevant migrations to the external db.

Colin

On 13 August 2010 21:05, Fearless F. [email protected] wrote:

this approach, as we wouldn’t want migrate’s state to get confused
(since it is written in the database itself).

A good point, but I think it would be ok. If you test the env inside
the migration class, so that for the ‘wrong’ db it just appears like a
do-nothing migration, then I think the fact that the migration has
been run should be recorded as normal and all should be well. If that
makes sense.

Colin

Marnen Laibow-Koser wrote:

That seems like a terrible idea. It would put a lot of unnecessary
crap in both databases.

Just a clarification: my heavy-handed approach would create parallel
table structures in both databases, but only the relevant tables would
get populated with any data. [That is, unless I used my migration files
to create fixture data, which I would never DREAM of doing! :wink: ]

Colin L. (Guest) wrote:

I suppose you could test RAILS_ENV in the migration rb file
and only do the migration if it is appropriate for the env
specified. Then rake db:migrate RAILS_ENV=development
would apply relevant migrations to the development db
rake db:migrate RAILS_ENV=external would apply relevant
migrations to the external db.

Heh – that could be pretty cool – I’d have to think a bit deeper on
this approach, as we wouldn’t want migrate’s state to get confused
(since it is written in the database itself).

Marnen Laibow-Koser wrote:

I seem also to recall some talk of using establish_connection in the
migration files themselves. Will that work, or am I misremembering?

In my first post in this thread, you can see that’s what I attempted
without success. But Colin’s approach (above) has promise. If you find
a pointer to a working recipe, I’d be in your debt.

I’ll continue to noodle on Colin’s approach and report back.

Thanks, all!

  • ff

Colin L. wrote:

A good point, but I think it would be ok. If you test the env inside
the migration class, so that for the ‘wrong’ db it just appears like a
do-nothing migration, then I think the fact that the migration has
been run should be recorded as normal and all should be well. If that
makes sense.

Colin

I like it, but wouldn’t rake db:migrate generate a radically different
schema.rb depending on which RAILS_ENV I specify? Or would that not be
a problem?

On 13 August 2010 00:08, Fearless F. [email protected] wrote:

Short form: How can I convince migrate to create or modify tables in a
database other than the “current” one? In other words, how can I get
migrate to honor establish_connection :external in a model definition?

Details: I have a large “almost static” dataset that never changes as a
result of my Rails app, so it’s residing in a separate database. (It
gets refreshed nightly with external data.) I’d like to maintain its
schema using the Rails migration mechanism, but despite my efforts,
migrate only updates tables for the current environment (e.g.
_development or _test).

All this hassle makes one wonder whether it might be possible to
combine the two databases into one. That would make life a lot
simpler, if it were possible.

Colin

On 13 August 2010 21:40, Fearless F. [email protected] wrote:

schema.rb depending on which RAILS_ENV I specify? Or would that not be
a problem?

Good point, I am well outside my comfort zone here. Googling for
rails migration multiple databases
produces a few links that might be useful.

Colin

Colin L. wrote:

All this hassle makes one wonder whether it might be possible to
combine the two databases into one. That would make life a lot
simpler, if it were possible.
Colin

Amen. But since mine is a table containing >700K 18-column records of
unchanging data, I’m not giving up without a fight.

FWIW, it appears that the new Arel mechanism may not handle external
tables properly – in at least one instance, I’ve had to covert an
ActiveRecord find() that worked in 2.3 into a find_by_sql() in order to
qualify the table name.

  • ff

On 14 August 2010 21:04, Fearless F. [email protected] wrote:

Colin L. wrote:

All this hassle makes one wonder whether it might be possible to
combine the two databases into one. That would make life a lot
simpler, if it were possible.
Colin

Amen. But since mine is a table containing >700K 18-column records of
unchanging data, I’m not giving up without a fight.

Why is that a reason for having it in a separate db?

Colin

Colin L. wrote:

Amen. �But since mine is a table containing >700K 18-column records of
unchanging data, I’m not giving up without a fight.

Why is that a reason for having it in a separate db?

Colin

Um, because I’m lazy? :slight_smile:

One table in the external database contains ‘highly decorated time
records’ with 15 minute granularity. Many of my other models depend on
this table, and my unit tests are greatly simplified since I know the
table is always available, precomputed and fully tested.

Since the test database is always rolled back between each test,
consider the alternative: I’d have to recompute a subset of the time
records before each test I run, and (a) that slows things down and (b)
it makes my head hurt trying to figure out which ones I’d need to
generate in advance of the test.

(I can hear Marnen chuckling in the background right about now…)

I am now thinking about keeping keeping it as a table in a separate
database, but “broadside loading” it into the _development or _test
database as a rake task using straight SQL commands when needed. That
might not be so painful.

  • ff

Colin L. wrote:

On 14 August 2010 21:04, Fearless F. [email protected] wrote:

Colin L. wrote:

All this hassle makes one wonder whether it might be possible to
combine the two databases into one. �That would make life a lot
simpler, if it were possible.
Colin

Amen. �But since mine is a table containing >700K 18-column records of
unchanging data, I’m not giving up without a fight.

Why is that a reason for having it in a separate db?

It probably isn’t.

Colin

Best,
–Â
Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone

Fearless F. wrote:

Colin L. wrote:

Amen. �But since mine is a table containing >700K 18-column records of
unchanging data, I’m not giving up without a fight.

Why is that a reason for having it in a separate db?

Colin

Um, because I’m lazy? :slight_smile:

Too lazy to set up the easier solution?

One table in the external database contains ‘highly decorated time
records’ with 15 minute granularity. Many of my other models depend on
this table, and my unit tests are greatly simplified since I know the
table is always available, precomputed and fully tested.

Since the test database is always rolled back between each test,
consider the alternative: I’d have to recompute a subset of the time
records before each test I run, and (a) that slows things down

No You shouldnt be using enough records in any one test to slow things
down

and (b)
it makes my head hurt trying to figure out which ones I’d need to
generate in advance of the test.

No. As I’ve explained before, just generate the ones you need for each
test. Use factories for this. Simple. If your head hurts, then
you’re testing at too large a granularity. Stop overcomplicating!

(I can hear Marnen chuckling in the background right about now…)

Not chuckling so much as getting frustrated.

I am now thinking about keeping keeping it as a table in a separate
database, but “broadside loading” it into the _development or _test
database as a rake task using straight SQL commands when needed.

Drop that idea now.

That
might not be so painful.

On the contrary, it will be more painful. Stop fighting Rails already!

  • ff

Best,

–Â
Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone