I had some concerns about breaking Rails table relationships (:has_many, :has_and_belongs_to_many, :belongs_to, :acts_as_*, etc) when moving multiple databases (using the same application) into one. I seem to be stuck when trying to wrap my head around a good database design for what I would like to do or if I am just completely not thinking straight. I guess this isn't related specifically to Rails but to overal DB design. So, for example I have two sites on two different databases - same application. I want to consolidate and move Site B into Site A's database. When I move it, the primary key column could get messed up and break relationships with other tables. I think. SITE A) table "pages" id, name, account_id 1,"Page One", 55 2,"Page Two", 55 3,"Page Three", 55 SITE B) table "pages" id, name, account_id 1,"My First Page", 453 2,"My Second Page", 453 3,"My Third Page", 453 When I import the rows from Site B to Site A, I get the following: 1,"Page One", 55 2,"Page Two", 55 3,"Page Three", 55 4,"My First Page", 453 5,"My Second Page", 453 6,"My Third Page", 453 The problem being the ids 4,5 and 6 which just pick up the autoincrementing column and if this pages table :has_many "files" then in that "files" table's "page_id" column will be pointing to the wrong row in the "pages" table. Another thing to throw into the mix is that I build my URLs based on the ID column. Now, I heard on a podcast somewhere that it's generally not a good idea to expose the IDs of your tables but I didn't fully understand why :) So, if site B had a public URL such as http://www.siteA.com/pages/view/1 and the database moves to the new location than it won't find "My First Page" ... it will find Site A's "Page One". Any best practices I am missing out on? Basically I am trying to consolidate my sites into using one database. But taking and restoring backups seems to be a nightmare. Jeff
on 2006-06-01 21:51
on 2006-06-02 11:27
I came across *set*_*primary*_*key* today and I wonder if looking into this would help me out? There must be some best practices for keeping the data movable without relying on the primary keys for everything. I just don't get it?
on 2006-06-02 17:02
Jeff W. wrote: > I had some concerns about breaking Rails table relationships (:has_many, > :has_and_belongs_to_many, :belongs_to, :acts_as_*, etc) when moving > multiple > databases (using the same application) into one. I have no direct experience of this myself, but an experienced colleague explained to me that the preferred approach for handling data migration is to use an intermediate database between the source databases and the target databases, and to use scripts to: - extract data from the sources into the intermediate database - validate the data - transform data in the intermediate database - load the target database from the intermediate database. The intermediate database may contain additional tables and columns to help with transformation and validation - for example, you could keep your original primary keys and foreign keys there (in non-key columns), and the name of the source database, so that you could then create the new foreign keys. Everything should work from scripts, because migration steps must be repeatable. This is necessary because: - validation may exposes data quality problems - these must be fixed in the source databases before rerunning extraction, validation and transformation; - data will continue to be updated in the source databases until the final migration and cutover is carried out (for large data migrations this is usually planned for a weekend of scheduled downtime). > [snip] > Another thing to throw into the mix is that I build my URLs based on the ID > column. Now, I heard on a podcast somewhere that it's generally not a good > idea to expose the IDs of your tables but I didn't fully understand why :) > So, if site B had a public URL such as http://www.siteA.com/pages/view/1 > and > the database moves to the new location than it won't find "My First Page" > ... it will find Site A's "Page One". This is something that has been bothering me, too. When I've discussed Hibernate's use of synthetic primary keys (i.e. generated keys with no business meaning) with the same colleague, who has done data modelling for very large applications for over 25 years, he said that he had no problem at all with synthetic keys, so long as they are *never* exposed to users. I have been naively wondering whether URLs might be an exception to this (and meaning to discuss it with him). Your example demonstrates conclusively that bookmarkable URLs, at least, should be treated as part of the user interface and should be subject to the same rule. Thanks! I shall need to bring this to the attention of a couple of projects I'm working with. If you want the old bookmarks to work with the new database you will have to keep the old site and key information in the new database (or in a separate mapping database) and redirect accordingly. I hope that this helps, even if the answers are not what you were hoping for. regards Justin P.S. I shall probably be offline over the weekend.