Need help with ID-less legacy databases

I’m a newbie converting a large Foxpro database (188 MB) to Ruby on
Rails (RoR) so we can get reports out of it. Our accounting service
sends us weekly updates of the entire database. We don’t enter any
data (that’s done at the accounting service). The biggest tables are
Customers (now 15,000 rows, 2.5 MB) and Transactions (525K rows, 112
MB). Each of the Foxpro tables has its own key field, which is quite
different from RoR’s autoincrementing inntegers.

For example, the Foxpro Customers table is indexed by Account Number,
a 5-digit string that ranges from ‘00001’ to ‘15221’. So it’s almost
the same as an ID field, but not quite.

I have it running using a standard RoR setup including the usual
autoincrementing ID fields. Right now, before loading new data, I do a
Customer.delete_all and then import using the superb DBF Ruby gem.
Ditto the Transactions table (I call it Xactions so as not to confuse
it with the transaction method). Each time I do this, the table starts
incrementing the IDs from where it left off. So the second time I load
Customers it starts with ID 15222, the third time with 30487 (more
customers), etc. Therefore, :id has no relationship with the Account
Number. Moreover, in order to hook up the other tables with Customers,
I have to go through each row, look up the account number, and find
the associated :id. Given a database of this size, it takes HOURS.
Worse, the problem with the Xactions table is we’ll run out of :id
numbers at some point (11 digits).

I tried using the Account Number string as the foreign key, but RoR
REALLY wants an integer field and wouldn’t accept it.

I thought I’d solve this by setting :id => false in the table
migration and then adding a t.column :id :integer right afterwards.
Then, when I load the data, I’d convert the Account Number string to
the :id integer. Simple. Sweet.

The PROBLEM is that RoR has NO WAY to save or update data without an
existing :id. As the Thomas/Hansson Agile Web D. book makes
clear, RoR just won’t save anything in this situation.

I can think of several solutions:

  1. Keep the existing setup and just load the data after hours. So what
    if it takes all night?

  2. Delete the table and then recreate it, so the :id of the reloaded
    table starts at 1. I don’t know how to do this programmatically. Is
    there another way to reset the :id counter without deleting the table?

  3. Doing the shortcut with MySQL would be a piece of cake, but I don’t
    know the syntax to save or update a record in an ActiveRecord context
    behind RoR’s back.

Any suggestions?
—Jim Gagne—

for no. 2 use Customer.connection.execute(‘TRUNCATE customers’) this
will reset the id back to 1

Hi Nic,

something like this should do the trick, it’s in the MySQL manual
(http://dev.mysql.com/doc/refman/5.0/en/create-table.html)…

CREATE TABLE rails_customers (
SERIAL id, …
)
SELECT *
FROM customers;

I don’t think that you can takeover the provided IDs, you can store it
in another column.

If you can you maybe need to typecast the provided id in the SELECT
statement with:
SELECT UNSIGNED customers.id AS id, …
and replace:
-SERIAL id
+UNIQUE(id)

I only got PostgreSQL here so I can’t test it, so be sure to test this
within a transaction or even with a development db…

UPDATE, DELETE and SELECT should work w/o SERIAL id. If you want to
create new rails_customers than you maybe need to set the id of the new
record manually:

before_save :get_id
def get_id
write_attribute :id, self.class.max(:id) + 1
end

You need to reserve the id somehow… don’t know how…

Good luck
Florian