Problems with ever-increasing ID value

I have a script that I am using to populate a database from a CSV file:

RAILS_ENV = 'development'
require File.dirname(__FILE__) + '/../../config/environment'

require 'csv'

# Destroy existing data
Residence.destroy_all

# IMPORT DATA FROM apartments.csv
db_file = "#{RAILS_ROOT}/db/apartments.csv"

CSV.open(db_file, "r", '|') do |row|
  r = Residence.new
  r.name   = row[0]
  r.number = row[1]
  r.street = row[2]
  r.zip    = row[3]
  r.save
end

When I run this script it correctly destroys all of the existing data,
but new rows are created with increasingly large ID numbers, i.e. the
"old" IDs are not reused. Has anybody seen this problem? Perhaps I am
misunderstanding the destroy_all method? I would like the new rows to
begin with ID = 1.

My "Residence" model has the following line, if it is relevant:
has_many :comments, :dependent => :destroy

I would appreciate any suggestions,

Chris

On Jul 31, 2006, at 9:27 pm, Chris L. wrote:

has_many :comments, :dependent => :destroy

I would appreciate any suggestions,

Chris

Chris,

Your database deliberately does not re-use IDs. You didn't say
what database server you are using, so I don't know how to override
it, but it will be in the docs. For example, SQL Server has an
IDENTITY_INSERT option, and I think in Postgres you can reset the
sequence number (never actually needed to do that). Don't know about
MySQL. In all cases you will have to call hand-rolled SQL to do this
using the execute method.
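
For Postgres, for example, something along these lines ought to do it
(untested; I'm assuming your table is called residences, so the sequence
behind the id column would be residences_id_seq by Rails convention):

  # Reset the Postgres sequence so the next insert gets id 1
  ActiveRecord::Base.connection.execute(
    "SELECT setval('residences_id_seq', 1, false)"
  )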

Depending on how much data you are storing, it might not be a
problem. A 32-bit int can store positive values up to around 2
billion, so you would need to be doing massive bulk loads on a
regular basis to run into the limit any time soon.

Ashley

Ashley and Peter,

Depending on how much data you are storing, it might not be a
problem. A 32-bit int can store positive values up to around 2
billion, so you would need to be doing massive bulk loads on a
regular basis to run into the limit any time soon.

Thank you both for your prompt help; I understand now that this is a
database issue. I am using MySQL. Perhaps I can get some inspiration
from the "migrations" code. As you suggest, I won't be running out of IDs
soon!

It’s more of an annoyance than anything as my URLs have /show/{some
large number} in them … perhaps something I need to fix with routing.

Kind regards,

Chris

Chris L. wrote:

When I run this script it correctly destroys all of the existing data,
but new rows are created with increasingly large ID numbers, i.e. the
"old" IDs are not reused. Has anybody seen this problem? Perhaps I am
misunderstanding the destroy_all method? I would like the new rows to
begin with ID = 1.

I would appreciate any suggestions,

Chris,

I have encountered this problem during development while using
migrations to modify a legacy database. I could not find an easy way to
reset the sequence in MySQL, so I resorted to dropping the table,
recreating it, and repopulating it.
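
For what it's worth, the migration I ended up with looked roughly like
this (a sketch only; the column names are guessed from your import
script):

  class RebuildResidences < ActiveRecord::Migration
    def self.up
      drop_table :residences
      create_table :residences do |t|
        t.column :name,   :string
        t.column :number, :string
        t.column :street, :string
        t.column :zip,    :string
      end
      # ...then repopulate from the CSV; new rows start at id 1 again
    end

    def self.down
      # irreversible: the original rows are gone once the table is dropped
    end
  end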

Steven

A database reusing IDs is bad bad bad, so it’s not an “issue.” If you’re
afraid of running out of IDs, use bigints. If you really want to
“compact” the IDs, a painstaking export/recreate database/import is
possible.
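
If you do go the bigint route, a migration along these lines should work
for MySQL (untested; the table name residences is assumed, and changing
the primary key type is easiest with raw SQL):

  class WidenResidenceIds < ActiveRecord::Migration
    def self.up
      # Widen the primary key from INT to BIGINT, keeping auto-increment
      execute "ALTER TABLE residences MODIFY id BIGINT NOT NULL AUTO_INCREMENT"
    end

    def self.down
      execute "ALTER TABLE residences MODIFY id INT NOT NULL AUTO_INCREMENT"
    end
  end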

Joe

On Jul 31, 2006, at 9:42 pm, Chris L. wrote:

Thank you both for your prompt help; I understand now that this is a
database issue. I am using MySQL. Perhaps I can get some inspiration
from the "migrations" code. As you suggest, I won't be running out of
IDs soon!

I did try to get on the MySQL site just now, to satisfy my curiosity
about how you reset the IDs, but the whole site is running like a dog
and I gave up. Must be powered by MySQL ;o) It will be explained on
there somewhere, though.

It’s more of an annoyance than anything as my URLs have /show/{some
large number} in them … perhaps something I need to fix with
routing.

I wouldn't do anything complicated to work around the issue; you'll
probably tie yourself up in knots. What you could do is update
existing records rather than deleting them and re-inserting, but I'd
only do this if they actually represent the same apartment - I can't
think of anything more confusing than a database ID or URL that
randomly changes its meaning! Or, you could add your own
application-generated ID column, maybe "url_id", that you increment
with every row in the CSV, but I personally don't like that idea much.
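
If the CSV rows really do describe the same apartments each time, the
import loop could look something like this (a rough sketch; it assumes
the name column uniquely identifies an apartment, which may well not be
true for your data):

  CSV.open(db_file, "r", '|') do |row|
    # Re-use the existing record (and its ID) if this apartment is already in the table
    r = Residence.find_by_name(row[0]) || Residence.new
    r.name   = row[0]
    r.number = row[1]
    r.street = row[2]
    r.zip    = row[3]
    r.save
  end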

Ashley

Thank you everyone for your suggestions. I think this is something I
don't need to get too worried about at the moment; when it comes to
production I will make an initial import of all the data, and will have
"nice" ID numbers again!

I use migrations a lot, so I’m a little reluctant to drop tables from
another script, but I will give the MySQL code a try later.

Kind regards,

Chris

On 7/31/06, Chris L. [email protected] wrote:

Thank you both for your prompt help; I understand now that this is a
database issue. I am using MySQL. Perhaps I can get some inspiration
from the "migrations" code. As you suggest, I won't be running out of IDs
soon!

It’s more of an annoyance than anything as my URLs have /show/{some
large number} in them … perhaps something I need to fix with routing.

execute("alter table table_name auto_increment = 1") should do the
trick.

See http://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html
for more information.
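
In the import script itself (outside a migration), that would translate
into something like this, run just after destroy_all (untested;
residences is assumed to be the table name):

  Residence.destroy_all

  # Reset MySQL's auto-increment counter so new rows start at id 1 again
  ActiveRecord::Base.connection.execute("ALTER TABLE residences AUTO_INCREMENT = 1")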

Mark

Chris L. wrote:

When I run this script it correctly destroys all of the existing data,
but new rows are created with increasingly large ID numbers, i.e. the
"old" IDs are not reused. Has anybody seen this problem? Perhaps I am
misunderstanding the destroy_all method? I would like the new rows to
begin with ID = 1.

Look up the MySQL “TRUNCATE” command. It removes all data from a table
and restarts the autoincrement(s) from 1.
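
For example, from script/console or the import script (untested; the
table names are assumed). One caveat: TRUNCATE bypasses ActiveRecord, so
the :dependent => :destroy on comments won't fire, and you would need to
clear that table yourself:

  conn = ActiveRecord::Base.connection
  conn.execute("TRUNCATE TABLE comments")    # callbacks won't run, so clear dependents by hand
  conn.execute("TRUNCATE TABLE residences")  # removes all rows and resets AUTO_INCREMENT to 1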

–Al Evans