Rails testing uses huge AUTO_INCREMENT ids in db


#1

Hi there,

I’ve recently begun writing tests in rails and I’ve run into a problem.

After recreating my test database and running my few tests ONCE, the
AUTO_INCREMENT value for some of the tables in the test database is set
to a very high number, as in:

… ) ENGINE=InnoDB AUTO_INCREMENT=841549529 DEFAULT CHARSET=utf8 …

when running SHOW CREATE TABLE table_name.

The problem is that I’ve set some of my column data types to
MEDIUMINT (as opposed to Rails’ standard INT in MySQL) to save memory
on our db server. When running a test, these columns get
“maxed out”, and hence the tests produce unusable results.
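For context, the signed ceilings of MySQL’s integer types make the overflow obvious. A quick sketch (type widths per the MySQL manual; the 841549529 figure is the AUTO_INCREMENT value from the SHOW CREATE TABLE output above):

```ruby
# Signed maximum for each MySQL integer type is 2**(bits - 1) - 1
{ "TINYINT" => 8, "SMALLINT" => 16, "MEDIUMINT" => 24, "INT" => 32 }.each do |type, bits|
  puts format("%-9s max %10d", type, 2**(bits - 1) - 1)
end

# MEDIUMINT tops out at 8_388_607, so an id like the one above overflows it
puts 841_549_529 > 2**23 - 1  # => true
```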

Should I just accept this fact and change my mediumint columns to int or
is there a better way?

If I have not explained myself clearly enough, please say so.

And on a side-note: why DOES Rails “use” so many ids anyway, when the
tests are so sparse and simple?

Thanks in advance.


#2

Hi,

On Fri, Feb 6, 2009 at 11:07 AM, Rasmus N.
removed_email_address@domain.invalid wrote:

MEDIUMINT (as opposed to rails’ standard INT in mysql) to save memory
on our db server. When running a test, these columns get
“maxed out”, and hence the tests produce unusable results.

This sounds an awful lot like premature optimization. :)

Should I just accept this fact and change my mediumint columns to int or
is there a better way?

And on a side-note: why DOES Rails “use” so many ids anyway, when the
tests are so sparse and simple?

Fixtures use sparse, nonsequential IDs that are based on the fixture
label. This allows a bunch of the nice features of fixtures, like
specifying relationships by label instead of ID. If you find that you
really need to keep this from happening, just give all your fixtures
IDs.
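
(For the curious: the id comes from a checksum of the label. A minimal sketch of what ActiveRecord’s Fixtures.identify does; the exact MAX_ID constant here is an assumption and varies by Rails version:)

```ruby
require "zlib"

MAX_ID = 2**30 - 1  # assumed cap; check your Rails version's fixtures.rb

# A fixture label hashes to a stable, large integer id,
# so the same label always refers to the same row
def identify(label)
  Zlib.crc32(label.to_s) % MAX_ID
end

puts identify(:david)  # same label, same big id, on every run
```

If you’d rather keep the ids small, give every fixture an explicit id in the YAML, e.g. `david: { id: 1, name: David }`.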

~ j.


#3

John B. wrote:

Hi,

On Fri, Feb 6, 2009 at 11:07 AM, Rasmus N.
removed_email_address@domain.invalid wrote:

MEDIUMINT (as opposed to rails’ standard INT in mysql) to save memory
on our db server. When running a test, these columns get
“maxed out”, and hence the tests produce unusable results.

This sounds an awful lot like premature optimization. :)

Should I just accept this fact and change my mediumint columns to int or
is there a better way?

And on a side-note: why DOES Rails “use” so many ids anyway, when the
tests are so sparse and simple?

Fixtures use sparse, nonsequential IDs that are based on the fixture
label. This allows a bunch of the nice features of fixtures, like
specifying relationships by label instead of ID. If you find that you
really need to keep this from happening, just give all your fixtures
IDs.

~ j.

Yes, I just found that out after playing around with MySQL’s general
log. Thank you for your reply anyway!

I’ve decided to change the problematic column data types to int(11),
Rails’ default for MySQL. The reason I made the (premature) optimization
in the first place was that I thought it would be no big deal to just
use smaller integer data types, and that it would have no impact (other
than lower memory consumption). That was not the case :)
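
If anyone else needs to make the same change, it is one ALTER per affected column (the table and column names below are placeholders):

```sql
-- Widen a formerly-MEDIUMINT column to Rails' MySQL default
ALTER TABLE orders MODIFY customer_id INT(11) NOT NULL;
```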


#4

Quoting Rasmus N. removed_email_address@domain.invalid:
[snip]

Yes, I just found that out after playing around with MySQL’s general
log. Thank you for your reply anyway!

I’ve decided to change the problematic column data types to int(11),
Rails’ default for MySQL. The reason I made the (premature) optimization
in the first place was that I thought it would be no big deal to just
use smaller integer data types, and that it would have no impact (other
than lower memory consumption). That was not the case :)

How much does memory cost? How much does disk space cost? How many
users and years will it take to recoup the cost of overriding the
defaults? For most projects, programmer time is by far the most
expensive and scarcest resource. When you have enough users to make the
optimization worthwhile, you will have plenty of revenue to pay for it.
Until then, it’s premature.

I made a similar “optimization” several months back. I figured out
later that I would have to have thousands of users, with several
hundred logged in simultaneously, before I began to make up the cost of
the time spent on premature optimization.

The illustration I came up with is: when I started programming
professionally (1973), 40MB of disk storage was around $40K, three
years of my salary. Last year I bought 250GB of external USB disk
storage for just over $100, about two hours of my salary. Figure four
hours of an entry-level programmer’s salary. The day and a half I spent
squeezing three booleans into one byte of disk storage, and trying to
keep that one byte out of the model API, will never be recovered.

When contemplating an optimization, be sure you are optimizing the
scarce or expensive resources.

HTH,
Jeffrey