Question on Db Table design

sauy7 · December 17, 2005, 7:43pm

Hi

Suppose one has a db to track bookmarks for multiple users.
Let’s assume that there are ten’s of thousands of users, and that
an average user may have 100 bookmarks. (Assume here that no
two bookmarks are the same.)

Is it a common practice to have a bookmark table for ALL users, thereby
producing a table that has on order one million rows.
Or, is it better to somehow group the users so the bookmark tables are
smaller.

If tables are split, is there a rule of thumb as to how many tables a
db should have?

Thanks

Jim F.

sauy7 · December 17, 2005, 8:43pm

One million records really isn’t that big. Assuming 1K/record,
that’s only 1 GB. Heck, that might even fit completely in RAM.

That said, here’s some things to consider:

Some old OSs have file system limits (4GB/32-bit) that preclude
super-large tables, but that is largely a thing of the past unless
you are working on a legacy system.

Have to use indexes on large tables–but not too many indexes
because too many indexes slow down inserts.

Might consider moving seldom-used fields into a separate table if
those fields are allowed to be NULL, making the bookmarks table as
‘slim’ as possible.

Might consider changing all VARCHAR fields to CHAR fields since
searches seem to go faster when the record size is fixed.

Might consider moving all old records that haven’t been accessed
in X months to an ‘archive’ table that is only SELECTed if not
found in main bookmarks table. Add an ‘inactive’ flag to each
record and set it if record has been inactive after Y months so
you can skip those records easily.

Some databases (namely, Oracle) will allow you to spread a single
table over several spindles, which speeds things up a lot.

If you do a lot of sorts on the records it will probably be easier
on the db to pull all the records off the db and sort on a client
machine.

Regardless, I wouldn’t make many efficiency decisions right
now until it becomes obvious that a single table isn’t going to
give you response times as fast as you require.

sauy7 · December 17, 2005, 11:07pm

My two cents worth.

I used to work on a database with 3 million rows. The reason that
dbs are so good at this is because they use indexes. Indexes search
in a sort of binary way. Powers of 2 are really interesting numbers.
For instance, what how the power rises and the number rises.

2^1 = 2
2^5 = 32
2^10 = 1000
2^20 = 1000000
2^21 = 2000,000

Thus when you double the records from 1M to 2M you add one twentieth
in the time taken to find an indexed record. And as the number of
records gets bigger, the increased workload per search is getting
progressively smaller (the increase, that is).

So, to a db 2M records is a joke and would be an indiscernible
difference from 1M records.

I doubt very much whether the cost of programming re more tables is
worth it for the savings you would obtain on hardware to have the
necessary extra power.

My 3 million row db used to work like a charm. Lightening fast,
brilliant searches, combining 5 and 6 tables at a time from a
collection of 65 (mysql, by the way).

I recommend deep breaths and much relaxation. Your db can handle the
job. They really are incredible examples of fine programming and the
extraordinary power of modern computers.

bruce

PS. 4M is 4x 1M but the total workload on the db to find any given
indexed record increased from 20 to 22, an increase of 10%. Don’t you
find that awesome?