Database recommendations?

paulbutcher · February 22, 2007, 5:10pm

We have a number of very successful RoR sites that have been up and
running for almost a year now. We’re currently using MySQL as the
database, but seem to be running into its limitations, particularly
regarding it rather basic clustering support.

As a result we’re thinking about moving to another database. For obvious
reasons, this isn’t something we want to do lightly!

To that end, I’d be very grateful for any recommendations regarding
database engines and RoR. Are there any lurking “nasties” if we move to
(say) Oracle? Sybase? I’d very much welcome any advice/war stories from
anyone who’s made a similar switch!

Thanks in advance for your help!

paul.butcher->msgCount++

paulbutcher · February 22, 2007, 6:30pm

I’m not a fan of MySQL personally, but are you sure that migrating
over to another database system is the right choice? MySQL offers
clustering packages that do work, but they aren’t free as far as I
know.

What kind of load are you throwing at your mysql server(s)? Over here,
we got huge databases (tables that top 10+ million rows) that are
being hit with up to 1000 selects / second and 400 inserts / second.
Our mysql cluster is spread out across two datacenters.

Do you really want to migrate to another database? MySQL can probably
handle anything you throw at it performance wise if you know what you
are doing. This, of course, doesn’t change the feature set and
mysqlisms though (which I personally like to avoid).

-carl

On 2/22/07, Paul B. [email protected] wrote:

database engines and RoR. Are there any lurking “nasties” if we move to

–
EPA Rating: 3000 Lines of Code / Gallon (of coffee)

paulbutcher · February 22, 2007, 7:28pm

Paul B. wrote:

Our issue isn’t performance - it’s reliability.

We currently have a master/slave setup, where the slave is primarily
intended for failover in the event of the master failing. We’ve had a
couple of occasions where the replication has failed recently for
reasons we are currently unable to diagnose (the master has continued to
run just fine, but the slave has crashed). Reestablishing the
master/slave relationship requires that the slave has a perfect copy of
the database, which means disabling our application in order to create a
consistent dump of the database on the master.

The last time we had to do this, it took around 10 minutes to perform
the dump, which meant 10 minutes downtime. We run a 24/7 service, and 10
minutes downtime is unacceptable. And our database is growing
exponentially - we expect that the same operation in 6 months time is
likely to take an hour - an hour’s downtime would be a complete
disaster!

Sorry for the less-than-entirely-clear inital e-mail!

Thanks,

paul.butcher->msgCount++

Paul,

I think your real problem is trying to diagnose why your slave is
crashing before you assume that you are running up against the upper
limits of MySQL replication. What are the details of your master/slave
setup?, (a)synchronous?, distributed query load to slave?,
auto-generated keys? load?.

If money is no object, and your DBA is either not present or incompetent
then do yourself a favor and hire a DBA(s) to get at the cause of your
problem before you go through migration/portability/driver hell and save
yourself a wad of cash.

It should be fairly straight forward to look at the db logs on the slave
to see the last statement that went through before the crash… I also
guess it’s obvious to assume that your slave is a dedicated db server…
right? Is the crash dependant upon load?

I feel your pain and I have been there… good luck

ilan

paulbutcher · February 22, 2007, 7:02pm

Thanks for the response, Carl,

Carl L. wrote:

… are you sure that migrating
over to another database system is the right choice?

In a word, “no”

We are hitting the limits of the MySQL configuration we currently have
and would like to examine our options. Sticking with MySQL is one of
them! And we won’t move unless there is a powerful reason to do so.

MySQL offers
clustering packages that do work, but they aren’t free as far as I
know.

I just reread my original message and realised that I said clustering
when I meant replication (blush).

Our issue isn’t performance - it’s reliability.

We currently have a master/slave setup, where the slave is primarily
intended for failover in the event of the master failing. We’ve had a
couple of occasions where the replication has failed recently for
reasons we are currently unable to diagnose (the master has continued to
run just fine, but the slave has crashed). Reestablishing the
master/slave relationship requires that the slave has a perfect copy of
the database, which means disabling our application in order to create a
consistent dump of the database on the master.

The last time we had to do this, it took around 10 minutes to perform
the dump, which meant 10 minutes downtime. We run a 24/7 service, and 10
minutes downtime is unacceptable. And our database is growing
exponentially - we expect that the same operation in 6 months time is
likely to take an hour - an hour’s downtime would be a complete
disaster!

So we’re examining our options, and moving to another database is one of
them. Sticking with MySQL, if we can find a way to make it work for us,
is also an option. Spending money isn’t a problem - as long as we can
find a workable solution!

Sorry for the less-than-entirely-clear inital e-mail!

Thanks,

paul.butcher->msgCount++

paulbutcher · February 23, 2007, 12:05am

On Feb 22, 2007, at 1:13 PM, Paul B. wrote:

We know what was the last statement which went through before the
crash,
but it’s “just a query”. As far as we can tell, the problem is that
the
master and the slave got “out of whack” at some point in the past (and
we have no means to tell when this point was). The point of failure is
some stage after that when an otherwise entirely innocuous query runs.

Are we talking crash as in “boom” or crash as in halted replication?

If “boom”, process boom or OS boom?

–
– Tom M., CTO
– Engine Y., Ruby on Rails Hosting
– Reliability, Ease of Use, Scalability
– (866) 518-YARD (9273)

paulbutcher · February 22, 2007, 10:13pm

Ilan B. wrote:

I think your real problem is trying to diagnose why your slave is
crashing before you assume that you are running up against the upper
limits of MySQL replication.

I’d certainly welcome advice on how to perform this diagnosis. Part of
the problem that we have is that MySQL provides us with almost no help
whatsoever in diagnosing the problem.

We know what was the last statement which went through before the crash,
but it’s “just a query”. As far as we can tell, the problem is that the
master and the slave got “out of whack” at some point in the past (and
we have no means to tell when this point was). The point of failure is
some stage after that when an otherwise entirely innocuous query runs.

Diagnosing the problem is also not helped by the fact that it has only
happened twice in 14 months! Most of the time everything runs just fine.
And we’ve not found a way to reproduce it (we do have automated stress
tests, but they don’t seem to do the trick).

We’re thinking of moving database at least in part because, quite apart
from this problem, we know that we’re going to hit a wall with MySQL
replication anyway. The slave exists to provide a failover if the master
dies. Let’s imagine that happens 6 months down the line. We can failover
to the slave (assuming that the replication is working at the time!),
but at some point we’ll need to recreate a replication relationship from
the new master to a new slave. MySQL replication requires us to take the
master down while we create the slave. With a database which is an order
of magnitude larger than we have right now, which is the size we expect
it to be in 6 months, this is going to mean being down for an hour -
which is completely unacceptable for our app.

So we’ve been thinking of moving anyway - the current problems we’ve
been experiencing have just accelerated our thinking.

What are the details of your master/slave
setup?, (a)synchronous?, distributed query load to slave?,
auto-generated keys? load?.

Most of our tables are InnoDB, but we do have a couple of MyISAM tables
(because we use MySQL fulltext indices on those tables). The slave runs
read-only and replicates from the binary log on the master (which is a
very standard MySQL replication setup as far as I can tell). Just about
all tables are “standard” RoR style tables, so they have autoincrement
“id” columns, but are pretty much “plain vanilla” apart from that. No
stored procedures, no transactions (other than the implicit transactions
that InnoDB creates as part of its normal operation). We do use foreign
key constraints though (although I can’t see why that should cause any
problems).

It should be fairly straight forward to look at the db logs on the slave
to see the last statement that went through before the crash… I also
guess it’s obvious to assume that your slave is a dedicated db server…
right? Is the crash dependant upon load?

Yes, the slave is a dedicated server. It’s hard to say whether the crash
is dependent on load - as it’s only happened twice it’s hard to see a
pattern!

Thanks for any suggestions you can offer!

paul.butcher->msgCount++

paulbutcher · February 23, 2007, 12:37am

Paul,

Been watching this thread with interest, being a database person myself.

Asking for a database recommendation is a pretty dangerous thing to do :
DBMS
debates can get just about as heated as those over preferred editor or
IDE.

In many ways, the choice of DBMS comes down to two things -

Experience (folks choose what they know and are comfortable with)
Money (how much you are prepared to spend for the product and
support)

I am not a MySQL person myself (my preferred DBMS is IBM’s DB2, and for
that
my preferred operating platform is Linux). I have used it on occasion,
just
as I have been exposed to Oracle, MS SQL Server, Sybase, Informix and
IMS in
my years in the field.

My recommendation to you is that if you are generally comfortable with
MySQL
then that would be where you should stay. The obvious advice would be
to
raise a problem report with MySQL support. I would not advise anyone
who is
running a 24x7 database which is critical to the life of their business
to
run without some sort of support contract. I’m sure that MySQL support
would help diagnose your problem.

If you do want to consider a change, then you have the opportunity to
use
almost all of the databases currently available, albeit unsupported, for
free
these days. That’s true for DB2 (Express-C), Oracle (Express Edition)
and
SQL Server (Express Edition). But be aware that there all of these
free
editions have limitations on their use, some more restrictive than
others.
The basic details of these are -

DB2 Express-C : server can have maximum of 2 processors (these can be
multi-core however) and 4 gig of RAM. No limit on database size

Oracle Express : server can only have 1 processor (again multi-core) and
1 gig
of RAM. Database is limited to 4 gig in size. Only one database
allowed per
server. Also no 64-bit support (e.g. AMD64 or EM64T).

SQL Server Express : will only use one CPU on a server. 1 gig of RAM
only.
4 gig maximum database size. And, of course, only available on Windows

I think you will agree the DB2 Express-C is much less restricted than
the
others. I can thoroughly recommend it as an excellent platform for
24x7
operations. It runs beautifully on Linux, and especially on an x64-64
(AMD64 or Intel Xeon with EM64T) platform.

I’m giving a presentation in May at IDUG (http://www.idug.org) on using
DB2
with Ruby on Rails. It is nearly finished (the final version is due in
less
than a fortnight) and if any wants to help me review it then drop me
an
email. It goes through in some detail using the ibm_db2 adaptor from
IBM
Alphaworks. It doesn’t show you how to install DB2, since my audience
will
know this, but I can help there if you want : given a Linux box, an SSH
connection and a root password my record for going from tarball to
working
DB2 system, including sample database, is 12 minutes.

Anyway, as I said before : stay with what you know if you can resolve
the
issue. If you want to look elsewhere you could do a lot worse than DB2
!!!

HTH

Phil

paulbutcher · February 23, 2007, 12:15am

Tom M. wrote:

Are we talking crash as in “boom” or crash as in halted replication?

If “boom”, process boom or OS boom?

We’re talking crash as in “boom” followed by mysqld attempting to
restart, followed by “boom” followed by… (ad infinitum, until the log
files grow to be very large indeed!).

Here is an example from error.log of one of the crashes:

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this
binary
or one of the libraries it was linked against is corrupt, improperly
built,
or misconfigured. This error can also be caused by malfunctioning
hardware.
We will try our best to scrape up some info that will hopefully help
diagnose
the problem, but since we have already crashed, something is definitely
wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=3
max_connections=100
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections
= 225791 K
bytes of memory
Hope that’s ok; if not, decrease some variables in the equation.

thd=0x73300490
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong…
Cannot determine thread, fp=0xbe5fecd8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x80b18df
0x831073a
0x80e666f
0x80eb5de
0x816f87c
0x8170b9f
0x8170b0d
0x810af48
0x80cc59e
0x80c9d27
0x80cab2c
0x8115889
0x816bb3f
0x8169e47
0x830c0ff
0x833bb0a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and
follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort…
thd->query at 0x7338b92b is invalid pointer
thd->thread_id=2
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

paulbutcher · February 23, 2007, 12:51am

Paul,

Been watching this thread with interest, being a database person myself.

Asking for a database recommendation is a pretty dangerous thing to do :
DBMS
debates can get just about as heated as those over preferred editor or
IDE.

In many ways, the choice of DBMS comes down to two things -

Experience (folks choose what they know and are comfortable with)
Money (how much you are prepared to spend for the product and
support)

I am not a MySQL person myself (my preferred DBMS is IBM’s DB2, and for
that
my preferred operating platform is Linux). Â I have used it on occasion,
just
as I have been exposed to Oracle, MS SQL Server, Sybase, Informix and
IMS in
my years in the field.

My recommendation to you is that if you are generally comfortable with
MySQL
then that would be where you should stay. Â The obvious advice would be
to
raise a problem report with MySQL support. Â I would not advise anyone
who is
running a 24x7 database which is critical to the life of their business
to
run without some sort of support contract. Â I’m sure that MySQL support
would help diagnose your problem.

If you do want to consider a change, then you have the opportunity to
use
almost all of the databases currently available, albeit unsupported, for
free
these days. Â Â That’s true for DB2 (Express-C), Oracle (Express Edition)
and
SQL Server (Express Edition). Â But be aware that there all of these
free
editions have limitations on their use, some more restrictive than
others. Â
The basic details of these are -

DB2 Express-C : server can have maximum of 2 processors (these can be
multi-core however) and 4 gig of RAM. Â No limit on database size

Oracle Express : server can only have 1 processor (again multi-core) and
1 gig
of RAM. Â Database is limited to 4 gig in size. Â Only one database
allowed per
server. Â Also no 64-bit support (e.g. AMD64 or EM64T).

SQL Server Express : will only use one CPU on a server. Â 1 gig of RAM
only. Â
4 gig maximum database size. Â And, of course, only available on Windows

I think you will agree the DB2 Express-C is much less restricted than
the
others. Â Â I can thoroughly recommend it as an excellent platform for
24x7
operations. Â It runs beautifully on Linux, and especially on an x64-64
(AMD64 or Intel Xeon with EM64T) platform.

I’m giving a presentation in May at IDUG (http://www.idug.org) on using
DB2
with Ruby on Rails. Â It is nearly finished (the final version is due in
less
than a fortnight) and if any wants to help me review it then drop me
an
email. Â It goes through in some detail using the ibm_db2 adaptor from
IBM
Alphaworks. Â Â It doesn’t show you how to install DB2, since my audience
will
know this, but I can help there if you want : given a Linux box, an SSH
connection and a root password my record for going from tarball to
working
DB2 system, including sample database, is 12 minutes.

Anyway, as I said before : stay with what you know if you can resolve
the
issue. Â If you want to look elsewhere you could do a lot worse than DB2
!!!

HTH

Phil

paulbutcher · February 23, 2007, 3:16pm

You forgot to mention PostgreSQL, that’s the Open Source Oracle,
granddaddy of of many Open Source
and Commercial RDBMS

PS: Even knowing about Interbase off-spring I chose PostgreSQL when I
fleed from MySQL.

All the Best!
Sergey.

paulbutcher · February 23, 2007, 3:48am

Philip N. wrote:

In many ways, the choice of DBMS comes down to two things -

Experience (folks choose what they know and are comfortable with)

Money (how much you are prepared to spend for the product and support)

Phillip is dead on here.

I am not a fan of MySQL, but it works well for backing web pages with
mostly static data. When the Netfrastructure back end (a rewrite of
the firebird engine) is mainstream, it will move up at least one level
in my hierarchy of consideration.

DB2 and Oracle are good choices because you can vertically scale the
DBMS into the midrange and mainframe class systems (off of the WIntel
platforms), and both are robust with loads that are heavily
transactional. While both have free and low cost limited licenses,
they also have a variety of license packages that scale from
vertically handheld to mainframe, and robust clustering options in the
commodity, midrange, and large systems. I have found DB2’s support to
be generally superior to Oracle’s.

Firebird (my favorite) was originally architected for the Vax cluster
as a direct competitor for Oracle. Its “classic” architecture has
remained constant and a good fallback throughout its 30 year (more or
less) history. The latter half of its life time has been focused on
the commodity systems (WIntel), but there are builds for Windows CE,
Intel64, AMD64, and PowerPC architectures also, so it scales
vertically at least to the midrange systems. I have not checked the
status of Firebird 2, a ground-up rewrite (different from the
NetfraStructure rewrite), recently. However, it brings firebird’s
performance in line with MySQL and Postgres,without sacrificing
transactional integrity.

paulbutcher · February 23, 2007, 10:03pm

I am not a replication expert, but I have been researching a mysql
replication setup over the last few days

Seems that the issue with replication is that it sends the same queries
over as queries, so any issues revolve around why the same query might
not do the same thing at each end.

The summary of advice seems to be:

Use LVM - then you can snapshot your database in just a few seconds
and resetup your slave in slow time
Use InnoDB wherever possible. Issues with myisam particularly seems
to revolve around transactions or failed queries - they will succeed at
one end, but not the other end of the link…
Setup a process to compare all the tables regularly, ie spot when sync
has failed somehow…
Get a support contract with MySQL - if the problem happens again they
can help you resolve it, and possibly fix the bug
Perhaps get the Innodb hotbackup thing - this allows you to do online
backups and recreate your slave DB without taking that master down

http://www.xaprb.com/blog/2007/01/20/how-to-make-mysql-replication-reliable/
http://dev.mysql.com/doc/refman/5.0/en/replication-features.html

Good luck

Ed W

paulbutcher · February 23, 2007, 9:11pm

Thanks for all your help and suggestions guys. We certainly will
continue to chase down the replication failure, but we’re also going to
investigate our options regarding switching to alternative databases (or
upgrading to enhanced versions of MySQL, or sticking with what we’ve got
if we can find a way to make it work!)

And if anyone has any additional suggestions for avenues to investigate
to diagnose the replication issue, I’d be very (very!) grateful.

Cheers!

paul.butcher->msgCount++

paulbutcher · February 23, 2007, 10:39pm

Ed Wg wrote:

The summary of advice seems to be:

Thanks Ed - that’s a fantastic set of pointers - you can be sure that
we’ll read them with interest!

I really appreciate your help,

paul.butcher->msgCount++

paulbutcher · February 23, 2007, 11:30pm

Sergey,

The only reason I didn’t mention PostgreSQL is that I’ve not personally
got
any experience of it (not that I wouldn’t like to, but there are only so
many
hours in the day …).

The vibes I get are that if you want Open Source, then PostgreSQL is
more
advanced than MySQL.

As far as the “grandaddy”, Oracle may have been first to market (and
suffers
still for that in some ways, since things done in haste sometimes have
problems) but the real “daddy of them all” is mainframe DB2 (now known
as DB2
for z/OS). The experimental precursor to that was System-R, which was
built
on the back of the IBM researchers who invented relational technology.

DB2 for z/OS is 25 years old (since it’s first generally available
release).

Of course, if you want to go for a “real man’s database” still under
active
development today, you’d have to go for IMS/DB (hierarchical database).

Still work with it on a daily basis, and we still run millions of
transactions per day against it. It was developed as part of the Apollo
space program in the 60s.

Phil

paulbutcher · February 24, 2007, 1:03am

While PostgreSQL is more advanced, I believe that MySQL has better
clustering support. Slony-I seems like it would be ok for a
master-slave(s) setup, but pgcluster doesn’t seem ready for prime
time… and I believe it adds a 5x overhead.

-carl

On 2/23/07, Philip N. [email protected] wrote:

As far as the “grandaddy”, Oracle may have been first to market (and suffers
transactions per day against it. It was developed as part of the Apollo

PS: Even knowing about Interbase off-spring I chose PostgreSQL when I

Money (how much you are prepared to spend for the product and
DBMS into the midrange and mainframe class systems (off of the WIntel
less) history. The latter half of its life time has been focused on

–
EPA Rating: 3000 Lines of Code / Gallon (of coffee)

paulbutcher · February 24, 2007, 12:11pm

Thanks for that. It should be finished sometime this week.

Phil

paulbutcher · February 24, 2007, 9:45am

Thanks Philip for a very comprehensive summary of various ideas. I’ve
toyed with the idea of using DB2 for Rails though haven’t got around to
it. But your mail does push me to consider revisiting it again.

I’d like to get my hands on a copy of your presentation, and if no one
else volunteers to review it, put me on the list (my reluctance is due
to my lack of experience rather than lack of willingness).

Cheers
Mohit.

paulbutcher · February 24, 2007, 12:30pm

Hey Philip, I would be very interested in reviewing your presentation.
Please keep me posted and I’ll talk to you later.

Peace,

-Conrad