Resilience to database restart



Hi, apologies if you’ve discussed this already, but I haven’t seen it:

I’m using Rails in a multi-server environment. There is one central
server which runs Postgres and Rails. Then there are several
satellite servers that are supposed to pick up tasks from the Postgres
database and execute them. The satellites use ActiveRecord to connect to
the database, even though they don’t run Rails. It’s a rudimentary
distributed computing facility. Each satellite server runs a daemon that
checks the central DB for new tasks and, if it finds one, spawns a new
thread to execute that task.
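
To make the setup concrete, a polling daemon of this kind might look
roughly like the sketch below. This is only an illustration: run_poller,
fetch_task and execute are hypothetical names, and the database lookup is
passed in as a callable so the sketch runs without ActiveRecord.

```ruby
# Minimal sketch of a polling daemon loop (all names are hypothetical).
# fetch_task: callable returning the next task, or nil if none is waiting.
# execute:    callable that runs one task.
# max_polls:  stop after this many polls (nil means loop forever).
def run_poller(fetch_task:, execute:, interval: 5.0, max_polls: nil)
  workers = []
  polls = 0
  while max_polls.nil? || polls < max_polls
    polls += 1
    task = fetch_task.call
    if task
      # Each task runs in its own thread, as described above.
      workers << Thread.new(task) { |t| execute.call(t) }
    else
      # Nothing to do: wait before asking the database again.
      sleep(interval)
    end
  end
  workers
end
```

In a loop shaped like this, an unhandled exception raised by fetch_task
ends the main thread, and the worker threads go down with the process.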

Here’s the problem: the central Postgres server is rebooted weekly.
When this happens, the satellite daemons crash the next time they try
to find new tasks, and take down any threads they had spawned that were
still active.

My idea was to catch all exceptions in the daemon processes and
periodically try to reconnect, but it seems like overkill to wrap every
statement in a begin-rescue block.
I was wondering if someone has a more elegant solution to suggest.
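
One way to avoid a begin-rescue around every statement might be a single
helper that wraps only the database calls. The sketch below is a rough
illustration under assumptions, not a tested solution: with_reconnect and
its parameters are hypothetical names, and the reconnect step is passed in
as a callable so the example runs without ActiveRecord.

```ruby
# Hypothetical retry helper: runs the block, and on a connection-type
# error waits, tries to reconnect, and runs the block again.
# errors:       exception classes to treat as "database is down".
# max_attempts: total number of times the block is tried before giving up.
def with_reconnect(reconnect:, errors: [StandardError], max_attempts: 5, delay: 1.0)
  attempts = 0
  begin
    yield
  rescue *errors
    attempts += 1
    raise if attempts >= max_attempts # give up and re-raise the last error
    sleep(delay)
    begin
      reconnect.call
    rescue *errors
      # Database still down; the next attempt fails and we wait again.
    end
    retry
  end
end

# Assumed usage with ActiveRecord (not verified against any particular
# version; Task is a hypothetical model, and the exception class actually
# raised depends on the adapter and the ActiveRecord version):
#
#   task = with_reconnect(
#     reconnect: -> { ActiveRecord::Base.connection.reconnect! },
#     errors: [ActiveRecord::StatementInvalid],
#     delay: 30
#   ) { Task.first }
```

Only the polling query would go through the helper, so the rest of the
daemon code stays free of rescue blocks. Worker threads that use the
database themselves would need the same treatment.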

Also, I’ve read about *ActiveRecord::Base.verification_timeout* and was
hopeful, but I don’t think it’s suited to this case.

Any suggestions will be appreciated.
Thanks
Jaime