Why would a DRb process shut down?

I have a program that runs about 10 DRb processes. Occasionally one of
the processes shuts down, and I get an error telling me that it cannot
connect to the process. It is this error:

Errno::ECONNREFUSED: Connection refused - connect(2)

What causes a DRb process to shut down? If an exception is raised and
goes uncaught, will that kill it? If the server is under too much load,
will it die?

Any tips on keeping the processes running?

Any help is greatly appreciated. Thanks.

Ben J. wrote:

Any help is greatly appreciated. Thanks.

Do you have Thread.abort_on_exception = true set? If not, a thread with
an uncaught exception will die silently.
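
A minimal sketch of one way to make thread failures visible, assuming you control where the threads are created. The helper name logged_thread is hypothetical and not part of DRb or the standard library; it only wraps Thread.new so that an uncaught exception is written to a log before the thread exits.

```ruby
require 'stringio'

# Alternative, process-wide setting mentioned above (left commented out
# here): any uncaught exception in any thread is re-raised in the main
# thread.
# Thread.abort_on_exception = true

# Hypothetical helper: run the block in a new thread and record any
# uncaught StandardError in +log+ instead of letting it vanish.
def logged_thread(log)
  Thread.new do
    begin
      yield
    rescue StandardError => e
      log.puts "#{e.class}: #{e.message}"
    end
  end
end

# Usage: the exception ends the thread, but leaves a trace in the log.
log = StringIO.new
logged_thread(log) { raise ArgumentError, "bad request" }.join
```

In a real server, log would more likely be $stderr or a Logger than a StringIO. Whether this explains the ECONNREFUSED error depends on where the failure actually happens, so treat it as a diagnostic aid rather than a fix.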

On Wed, 20 Sep 2006, Ben J. wrote:

Any tips on keeping the processes running?

i use a technique i call ‘immortal daemon’. rq and dirwatch both use it.
it works like this:

  • daemon uses a well known lockfile to signal its existence. see my
    lockfile class for an nfs-safe impl. so, take a daemon named ‘foo_d’
    for example. we’ll assume that it uses /home/$USER/.foo_d/lock as
    its lockfile.

  • the options given to the lockfile.lock call are to acquire the lock
    and fail immediately (vs waiting) if another process holds the lock.
    the lockfile class i wrote knows about dead processes and stale
    locks, so you needn’t worry about that.

  • cron your daemon to start every 15 minutes. if a copy is already
    running, the new one fails to get the lock and exits. if the process
    isn’t running, at any time for any reason, it will be restarted on
    the next cron run. this works after reboot and for other transient
    (network for example) errors.
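
The steps above can be sketched roughly as follows. This is not the lockfile class mentioned in the post: it swaps in Ruby's built-in File#flock with a non-blocking exclusive lock, which needs no stale-lock handling because the kernel releases the lock when the process dies, but which may not behave the same way on NFS. The method name acquire_daemon_lock and the lock path are assumptions for illustration.

```ruby
require 'fileutils'

# Try to take an exclusive, non-blocking lock on +path+.
# Returns the open File on success (keep it open for the life of the
# process, since closing it releases the lock), or nil if another
# process already holds the lock.
def acquire_daemon_lock(path)
  FileUtils.mkdir_p(File.dirname(path))
  file = File.open(path, File::RDWR | File::CREAT, 0644)
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    # Record our pid so a human can see who holds the lock.
    file.truncate(0)
    file.write("#{Process.pid}\n")
    file.flush
    file
  else
    file.close
    nil
  end
end

# Sketch of the top of a daemon script such as foo_d (hypothetical):
#
#   lock = acquire_daemon_lock(File.join(Dir.home, '.foo_d', 'lock'))
#   exit 0 unless lock   # another copy is running, nothing to do
#   # ... start the DRb service and block here ...
```

A crontab entry along the lines of `*/15 * * * * /path/to/foo_d` (the path is a placeholder) would then start the script every 15 minutes; each run either exits immediately or takes over because the previous process is gone.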

i have used this approach in production systems with great success for
several years. it’s robust and does not even require root privs.

regards.

-a

unknown wrote:

On Wed, 20 Sep 2006, Ben J. wrote:

i have used this approach in production systems with great success for
several years. it’s robust and does not even require root privs.

regards.

-a

Can you attach some code that I could use? I’m still kind of lost, but
it sounds good. I really appreciate your help.