High-volume DRb server errors

I’ve run into an interesting problem using DRb as the frontend for
publishing one of our services. We have a single Ruby process publishing
a service via DRb. The service is consumed by 100-300 clients at any
given time.
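
For context, a setup of this shape can be sketched as follows. This is
a minimal illustration, not the actual service: EchoService and
start_echo_service are made-up names, and port 0 (let the OS pick a
free port) is an assumption chosen so the example can run anywhere.

```ruby
require 'drb/drb'

# Hypothetical front object standing in for the real service.
class EchoService
  def echo(message)
    message
  end
end

# Start a DRb server on localhost. Port 0 asks the OS for a free port;
# the URI actually bound is available from the returned server object.
def start_echo_service(uri = 'druby://127.0.0.1:0')
  DRb.start_service(uri, EchoService.new)
end

if __FILE__ == $PROGRAM_NAME
  server = start_echo_service
  puts "serving on #{server.uri}"

  # A client would normally run in another process. When the URI belongs
  # to a server in this same process, DRb dispatches the call directly
  # instead of opening a socket, so this only demonstrates the API.
  remote = DRbObject.new_with_uri(server.uri)
  puts remote.echo('ping')

  server.stop_service
end
```

Each real client would create its own DRbObject against the server's
URI from a separate process.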

When the server first starts, everything works fine. After a while,
though, the number of threads climbs from ~250 (one per client) to
~400. During this period, the server still responds to some requests,
but they are quite slow.

Soon after the number of threads tops 400, it turns a corner and shoots
up to 1024 threads. The growth between 400 and 1024 is very fast. Once
it reaches 1024 threads, it stops growing. This is strange because the
clients keep trying to make requests; the requests just aren’t being
processed anymore.

If I kill all the clients once the server has stopped responding, the
thread count eventually drops most of the way down, but never all the
way. Most often it settles at 64 and won’t go any lower. At this point,
the Ruby process is using 100% of the CPU, though what it is doing is
anyone’s guess. It does not become responsive again until it is
restarted. There are also quite a lot of TCP sockets left in the
CLOSE_WAIT state on the server.
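
One way to watch thread counts like these from inside the server
process is sketched below. This is an assumption about tooling, not
necessarily how the numbers above were obtained; thread_summary and
start_thread_monitor are hypothetical helper names. It counts
Ruby-level threads, which may not match what OS tools report.

```ruby
# Count live Ruby threads in this process, grouped by status
# ("run", "sleep", ...). Thread.list only returns threads that are
# still alive.
def thread_summary
  Thread.list.each_with_object(Hash.new(0)) do |thread, counts|
    counts[thread.status.to_s] += 1
  end
end

# Hypothetical helper: print the summary every `interval` seconds from
# a background thread (which is itself included in the count).
def start_thread_monitor(interval = 5, out = $stderr)
  Thread.new do
    loop do
      out.puts "threads=#{Thread.list.size} #{thread_summary.inspect}"
      sleep interval
    end
  end
end
```

Calling start_thread_monitor once at server startup would give a
running log to line up against the slowdown.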

I have attached two sample files I’ve used to reproduce this error in
isolation. Run test_server.rb, then run “ruby test_client.rb start 50”
to max out the number of threads on the server fairly quickly.

Has anyone encountered this kind of problem before?

It’s also worth noting that DRb has no way at all of protecting itself
from too many connections: it will gladly run into the limit on open
file handles and then fail to handle the error correctly. (I raised the
open file handle limit on my production system because I thought that
was the issue. It appears not to be.) Should DRb be extended with more
defensive server features, or is it just not intended for this kind of
use?
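
As an illustration of the kind of safeguard meant here, the sketch
below wraps a front object so that calls are rejected once a fixed
number are already in flight. LimitedFront and Busy are hypothetical
names and not part of DRb. Note the limitation: this only bounds
concurrent method calls. It does not stop DRb from accepting new
sockets or starting a thread per client, so it does not address the
file handle limit by itself.

```ruby
# Hypothetical wrapper, not part of DRb: forwards calls to `target`
# but raises Busy when `limit` calls are already running.
class LimitedFront
  class Busy < StandardError; end

  def initialize(target, limit)
    @target = target
    @limit = limit
    @active = 0
    @lock = Mutex.new
  end

  # Forward any public method the target understands, holding one
  # slot for the duration of the call.
  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)

    acquire
    begin
      @target.public_send(name, *args, &block)
    ensure
      release
    end
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end

  private

  # Take a slot, or fail fast if none are free.
  def acquire
    @lock.synchronize do
      raise Busy, "#{@limit} calls already in flight" if @active >= @limit
      @active += 1
    end
  end

  def release
    @lock.synchronize { @active -= 1 }
  end
end
```

Such a wrapper would be passed to DRb.start_service in place of the
bare service object, for example LimitedFront.new(service, 200), where
200 is an arbitrary illustrative limit.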

Could anyone recommend a better, lightweight, robust Ruby RPC library?