A little extra stability

Gary_S · June 21, 2006, 12:38pm

I’m on Dreamhost and this definitely works for me, but it might be
worthwhile if you try it on your own hosting provider. I’ve said for
a while now that most of the time I don’t have problems with Typo …
but with Rails and fastcgi and everything else playing nicely together.

I noticed this in the Dreamhost wiki:

“Dreamhost regularly kills off sleeping processes with their
watchdog. This will kill your dispatch.fcgi processes, leading to
Error 500s from time to time. You’ll need to make dispatch.fcgi
ignore all TERM requests by changing how it responds to them.”

Sounded very familiar to me so I put the following in dispatch.fcgi
(after require 'fcgi_handler'):

    class RailsFCGIHandler
      private
        # Log the TERM request, then restart instead of exiting.
        def frao_handler(signal)
          dispatcher_log :info, "asked to terminate immediately"
          dispatcher_log :info, "frao handler working its magic!"
          restart_handler(signal)
        end
        # Route the stock exit-now handler to the restart version above.
        alias_method :exit_now_handler, :frao_handler
    end

    RailsFCGIHandler.process!
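
For anyone wondering how the alias trick works, here is a minimal, self-contained sketch. ToyHandler is a hypothetical stand-in written for this illustration, not the real RailsFCGIHandler, and the assumption that TERM is dispatched to exit_now_handler is inferred from the snippet above rather than taken from the Rails source.

```ruby
# Hypothetical stand-in for RailsFCGIHandler (illustration only).
class ToyHandler
  attr_reader :log, :when_ready

  def initialize
    @log = []
    @when_ready = nil
  end

  # Pretend signal dispatch: TERM goes to exit_now_handler, HUP to restart_handler.
  def handle(signal)
    case signal
    when "TERM" then exit_now_handler(signal)
    when "HUP"  then restart_handler(signal)
    end
  end

  private

  def exit_now_handler(signal)
    @log << "asked to terminate immediately"
    @when_ready = :exit
  end

  def restart_handler(signal)
    @log << "asked to restart ASAP"
    @when_ready = :restart
  end
end

# Reopen the class and point exit_now_handler at a handler that restarts.
class ToyHandler
  private

  def frao_handler(signal)
    @log << "asked to terminate immediately"
    @log << "frao handler working its magic!"
    restart_handler(signal)
  end
  alias_method :exit_now_handler, :frao_handler
end

handler = ToyHandler.new
handler.handle("TERM")
puts handler.log
puts handler.when_ready
```

After the alias, a TERM request in this toy class ends in the restart path, which mirrors the three log lines shown further down.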

I’d had that code in last summer, and it’d been replaced in an
upgrade and I’d never put it back. To be honest with you after the
release of Rails 1 (and rake and something else I think) I was under
a misconception that I wouldn’t need it anymore. But it really has
made a massive difference to the number of errors I get.

Dare I say it? … the install is stable. My log sizes have dropped
by 70% at least. In fact looking in the fastcgi crash log I can see
that it’s working:

[21/Jun/2006:01:49:43 :: 4123] asked to terminate immediately
[21/Jun/2006:01:49:43 :: 4123] frao handler working its magic!
[21/Jun/2006:01:49:43 :: 4123] asked to restart ASAP

May be well worth putting that into dispatch.fcgi … can’t see it
hurting.

Gary

Gary_S · June 21, 2006, 2:49pm

I can confirm this. I’m also running typo at dreamhost.

I use feedburner and as you probably know feedburner warns you when
there is something wrong with your sourcefeed. So in fact you know
when there’s a problem with your blog.

Before I had about 20+ messages/warnings a day and since I added the
piece of code below I see about 2 to 3 messages a week or so. Big
improvement if you ask me.

Gary_S · June 21, 2006, 3:20pm

On 21 Jun 2006, at 13:47, Koen Van der Auwera wrote:

I can confirm this. I’m also running typo at dreamhost.

That’s good to know, not just me then.

I use feedburner and as you probably know feedburner warns you when
there is something wrong with your sourcefeed. So in fact you know
when there’s a problem with your blog.

Heh, actually I didn’t know. But I do now. People trying to leave
comments email me and I check logs … I’ll look at that though.
But it’s a good point to note that a 500 error can be thrown up when
somebody tries to leave a comment … and Feedburner won’t let you
know about that … will it?

Before I had about 20+ messages/warnings a day and since I added
the piece of code below I see about 2 to 3 messages a week or so.
Big improvement if you ask me.

Agreed … it’s a big relief.

G

Gary_S · June 21, 2006, 6:37pm

On the other hand, this may just make them (DreamHost) angry because
they don’t want a lot of long-lived (rather large) processes hanging
out on their systems. The 500s are because (a) the FCGI connector
isn’t good about detecting failed connections ahead of a request
coming in asking to use the connection and (b) Rails can’t really get
itself restarted in a short enough time to get there before the
timeout.

It’s also worth noting that if you’re capturing SIGTERM, you’re
capturing your chance for your process to die while cleaning up its
mess. Normally the “please restart” signal is SIGHUP. If someone has
to do maintenance on these systems and kill -TERMing your processes
doesn’t work, it’s going to head right on to kill -9, which may or may
not be a desirable state of affairs.
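
To make the TERM versus HUP distinction concrete, here is a rough sketch in plain Ruby using Signal.trap. It is not how dispatch.fcgi is wired up; the events array and the symbols in it are made up for this illustration. The process sends both signals to itself and records what each one asked for.

```ruby
# Records what each signal asked for (names are illustrative only).
events = []

# SIGHUP: the conventional "please restart / reload" request.
Signal.trap("HUP")  { events << :restart_requested }

# SIGTERM: the polite "please exit" request. A handler here is the
# process's chance to clean up before going away.
Signal.trap("TERM") { events << :terminate_requested }

# SIGKILL (kill -9) cannot be caught at all, so no handler is installed for it.

# Send both signals to this very process.
Process.kill("HUP", Process.pid)
Process.kill("TERM", Process.pid)

# Give the handlers a moment to run, with a short timeout.
deadline = Time.now + 2
sleep 0.01 until events.size >= 2 || Time.now > deadline

events.each { |event| puts "received: #{event}" }
```

If the TERM handler restarts instead of exiting, an admin who sends TERM sees a process that never goes away, and the next step is kill -9, which skips any cleanup.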

Gary_S · June 21, 2006, 3:48pm

I also have that code in my Rails apps on DH.

It works so far… seems kinda hokey, but if it works for now.

Gary_S · June 21, 2006, 8:37pm

On 21 Jun 2006, at 17:34, Andy Carrel wrote:

On the other hand, this may just make them (DreamHost) angry
because they don’t want a lot of long-lived (rather large)
processes hanging out on their systems. The 500s are because (a)
the FCGI connector isn’t good about detecting failed connections
ahead of a request coming in asking to use the connection and (b)
Rails can’t really get itself restarted in a short enough time to
get there before the timeout.

Well DH are well aware of this fix and I’m sure if you were running
RoR code that caused issues then they’d let you know, and like I said
I ran this for months last year when they were new to RoR and were
more likely to err on the side of caution. It’s not keeping
processes alive, it’s just restarting instead of exiting - unless
I’ve misunderstood something…

It’s also worth noting that if you’re capturing SIGTERM, you’re
capturing your chance for your process to die while cleaning up
its mess. Normally the “please restart” signal is SIGHUP. If
someone has to do maintenance on these systems and kill -TERMing
your processes doesn’t work it’s going to head right on to kill -9
which may or may not be a desirable state of affairs.

I don’t understand what the concern is here. In fact once a day at
an ungodly hour I kill -9 the fcgi processes … I think it’s good
housekeeping.

All in all it stabilizes the environment for Typo and I haven’t
heard any problems with it … only good things. I don’t quite get
what concerns you have … but server set-up and fcgi playing nicely
is still a dark(ish) art to me.

G