Properly restarting mongrel instances

Hi folks.

Using mongrel_rails and the mongrel_cluster capistrano recipes, I
often encounter a situation where some of the mongrel processes don’t
die in time to be restarted. The output of capistrano will tell me
something like “mongrel on port 8001 is already up”, but that’s only
because capistrano/mongrel_rails failed to take it down in the first
place.

My current workaround is to run deploy:stop a couple of times to make sure
they are all down, and then run deploy:start.
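
What I have in mind is something along these lines, as a rough
Capistrano 2 style sketch (the task name, pid-file path, and 30-second
timeout are just illustrative, not part of the stock mongrel_cluster
recipes):

task :careful_restart, :roles => :app do
  deploy.stop
  # Poll for up to 30 seconds until the mongrel pid files disappear, so
  # we don't try to start new mongrels while the old ones still hold
  # their ports and pid files.
  run "i=0; while ls #{shared_path}/pids/mongrel.*.pid >/dev/null 2>&1 " \
      "&& [ $i -lt 30 ]; do sleep 1; i=$((i+1)); done"
  deploy.start
end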

Is my problem typical? Is there a solution? Seems like mongrel_rails
and/or the capistrano recipes should wait for the processes to stop
before attempting to restart them.

Thanks for any insight,
John


John Joseph B.
http://blog.johnjosephbachir.org
http://lyceum.ibiblio.org

http://jjb.cc

We run into this problem a lot as well. The problem can be exacerbated
when a mongrel has a backlog of work, or has bloated to a point that
it is heavily swapped. The mongrels always get the shutdown signal,
but they don’t act on it fast enough to clear their pid file by the
time the start is actioned.

In our case those mongrels will eventually quit and monit will restart
them, but it’s not ideal.

If cluster::restart supported a --delay parameter, that would go some
way toward fixing the problem.

Cheers

Dave

This patch to mongrel cluster adds a check wait between each start/stop:

http://rubyforge.org/tracker/download.php/1306/5147/15427/2761/rolling-restart.patch

“check wait” is defined as:
(1) stop the mongrel on that port
(2) check if it’s really dead
(3) if it’s not, wait 1 second and check again
(4) if it’s still not dead after 10 seconds, send a force quit

It also does a “rolling restart”, stopping and restarting each mongrel
one at a time rather than taking down the whole batch (good for
load-balanced setups).
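
In plain Ruby, the check-wait part looks roughly like this (the method
names and the kill-0 liveness test are my own illustration here, not
the patch’s literal code):

def stop_and_wait(pid_file, timeout = 10)
  pid = File.read(pid_file).to_i
  Process.kill("TERM", pid)                 # polite shutdown request

  timeout.times do
    return true unless alive?(pid)          # really dead: safe to restart
    sleep 1                                 # not yet: wait and check again
  end

  Process.kill("KILL", pid) if alive?(pid)  # still hanging on: force quit
  true
rescue Errno::ESRCH
  true                                      # process was already gone
end

def alive?(pid)
  Process.kill(0, pid)                      # signal 0 only checks existence
  true
rescue Errno::ESRCH
  false
end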

I’ve been using it in production on 20 servers (160 mongrels total) via a
monkey-patched mongrel_rails script for a while now with good effect
(ymmv). However, it’s not been accepted into any mongrel cluster
releases yet because I’ve heard they’re revamping the whole package.

I am running a number of Rails apps on a quite powerful server (dual
quad-core xeons, 8G ram, raid 10) running FreeBSD. I’m using a fairly
simple software stack: Apache22, mod_proxy, and a single mongrel
instance for each website. Apache is serving all static content.

These websites are not MySpace, or anything like it. They are typical
small-business websites that get a few thousand pageviews per day, not
millions.

The websites are extremely fast-loading, apparently very stable
(nothing has failed in the month or so since I’ve switched over from
a FastCGI setup), and I love the simplicity.

My question: This was supposed to be a first step towards using a
mongrel cluster, but the single mongrel instance seems to work
perfectly fine. Can I keep using it, as long as the loads stay at
modest levels? I don’t want to move to a more complex set up just
because it would be cool or fun to do. If a single instance will do
the job, then simple is better, IMHO.

Am I running any risks with this set up? The one I can think of is
there is no redundancy: if that single mongrel instance fails, the
site is down. Has anyone tried monitoring the single instance, and
restarting it if it fails? Is there anything else I should be
worrying about?

Any advice, much appreciated.

Thanks: John

On Jan 22, 2008 7:46 AM, John A. [email protected] wrote:

I am running a number of Rails apps on a quite powerful server (dual
quad-core xeons, 8G ram, raid 10) running FreeBSD. I’m using a fairly
simple software stack: Apache22, mod_proxy, and a single mongrel
instance for each website. Apache is serving all static content.
.
.
.
The websites are extremely fast-loading, apparently very stable
(nothing has failed in the month or so since I’ve switched over from
a FastCGI setup), and I love the simplicity.

I concur.

My question: This was supposed to be a first step towards using a
mongrel cluster, but the single mongrel instance seems to work
perfectly fine. Can I keep using it, as long as the loads stay at
modest levels? I don’t want to move to a more complex set up just
because it would be cool or fun to do. If a single instance will do
the job, then simple is better, IMHO.

You can absolutely keep running things that way. I’ve run the same
sort of sites for years, and the vast majority of them have been done
in exactly that way. It works just fine, and IMHO, more people should
be deploying Rails apps in that sort of simple manner.

Am I running any risks with this set up? The one I can think of is
there is no redundancy: if that single mongrel instance fails, the
site is down. Has anyone tried monitoring the single instance, and
restarting it if it fails? Is there anything else I should be
worrying about?

That’s about it. My sites aren’t Rails sites, but having a site down
because of an instance failing just isn’t something that happens
unless one has a bug in one’s code, in my experience. To be perfectly
safe, just use monit or something similar to keep an eye on your
processes, as you mentioned. Depending on your code, you should be
able to handle quite a bit of traffic on each site without having to
worry about switching them to a clustered configuration.

Kirk H.

Hi John,

Am I running any risks with this set up? The one I can think of is
there is no redundancy: if that single mongrel instance fails, the
site is down. Has anyone tried monitoring the single instance, and
restarting it if it fails? Is there anything else I should be
worrying about?

We use Monit to monitor our instances and Monit both restarts and
reports via email if anything fails. We schedule Monit to check every
60 seconds, so a failed instance is barely noticed.

Matthew L.ham



Kirk & Matthew,

Thanks for confirming that this config makes sense. I’m going to
stick with it until one of my clients needs more.

Also, thanks for the lead on monit. I’d been using a home-brew
restart script, but a quick look at the surprisingly good monit docs
makes me think this is the tool I’ve been looking for.

Mongrel is great, BTW. One of those programs that just works.

Thanks: John

On 22 Jan 2008, at 21:57, John A. wrote:

Also, thanks for the lead on monit. I’d been using a home-brew
restart script, but a quick look at the surprisingly good monit docs
makes me think this is the tool I’ve been looking for.

John,

You may also like to consider the modestly named God:

http://god.rubyforge.org/

The biggest difference from Monit is that its configuration is
written in Ruby.
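
A minimal config for a single mongrel looks roughly like the sketch
below (the paths, port, check interval, and memory threshold are just
examples, loosely adapted from the god docs rather than copied from
them):

God.watch do |w|
  w.name     = "mongrel-8000"
  w.interval = 60.seconds
  w.start    = "mongrel_rails start -d -e production -p 8000 " \
               "-P /var/www/myapp/log/mongrel.8000.pid -c /var/www/myapp"
  w.stop     = "mongrel_rails stop -P /var/www/myapp/log/mongrel.8000.pid"
  w.pid_file = "/var/www/myapp/log/mongrel.8000.pid"

  w.behavior(:clean_pid_file)          # remove a stale pid file on start

  # Start the mongrel whenever it isn't running.
  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false
    end
  end

  # Restart a mongrel that has bloated past 150 MB for five checks in a row.
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 150.megabytes
      c.times = 5
    end
  end
end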

With regards,
Andy S.

On Jan 22, 2008 10:07 AM, Kirk H. [email protected] wrote:

I concur.
…the vast majority of them have been done in exactly that way. It
works just fine, and IMHO, more people should be deploying Rails apps
in that sort of simple manner.

This is an interesting discussion. The conclusions are a bit confusing
to me. John-- you say that your sites get a few thousand hits a day.
With only one mongrel, can’t the system only serve one request at a
time? It seems like, independent of system performance, just the fact
that the requests have to be done sequentially would have a big impact
on the performance of the site.

Is this not the case, that with one mongrel, only one request can be
served at a time?

John



On Jan 29, 2008 10:02 AM, John Joseph B.
[email protected] wrote:

This is an interesting discussion. The conclusions are a bit confusing
to me. John-- you say that your sites get a few thousand hits a day.
With only one mongrel, can’t the system only serve one request at a
time? It seems like, independent of system performance, just the fact
that the requests have to be done sequentially would have a big impact
on the performance of the site.

Is this not the case, that with one mongrel, only one request can be
served at a time?

It’s not quite as cut-and-dried an issue as that because, while there
are a variety of threaded, nonthreaded, and event-based combinations
among different frameworks and app container choices, Ruby 1.8’s
threading is via green threads anyway.

However, for sake of argument, let’s just say that the answer is
“yes”. Only one request can be served at a time.

It becomes a question of how long it takes for an application to serve
a request. And for, I think, the vast majority of typical
business-oriented dynamic web sites, if properly built, the answer
should be “not long at all.”

My absolute slowest sites, on my slowest server, still have a capacity
of 6-10 requests/second, and I wouldn’t tolerate that performance if I
were building those sites today instead of 5 years ago.

6 requests per second means that if it is pushed to its limit for
even one hour, it has served more than 21000 requests in that hour.
Even if it were a lethargically slow 1 request/second, that’s a
capacity of 3600 requests in an hour, and that capacity easily meets
the needs of the common business site.

The fastest sites on this same slowest server, which are 100%
dynamically rendered and perform database transactions with every
request, can turn about 130 r/s at peak. This is a several-year-old
32-bit Athlon-based Linux box proxying through Apache.

And just to prove a point, here are a couple of timings from a couple
of different sites on one of my faster servers, which is a newer 3.0
GHz Xeon-based Linux box proxying through Swiftiply:

First, a “normal” business site, with all content and navigation
dynamically rendered (it is managed by a CMS and stored in a database
table):

Requests per second: 436.87 [#/sec] (mean)

Second, a “fast” site. All of the requests are still dynamic template
renderings, but it has been tuned to be fast:

Requests per second: 1167.21 [#/sec] (mean)

Even if one’s app is so slow that it can only render a handful of
pages per second, there is still a vast, expansive forest of sites and
apps where that performance, on a single process, is more than
adequate for even the heaviest traffic that the site will get. That
performance will still serve thousands to tens of thousands of people
per day.

Kirk H.

This ‘god’ thing looks promising. :)

I’m curious if anyone has seen anything similar for a Windows
environment (Rails monitoring software for Windows). I’d be interested
to hear any other suggestions for more generalized monitoring software
for Windows servers. Open source is preferable, but I might consider a
commercial product if it seems like a good fit. Has anyone heard
anything about this: http://fiveruns.com/products - it seems to be the
only Google result.

Sorry this is kind of off-topic. Please feel free to e-mail me
privately so this won’t spam the rest of the list, and I’ll post the
replies later for anyone who is interested. Also, I’m using Mongrel, so
I can tie it in that way. ;)

Cheers, Stephen

On Jan 29, 2008, at 12:43 PM, Kirk H. wrote:

should be “not long at all.”

Kirk beat me to the answer, but my answer is the same.

While this might be a theoretical problem, it’s not an actual
problem. Particularly since mongrel is only serving up the first
request of each page view – the HTML. Apache handles subsequent
requests for images, JavaScript, CSS files, etc.

So even in the rare case of simultaneous page requests, the second
request would only have to wait milliseconds.

Again, this wouldn’t work for eBay, but it works fine for my size
clients.

– John



On Jan 29, 2008, at 2:19 PM, Erik Hetzner wrote:

request would only have to wait milliseconds.

Again, this wouldn’t work for eBay, but it works fine for my size
clients.

A word of caution. If you do not test your app in a setup in which
requests are not sequential, you will (will, not might) run into
concurrency issues if you need to move to a multiple-process or
multi-threaded setup.

H’mmm… I’m afraid this is a bit over my head. Can you elaborate?

– John

John Joseph B. wrote:

Hi folks.

Using mongrel_rails and the mongrel_cluster capistrano recipes, I
often encounter a situation where some of the mongrel processes don’t
die in time to be restarted. The output of capistrano will tell me
something like “mongrel on port 8001 is already up”, but that’s only
because capistrano/mongrel_rails failed to take it down in the first
place.

My current workaround is to run deploy:stop a couple of times to make sure
they are all down, and then run deploy:start.

Is my problem typical? Is there a solution? Seems like mongrel_rails
and/or the capistrano recipes should wait for the processes to stop
before attempting to restart them.

Thanks for any insight,
John

Most of the responses assume that waiting for your mongrels to stop is
better than sending them the signal and continuing on with starting a
new batch of servers.

I don’t see a problem with this, unless the old processes, after
finishing off any requests in the pipeline, start picking up new
requests…can anyone verify that a “stop” command to a mongrel cluster
will keep the mongrel(s) that were sent the signal from serving new
requests?

Assuming that is true, then it already would be a “rolling restart”,
from my understanding.

At 09:02 AM 1/29/2008, [email protected] wrote:

With only one mongrel, can’t the system only serve one request at a
time? It seems like, independent of system performance, just the fact
that the requests have to be done sequentially would have a big impact
on the performance of the site.

Is this not the case, that with one mongrel, only one request can be
served at a time?

John

Hi John,

Rails can only run one request at a time. So if you’re running a stack
that includes Mongrel and Rails, then, yes, Mongrel will be reduced to
blocking and waiting for Rails to return before sending another request
through the pipe (effectively making Mongrel single-threaded too). If
you had a truly multi-threaded app framework, then Mongrel would
happily support multiple simultaneous calls into it.
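
A toy illustration of why that is (simplified from memory, not
mongrel’s actual source): mongrel handles each connection in its own
thread, but its Rails handler wraps every dispatch in a single mutex
because Rails of that era isn’t thread-safe, so concurrent requests
simply queue on the lock.

require 'thread'

class SimplifiedRailsHandler
  def initialize
    @guard = Mutex.new               # one lock shared by all request threads
  end

  def process(request)
    @guard.synchronize do
      # Only one thread at a time gets past this point, so Rails only ever
      # sees one request at a time even though mongrel accepts many sockets.
      handle_in_rails(request)
    end
  end

  private

  # Stand-in for the Rails dispatcher; it just simulates a 50 ms action here.
  def handle_in_rails(request)
    sleep 0.05
    "200 OK for #{request}"
  end
end

# Ten concurrent "requests" still complete strictly one after another.
handler = SimplifiedRailsHandler.new
threads = (1..10).map { |i| Thread.new { handler.process("req-#{i}") } }
threads.each { |t| t.join }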

From my understanding, Mongrel is like Apache and Nginx in that if you
increase the load on it, it will keep delivering more throughput up to
the limits of the hardware it runs on.

Rails, on the other hand, has other limitations. :)

Other people with more expertise may have better information, but I
believe that’s basically the sitch.

Steve