Hello,

We're interested in hearing from folks who have worked out a set of "best practices" for monitoring mongrels with monit. Our local setup is remarkably similar to the one currently found in the Pragmatic Programmers "Deploying Rails Apps" beta PDF, as well as several other examples that we've seen floating around on the web.

Has anyone gone to the trouble of doing a pro/con analysis of a couple of different approaches, so that we can get a better idea of what folks have tried?

[We'd be sort of miserable if we didn't have monit... it allows us to build some beautiful things!]

Thanks very much in advance,
--elijah
on 05.03.2008 03:43
on 09.03.2008 06:31
Here is a good recipe for a mongrel process, as provided on some site I found (I believe the forum post was by Ezra):

check process mongrel_cluster_8000
  with pidfile /var/apps/twocb/shared/pids/mongrel.8000.pid
  start program = "/usr/bin/mongrel_rails cluster::start -C /var/apps/twocb/current/config/mongrel_cluster.yml --clean --only 8000"
  stop program = "/usr/bin/mongrel_rails cluster::stop -C /var/apps/twocb/current/config/mongrel_cluster.yml --clean --only 8000"
  if totalmem is greater than 110.0 MB for 4 cycles then restart  # eating up memory?
  if cpu is greater than 50% for 2 cycles then alert              # send an email to admin
  if cpu is greater than 80% for 3 cycles then restart            # hung process?
  if loadavg(5min) greater than 10 for 8 cycles then restart      # bad, bad, bad
  if 20 restarts within 20 cycles then timeout                    # something is wrong, call the sys-admin
  group mongrel_cluster
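Since a recipe like this has to be repeated once per mongrel port, it may help to generate the stanzas rather than copy them by hand. The following is only a minimal sketch: the helper names `mongrel_stanza` and `monit_config` are hypothetical, the paths are taken from the recipe above, and only some of its thresholds are reproduced.

```ruby
# Build one monit "check process" stanza per mongrel port.
# mongrel_stanza and monit_config are hypothetical helper names;
# app_root and the thresholds simply mirror the recipe above.
def mongrel_stanza(port, app_root: "/var/apps/twocb")
  config = "#{app_root}/current/config/mongrel_cluster.yml"
  <<~MONIT
    check process mongrel_cluster_#{port}
      with pidfile #{app_root}/shared/pids/mongrel.#{port}.pid
      start program = "/usr/bin/mongrel_rails cluster::start -C #{config} --clean --only #{port}"
      stop program = "/usr/bin/mongrel_rails cluster::stop -C #{config} --clean --only #{port}"
      if totalmem is greater than 110.0 MB for 4 cycles then restart
      if cpu is greater than 80% for 3 cycles then restart
      if 20 restarts within 20 cycles then timeout
      group mongrel_cluster
  MONIT
end

# Join the stanzas for every port into one config fragment.
def monit_config(ports)
  ports.map { |port| mongrel_stanza(port) }.join("\n")
end

# Print a fragment for three ports when run as a script.
puts monit_config(8000..8002) if __FILE__ == $PROGRAM_NAME
```

The output could be redirected into a file that monitrc includes, so that adding a mongrel means changing one port range instead of pasting another stanza.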
on 26.03.2008 22:55
Does anyone have any tips for checking to see if the mongrel instance
is responding to HTTP? I just tried doing this:
if failed host 127.0.0.1 port 8001 protocol http
with timeout 10 seconds
then restart
but when I have that code in my monit script it gives me this error:
mongrel_rails_8001' failed protocol test [HTTP] at INET[127.0.0.1:8018]
via TCP
Thanks for any tips.
Mike
on 26.03.2008 23:53
Well, something doesn't seem right:

-> mongrel_rails_8001' failed protocol test [HTTP] at INET[127.0.0.1:8018]

The alert names 127.0.0.1:8018. Is port 8018 intended, or do you want port 8001?

Also, I leave the host parameter out of the monit config:
if failed port 8015 protocol http
with timeout 20 seconds
then restart
on 27.03.2008 00:10
Sorry, I was copying and pasting from an alert email. The ports
do match up in my monit script.
Here's a real example:
#8002
check process mongrel_rails_8002 with pidfile /var/run/mongrel/mongrel.8002.pid
  group mongrel
  start program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails start -d -e production -a 127.0.0.1 -c /opt/rails/radius --user mongrel --group mongrel -p 8002 -P /var/run/mongrel/mongrel.8002.pid -l /var/log/mongrel/mongrel.8002.log"
  stop program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails stop -P /var/run/mongrel/mongrel.8002.pid"
  if totalmem > 100.0 MB for 5 cycles then restart
  if failed host 127.0.0.1 port 8002 protocol http
    with timeout 10 seconds
    then restart
on 27.03.2008 03:47
Well, there are a few things to think about.

1. I've got a script that checks nginx with a "success" message.
2. Are you checking if mongrel is up, or rails? If it's mongrel, you might want to write a custom route so it's not loading the framework stack every time. If it's rails, you should create something like a MonitController that will respond with "success" when asked for the index action.

Doing those things has worked quite well for me so far. Just be sure to disable the logging in the monit controller so it doesn't fill your logs:

def logger
end
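To illustrate the "custom route" idea of answering the health check without going through the framework stack, here is a rough sketch of a standalone endpoint. It assumes a Rack-style calling convention (an env hash in, a status/headers/body triple out), which is not necessarily what the setup described in this thread uses. The constant name `PULSE_APP` and the "/pulse" path are illustrative choices, not part of the original advice.

```ruby
# Minimal Rack-style health endpoint (an assumed interface, see above).
# It answers "/pulse" with "success" and never touches controllers,
# sessions, or the application log.
PULSE_APP = lambda do |env|
  if env["PATH_INFO"] == "/pulse"
    [200, { "content-type" => "text/plain" }, ["success"]]
  else
    [404, { "content-type" => "text/plain" }, ["not found"]]
  end
end

# Exercise the endpoint directly, without a web server.
status, _headers, body = PULSE_APP.call({ "PATH_INFO" => "/pulse" })
puts "#{status} #{body.join}"
```

How such an endpoint gets mounted depends on the server and framework version in use. Note that a check like this shows only that the process answers HTTP, not that the application behind it is healthy.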
on 27.03.2008 17:57
Thanks Joey. I have my Apache set up to respond "OK" to monit to test my Apache instances, but how do I write my monit script to hit the MonitController.index action? Is it the same config as I would use for Apache?

Thanks,
Mike
on 27.03.2008 19:53
Here's a blog post on suppressing the logging: http://www.ruby-forum.com/topic/95146

Here's my bit from my monitrc:

if failed port 8000 protocol http
  and request "/pulse"
  with timeout 10 seconds
  then restart

where "pulse" is defined in my routes.rb.
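Before wiring a check like this into monit, it can be handy to run the same request by hand and confirm the "/pulse" route answers within the timeout. The sketch below uses Ruby's standard Net::HTTP. The method name `pulse_ok?` is hypothetical, and treating only 2xx responses as healthy is an assumption of this sketch that may not match monit's own rule.

```ruby
require "net/http"

# Returns true when GET `path` on host:port answers with a 2xx status
# within `timeout` seconds. Any connection error or timeout counts as
# unhealthy. pulse_ok? is a hypothetical helper name.
def pulse_ok?(port, host: "127.0.0.1", path: "/pulse", timeout: 10)
  http = Net::HTTP.new(host, port)
  http.open_timeout = timeout   # limit on establishing the connection
  http.read_timeout = timeout   # limit on waiting for the response
  http.get(path).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end

# Report the state of the mongrel on port 8000 when run as a script.
puts(pulse_ok?(8000) ? "up" : "down") if __FILE__ == $PROGRAM_NAME
```

Running it against each mongrel port gives a quick sanity check that the route exists before monit starts restarting processes over a missing or misspelled path.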
on 28.03.2008 21:31
Thanks guys - this works great for me. Much appreciated. Mike