Well, it might be, depending on the timing of the heartbeat and
whether/when a particular request causes Nginx to try that backend.
Right. That is my point. "Might" doesn't cut it for our high-availability
needs. We need to KNOW. When nginx takes an upstream server out of
rotation, I need it to tell me. That takes all the guesswork out of it.
Bottom line is that it doesn’t make any difference whether a
monitoring script says an upstream server is down or not. What
matters is whether nginx considers it down or not. And for me to know
that, nginx needs to tell me.
But it does. It's in your error logs. There are alternative loggers that
can even run a script for you when a regex is matched (metalog, for
one). I've used metalog successfully to deter brute-force ssh attacks,
for example.
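For reference, metalog's run-a-command-on-match behavior lives in metalog.conf; a rough sketch might look like the following. The section name, regex, and script path are all hypothetical, and the exact wording of nginx's upstream-failure log lines varies by version, so the pattern would need tuning against real logs:

```
# Hypothetical metalog.conf section: run a script whenever a log line
# that looks like an nginx upstream failure is seen.
nginx upstream failures :
  regex   = "upstream.*(failed|unavailable)"
  command = "/usr/local/bin/notify-upstream-down"
```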
Thanks. I had thought of an approach like this. It is far from ideal,
but better than nothing. I still think that, given the high quality of
nginx and its focus on performance, telling us when a node is down, and
therefore when the performance of my app is likely affected, ought to be
a mandatory feature.
In addition to log monitoring being a less-than-great solution, the
other shortcoming is that there is no simple way for me to know whether
max_fails has been tripped. For instance, if I have max_fails at 3 and
fail_timeout at 60s, then I have to see whether metalog warns me 3 times
within 60s, and that assumes it can do so in real time while keeping up
with the log output. Then my script needs to keep track of whether
metalog flagged me 3 times in 60s, and so on. It just gets messy.
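The bookkeeping described above (counting failures per backend and deciding whether the max_fails-within-fail_timeout threshold has been crossed) can be sketched in Python. The log-line regex, thresholds, and function names here are illustrative assumptions, not anything nginx or metalog provides:

```python
import re
from collections import defaultdict, deque

# Mirror the hypothetical settings discussed above.
MAX_FAILS = 3        # corresponds to nginx's max_fails
FAIL_TIMEOUT = 60.0  # corresponds to nginx's fail_timeout, in seconds

# Rough guess at how an upstream address appears in an error-log line,
# e.g. ... upstream: "http://10.0.0.5:8080/..." ...
UPSTREAM_RE = re.compile(r'upstream[:"\s]+\"?https?://([\d.]+:\d+)')

failures = defaultdict(deque)  # backend -> timestamps of recent failures

def record_failure(backend, now):
    """Record one failure; return True if this backend has now failed
    MAX_FAILS times within the FAIL_TIMEOUT window."""
    window = failures[backend]
    window.append(now)
    # Discard failures that have aged out of the window.
    while window and now - window[0] > FAIL_TIMEOUT:
        window.popleft()
    return len(window) >= MAX_FAILS

def process_line(line, now):
    """Feed one error-log line; return the backend address if it just
    tripped the threshold, else None."""
    m = UPSTREAM_RE.search(line)
    if m and record_failure(m.group(1), now):
        return m.group(1)
    return None
```

A watcher script would feed tailed error-log lines through process_line and alert when it returns a backend address; the window-pruning means a slow trickle of failures never trips the alarm, matching the 3-within-60s semantics described above.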
Except that Nginx is asynchronous, not threaded. This means that when
your script is called, Nginx will be delayed while the script launches
(and what if the script fails?). You might be able to work around this,
but I suspect it won't be as trivial as you might hope.
I will address this in my reply to Manilo’s message. Thank you.