Does any module support health checks of the backend upstreams?

hi,
I want to implement a module that periodically checks the health of the
backend web servers. If an upstream server dies, the module should remove
it from the upstream list, and when the server comes back up, the module
should add it to the list again.
Is there a good way to implement this?

thank you
Chieu

On Sun, May 31, 2009 at 6:56 PM, Chieu [email protected] wrote:

hi,
I want to implement a module that periodically checks the health of the
backend web servers. If an upstream server dies, the module should remove
it from the upstream list, and when the server comes back up, the module
should add it to the list again.
Is there a good way to implement this?

thank you
Chieu

I think it’s already implemented in nginx.

See Module ngx_http_upstream_module

The server directive takes two relevant parameters: max_fails and fail_timeout.

I don’t think that removes the server and re-adds it. It sounds like he
wants health checking (I do too), and currently it sounds like the answer
is "do it outside of nginx: add the servers to and remove them from the
nginx config / include file as they come up and go down, and then
restart/reload nginx".

If nginx exposed an API of sorts to dynamically add/remove servers
from the upstream, this could be done via an external process without
having to manually create some upstreams.conf file and HUP the server
each time.

If nginx exposed an API of sorts to dynamically add/remove servers
from the upstream, this could be done via an external process
While I see nothing wrong with this in general…

without
having to manually create some upstreams.conf file and HUP the server
each time.
What is so wrong with this? Also, just for clarity (though I think
you meant it too) the external process can manage the upstreams.conf
and HUP nginx automatically! You might even be able to get something
like this going pretty quick with some small modifications to monit or
something similar!
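
For illustration, here is a minimal sketch in Python of such an external process. Everything in it is an assumption rather than an existing tool: the function names (is_alive, render_upstream, write_if_changed, reload_nginx, check_once) are hypothetical, a plain TCP connect stands in for a real health check, the max_fails/fail_timeout values are only examples, and the config and pid file paths are whatever your setup uses. It rewrites the include file and sends HUP only when the set of live servers actually changes.

```python
import os
import signal
import socket
import tempfile


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def render_upstream(name, servers):
    """Build an nginx upstream block listing the given (host, port) pairs."""
    lines = ["upstream %s {" % name]
    for host, port in servers:
        # max_fails / fail_timeout values here are illustrative only.
        lines.append("    server %s:%d max_fails=3 fail_timeout=30s;" % (host, port))
    lines.append("}")
    return "\n".join(lines) + "\n"


def write_if_changed(path, content):
    """Atomically replace path with content; return True if it changed."""
    try:
        with open(path) as f:
            if f.read() == content:
                return False
    except FileNotFoundError:
        pass
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(content)
    os.replace(tmp, path)
    return True


def reload_nginx(pid_file):
    """Send SIGHUP to the nginx master process whose pid is in pid_file."""
    with open(pid_file) as f:
        os.kill(int(f.read().strip()), signal.SIGHUP)


def check_once(name, backends, conf_path, pid_file,
               probe=is_alive, reload=reload_nginx):
    """Probe every backend once; rewrite the config and HUP only on change."""
    alive = [(h, p) for h, p in backends if probe(h, p)]
    if not alive:
        return False  # keep the old file rather than write an empty upstream
    changed = write_if_changed(conf_path, render_upstream(name, alive))
    if changed:
        reload(pid_file)
    return changed
```

Calling check_once in a loop with a sleep between rounds gives the periodic check. Because nothing is written when the live set is unchanged, nginx is only HUPed when a server actually goes down or comes back.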

– Merlin

This is something I would like very much too. But I’m not sure nginx
itself is the best place to do it. Maybe it is. I wouldn’t want
it to impact the performance of nginx though.

E.g., I currently have around 40 backend servers. If I want to check
their health every 5 seconds, that’s 8 checks a second (not much).
But if I had 200 backend servers, that becomes 40 checks a second, which
is a lot more checking for nginx to do.
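
The arithmetic behind those numbers, as a tiny Python sketch (checks_per_second is a hypothetical helper, and it assumes checks are spread evenly over the interval):

```python
def checks_per_second(servers, interval_seconds):
    """Average number of health checks issued per second."""
    return servers / interval_seconds


for n in (40, 200):
    print(n, "servers:", checks_per_second(n, 5), "checks/s")
```

With a 5-second interval, 40 servers work out to 8 checks a second and 200 servers to 40.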

How often do you really expect servers to go up and down? I think you
are correct, though, HUP can take a bit of time/resources. My point
is, are you really having upstreams die constantly? Seems like you
would have much worse problems than what it takes to HUP at that
point…

– Merlin

On Wed, Jun 10, 2009 at 2:21 PM, merlin corey [email protected]
wrote:

What is so wrong with this? Also, just for clarity (though I think
you meant it too) the external process can manage the upstreams.conf
and HUP nginx automatically! You might even be able to get something
like this going pretty quick with some small modifications to monit or
something similar!

It just seems a bit messy to HUP the server which probably does a lot
more work than updating an upstream array/construct.

Simply put, I determine the health of an upstream from
ngx_http_upstream_rr_peer_t::fails and
ngx_http_upstream_rr_peer_t::max_fails: if fails < max_fails, I treat the
server as up; otherwise I treat it as down.
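
To make the fails / max_fails idea concrete, here is a simplified model in Python. It is not the nginx source: the Peer class and its last_failure field are hypothetical, and the logic only approximates how, as far as I understand it, the round-robin balancer keeps using a peer while fails is below max_fails and retries a failed peer once fail_timeout has passed.

```python
import time
from dataclasses import dataclass


@dataclass
class Peer:
    max_fails: int = 1          # failures allowed before the peer is skipped
    fail_timeout: float = 10.0  # seconds a failed peer stays out of rotation
    fails: int = 0
    last_failure: float = 0.0   # hypothetical field: time of latest failure

    def record_failure(self, now=None):
        """Count one failed attempt against this peer."""
        self.fails += 1
        self.last_failure = time.time() if now is None else now

    def is_available(self, now=None):
        """Return True if requests may be sent to this peer."""
        now = time.time() if now is None else now
        # max_fails == 0 disables the accounting entirely.
        if self.max_fails == 0 or self.fails < self.max_fails:
            return True
        if now - self.last_failure > self.fail_timeout:
            self.fails = 0  # timeout elapsed: give the peer another chance
            return True
        return False
```

Note that this is a passive check: a peer is only marked down after real requests have failed against it, which is different from actively probing the backends.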

2009/6/11 Michael S. [email protected]

On Wed, Jun 10, 2009 at 2:45 PM, merlin corey[email protected]
wrote:

How often do you really expect servers to go up and down? I think you
are correct, though, HUP can take a bit of time/resources. My point
is, are you really having upstreams die constantly? Seems like you
would have much worse problems than what it takes to HUP at that
point…

In an infrastructure with tens or hundreds of servers, in theory you
could have one going up or down at any time.

Look at Amazon’s whitepaper about Dynamo or how Google addresses the
whole “commodity” issue. Things will go up and down at any time, and
you should handle it gracefully. nginx is almost capable of doing so:
I don’t think it would recover mid-transfer unless the client
re-issued the request with a range offset, but the
try-next-upstream approach already handles the other cases gracefully…

I’m looking to have a solution in place which can scale and is “set it
and forget it”. A HUP may be a lot of work, especially if nginx is
the frontend for so many connections/servers. I don’t know. I
guess Igor/Maxim would be the most knowledgeable about what exactly a
HUP will do to all of that…