I've written a plugin that health-checks nginx backends, and everyone
is free to use it. It is similar to the health-checking features that
varnish and haproxy support. To give you an idea, here's a sample
config [1] that uses the upstream_hash plugin. You can get the code
here [2], and an example of how to patch upstream_hash here [3].

The plugin is really an optional feature that other upstream plugins
(upstream_fair or ip_hash, for example) can plug into and use. To use
it, their code needs to be modified to also check the health of the
backend.

This plugin is super beta, so please be careful. Feedback/patches
welcome.
[1]
upstream test_upstreams {
    server localhost:11114;
    server localhost:11115;
    hash $filename;
    hash_again 10;
    healthcheck_enabled;
    healthcheck_delay 1000;
    healthcheck_timeout 1000;
    healthcheck_failcount 1;
    healthcheck_expected 'I_AM_ALIVE';
    healthcheck_send "GET /health HTTP/1.1" 'Host: www.mysite.com'
                     'Connection: close';
}
[2]
http://github.com/cep21/healthcheck_nginx_upstreams
[3]
http://github.com/cep21/nginx_upstream_hash/tree/support_http_healthchecks
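
To make the directives in [1] a bit more concrete, here is a minimal Python sketch of what a single probe could amount to. It is not the module's code (the module is C running inside nginx), and the semantics are assumptions: that the healthcheck_send strings are joined with CRLF into one request, that a probe passes when a 200 response contains the healthcheck_expected string, and that a backend is marked down after healthcheck_failcount consecutive failures. The names build_probe, probe_passed and HealthState are hypothetical.

```python
from dataclasses import dataclass


def build_probe(lines):
    """Join healthcheck_send-style strings into one raw HTTP request."""
    # HTTP header lines end in CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def probe_passed(raw_response, expected):
    """Assumed rule: pass if the status is 200 and the body contains `expected`."""
    head, _, body = raw_response.partition(b"\r\n\r\n")
    parts = head.split(b"\r\n", 1)[0].split()
    return len(parts) >= 2 and parts[1] == b"200" and expected.encode() in body


@dataclass
class HealthState:
    failcount: int                 # assumed meaning of healthcheck_failcount
    consecutive_failures: int = 0
    down: bool = False

    def record(self, passed):
        """Update the state with one probe result and return whether we are down."""
        if passed:
            self.consecutive_failures = 0
            self.down = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failcount:
                self.down = True
        return self.down
```

With healthcheck_failcount 1 as in the sample, a single failed probe would already mark the backend down under these assumptions.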
on 2010-02-27 02:26
on 2010-03-01 11:49
On Feb 26, Jack Lindamood wrote:
> This plugin is super beta, so please be careful. Feedback/patches
> welcome.

I have been wanting to write something similar for a long time, so thanks for getting started.

Does the health check compete with the existing logic to mark an upstream as up/down? Here is a scenario: the real traffic goes to the upstream URL "/service/login", and my health check URL is configured as "/hc". Now /hc is always available, but /service/login is throwing up a lot of errors (timeouts, 500s, etc.) for a given upstream server. What will the status eventually be marked as?
on 2010-03-02 08:58
It's up to the logic in your upstream. The module just runs the health check. Your upstream module will have to be changed to query the status of the healthcheck and decide if it should continue anyway, or pick another upstream server.

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,57810,58927#msg-58927
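
As a rough illustration of that division of labour, a patched hash-style balancer could behave like the Python sketch below. Everything here is hypothetical: is_down stands in for whatever status query the healthcheck module exposes, and the retry loop is only loosely modelled on the hash_again directive; the real upstream_hash code is C and works differently.

```python
import hashlib


def pick_backend(key, backends, is_down, max_rehash=10):
    """Return the first healthy backend for `key`, or None if none is found."""
    for attempt in range(max_rehash + 1):
        # Mix the attempt number into the hash so each retry can land elsewhere.
        digest = hashlib.md5(f"{key}:{attempt}".encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(backends)
        if not is_down(index):
            return backends[index]
    # Every attempt hit a backend the health check reports as down.
    return None
```

The balancer, not the health checker, decides what "down" means for routing: it could just as well ignore the status and continue with the first choice.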
on 2010-03-02 09:06
Hi,

(piggybacking on Arvind's mail because I don't seem to have the original post).

On Mon, Mar 01, 2010 at 04:18:32PM +0530, Arvind Jayaprakash wrote:
> > This plugin is super beta, so please be careful. Feedback/patches
> > welcome.

Did you look at ngx_supervisord[1]? It serves a roughly similar purpose (more direct interaction of load balancers with the outside world) and it also requires patches to the load balancer. So maybe we could kill two birds with one stone and use the same API.

[1] http://labs.frickle.com/nginx_ngx_supervisord/

Best regards,
Grzegorz Nosek
on 2010-03-02 09:35
> Did you look at ngx_supervisord[1]?
Do you mean use only ngx_supervisord? From what I can tell, it requires
a separate daemon running on the machine that does the healthcheck,
which then sends a call to nginx via supervisord to turn servers off or
on. Is that correct? The goal was to build the checking into nginx so
that we don't have to monitor another process, making the system more
stable. If nginx is running, then you're guaranteed the health checks
are running.
Or did you mean call into ngx_supervisord when a healthcheck fails? I'm
happy to integrate other APIs if you have some ideas.
Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,57810,58940#msg-58940
on 2010-03-02 10:09
On Tue, Mar 02, 2010 at 03:34:36AM -0500, cep221 wrote:
> > Did you look at ngx_supervisord[1]?
>
> Do you mean use only ngx_supervisord? From what I can tell, it requires a separate daemon running on the machine that does the healthcheck, which then sends a call to nginx via supervisord to turn servers off or on. Is that correct? The goal was to build the checking into nginx so that we don't have to monitor another process, making the system more stable. If nginx is running, then you're guaranteed the health checks are running.
>
> Or did you mean call into ngx_supervisord when a healthcheck fails? I'm happy to integrate other APIs if you have some ideas.

I mean using the same API so that patching a balancer for ngx_supervisord support would automatically make it support your healthcheck module. This means that healthcheck would get passed a callback which should be executed with specified parameters when it detects a failed backend.

See:
http://github.com/FRiCKLE/ngx_supervisord/blob/master/patches/ngx_http_upstream_round_robin.patch
(grep for ngx_http_upstream_backend_monitor).

I don't expect you to use ngx_supervisord itself (or supervisord), that would indeed be quite pointless :)

@Piotr: if the health check plugin adapts to ngx_supervisord API, I guess the API spec could use s/_supervisord//ig, no? A bunch of #defines should suffice.

Best regards,
Grzegorz Nosek
on 2010-03-02 11:21
> Do you mean use only ngx_supervisord? From what I can tell, it requires a
> separate daemon running on the machine that does the healthcheck, which
> then sends a call to nginx via supervisord to turn servers off or on. Is
> that correct?

No, it's not :P

Neither ngx_supervisord nor supervisord does any health-checking. ngx_supervisord communicates with supervisord (a process manager) to dynamically start or stop backend servers, depending on the load. Since version 1.3 it also supports a supervisord-less configuration, which disables all communication with the supervisord daemon (so it basically takes backend servers out of rotation without the need to reload nginx).

> Or did you mean call into ngx_supervisord when a healthcheck fails? I'm
> happy to integrate other APIs if you have some ideas.

Yeah, that's probably what Grzegorz meant. You would just need to call ngx_supervisord_execute(uscf, NGX_SUPERVISORD_CMD_STOP, backend_number, NULL) and then all ngx_supervisord-aware load balancers (upstream_fair, round_robin & ip_hash) would automagically stop using the failed backend until you execute NGX_SUPERVISORD_CMD_START.

Full API spec is available at:
http://github.com/FRiCKLE/ngx_supervisord/blob/master/patches/ngx_http_upstream_round_robin.patch

At the moment one would need to specify "supervisord none;" in order to enable the supervisord-less configuration, because there is no such call in the API, but I could add this in the next release if you would like to use it.

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
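
The control flow described here (a health checker issues STOP/START commands, and every aware load balancer reacts) can be modelled with a toy Python sketch. Only the names ngx_supervisord_execute, NGX_SUPERVISORD_CMD_STOP and NGX_SUPERVISORD_CMD_START come from this thread; BackendMonitor, RoundRobin and their methods are hypothetical stand-ins, and the real API is C with a different shape.

```python
CMD_START = "start"   # stands in for NGX_SUPERVISORD_CMD_START
CMD_STOP = "stop"     # stands in for NGX_SUPERVISORD_CMD_STOP


class BackendMonitor:
    """Toy stand-in for the shared API; not the real C interface."""

    def __init__(self):
        self._listeners = []

    def register(self, callback):
        # Each aware load balancer registers one callback.
        self._listeners.append(callback)

    def execute(self, command, backend_number):
        # Fan the command out to every registered balancer.
        for callback in self._listeners:
            callback(command, backend_number)


class RoundRobin:
    """Minimal balancer that honours STOP/START commands."""

    def __init__(self, backends, monitor):
        self.backends = backends
        self.stopped = set()
        self._next = 0
        monitor.register(self.on_command)

    def on_command(self, command, backend_number):
        if command == CMD_STOP:
            self.stopped.add(backend_number)
        elif command == CMD_START:
            self.stopped.discard(backend_number)

    def pick(self):
        # Try each backend at most once, skipping stopped ones.
        for _ in range(len(self.backends)):
            index = self._next
            self._next = (self._next + 1) % len(self.backends)
            if index not in self.stopped:
                return self.backends[index]
        return None
```

In this model a health checker would call monitor.execute(CMD_STOP, n) on a failed probe and monitor.execute(CMD_START, n) once the backend recovers, without knowing which balancers are listening.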
on 2010-03-02 11:24
> @Piotr: if the health check plugin adapts to ngx_supervisord API, I
> guess the API spec could use s/_supervisord//ig, no? A bunch of #defines
> should suffice.

Sure, why not ;)

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
on 2010-03-02 19:18
On Mar 02, Piotr Sikora wrote:
> enable supervisord-less configuration, because there is no such call in API,
> but I could add this in next release if you would like to use it.

This sounds like what I needed. So let me rephrase the problem in its entirety:

(1) Under normal circumstances, nginx would use proxy_next_upstream in conjunction with max_fails and fail_timeout (for the rr module) to declare an upstream as up or down. This is an inline check since it is monitoring real traffic.

(2) In addition, the health-check module provides an out-of-band health check mechanism wherein it periodically polls a specific URL and uses the HTTP status/body to determine if an upstream needs to be marked as up or down.

Both these styles have their own benefits. The first style keeps track of the health by looking at the response of actual requests. This is important since a health check URL does not automatically indicate the health of your real application.

The second style is needed in cases where we plan to do some maintenance activity on an upstream server and want to proactively not send traffic to it. A typical example is when you want to push new software, check with a couple of requests to see if your app is behaving well, and, if all looks fine, direct traffic to it.

In the absence of priority between the two styles of checking, we could end up with a flapping upstream status. The logical priority seems to be that #2 wins over #1. So, if the health check URL says an upstream server is down, no traffic should be sent its way and the health status evaluation based on style #1 should be ignored. If the health check deems an upstream to be up, then the outcome of #1 is the final status.

So where do we get these features from:
- #1 is provided by the stock upstream modules
- #2 is provided by the health check module
- the ngx_supervisord module seems to have the hooks that will let us achieve the prioritization once the health-check module uses this feature

To get all of this running, we would need 2 patches on the upstream module, one for supervisord and the other for health-check, and the health-check module itself will have to invoke ngx_supervisord_execute to mark an upstream as up or down.

I will not have time before this weekend to get started on merging these; so if someone gets down to doing it earlier, thanks :-)
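
The priority rule proposed above (#2 wins over #1) is small enough to write down as a Python sketch. The function name and the boolean inputs are hypothetical; in nginx the two signals live in different modules and would be combined inside the patched load balancer.

```python
def effective_status(healthcheck_up, inline_up):
    """Combine the out-of-band check (#2) and the inline check (#1).

    If the health check says down, the inline result is ignored.
    If the health check says up, the inline result decides.
    """
    if not healthcheck_up:
        return "down"
    return "up" if inline_up else "down"


if __name__ == "__main__":
    # Print the full truth table for the two signals.
    for hc in (True, False):
        for inline in (True, False):
            print(f"healthcheck_up={hc} inline_up={inline} -> {effective_status(hc, inline)}")
```

Because the health check result is consulted first, a backend whose health file has been removed stays out of rotation no matter how the inline counters evolve, which is what prevents the flapping described above.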
on 2010-03-02 19:27
On Tue, Mar 2, 2010 at 10:17 AM, Arvind Jayaprakash <work@anomalizer.net> wrote:
> (2) In addition, the health-check module provides an out of band health
> check mechanism wherein, it periodically polls a specific url and uses
> the HTTP status/body to determine if an upstream needs to be marked as
> up or down

+1

This has been something in my mind that will help advance nginx further as a load balancing option.
on 2010-03-02 23:11
> The second style is needed in cases where we plan to do some maintenance
> activity on an upstream server and want to proactively not send traffic
> to it. Typical example is when you want to push new software, check with
> a couple of requests and see if your app is behaving well and if all
> looks fine, direct traffic to it.

Actually, you don't need health-checks for that, ngx_supervisord provides that functionality out-of-the-box. If you plan on taking servers down for maintenance, you can use supervisord_stop and supervisord_start handlers. Please take a look at example configuration #2, because it does exactly that:
http://labs.frickle.com/nginx_ngx_supervisord/README

> In the absence of priority between the two styles of checking, we could
> end up with a flapping upstream status. The logical priority seems to be
> that #2 wins over #1. So, if the health check url says an upstream
> server is down, no traffic should be sent its way and the health status
> evaluation based on style #1 should be ignored. If the health check
> deems an upstream to be up, then the outcome of #1 is the final status.

In-line checks are considered failures. When you take down server with ngx_supervisord, it's marked down (same way as if you would add "down" in your configuration and reload nginx), so it takes priority.

> I will not have time before this weekend to get started on merging
> these; so if someone gets down to doing it earlier, thanks :-)

I'll try to find some time today, can't promise anything though.

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
on 2010-03-03 03:48
On Mar 02, Piotr Sikora wrote:
> that:
> http://labs.frickle.com/nginx_ngx_supervisord/README

The only philosophical objection I have to this style is having to do something on the LB servers to change an upstream's status. I wanted a facility wherein something can be done on the upstream server (rm the health check file) to take it out of service.

I come from a world wherein any sort of access to the LB servers is tightly controlled, and the people managing the upstreams (application servers) can plan maintenance without ever involving the LB folks. I am also not a fan of the supervisord_start/supervisord_stop directives, since managing the security aspects of them becomes a hassle.

I am however a fan of supervisord_inherit_backend_status, which is why I wanted to integrate it with the health-check plugin :-)
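
The "rm the health check file" idea can be sketched from the upstream server's side in a few lines of Python. This is only an illustration: the function name, the status codes and the OUT_OF_SERVICE body are assumptions, and in practice the health URL is often just a static file served by the application server, so no code is needed at all.

```python
import os


def health_response(flag_path):
    """Return (status, body) for the health URL based on a flag file.

    While the file exists the server advertises itself as alive;
    removing the file takes the server out of service at the next probe.
    """
    if os.path.exists(flag_path):
        return 200, "I_AM_ALIVE"
    return 404, "OUT_OF_SERVICE"
```

The point of the design is that the people running the application servers control the flag file, so they never need access to the load balancer to plan maintenance.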
on 2010-03-03 05:09
> since managing the security aspects of it becomes a hassle.
>
> I am however a fan of supervisord_inherit_backend_status which is why I
> wanted to integrate it with the health-check plugin :-)

I can see your point.

The question which arises now is: should health-check, or any other ngx_supervisord-aware load balancer, be able to "enable" a backend server which was administratively taken out of the rotation with "server A.B.C.D down;" in nginx.conf?

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
on 2010-03-08 22:16
> I'll try to find some time today, can't promise anything though.

It took a little more time than expected, but I just pushed modified
version of health-check module into my temporary repository:
http://github.com/PiotrSikora/healthcheck_nginx_upstreams/

It communicates with ngx_supervisord via its API, which means that
servers are taken out of the rotation when healthcheck fails.

DISCLAIMER: Health-check module doesn't work for me at all (checks
always time out and because of that servers are taken out of the
rotation by ngx_supervisord), but if it works for you then this modified
version should work as well.

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
on 2010-03-09 18:42
> DISCLAIMER: Health-check module doesn't work for me at all
To redeem the module, I checked over your config and it was missing the
"Host:" header in your healthcheck_send. Try using something similar to
the sample config:
healthcheck_send "GET /health HTTP/1.1" 'Host: www.ahost.com'
'Connection: close';
Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,57810,61920#msg-61920
on 2010-03-09 19:48
Hello Jack,

> To redeem the module, I checked over your config and it was missing the
> "Host:" header in your healthcheck_send. Try using something similar to
> the sample config:
>
> healthcheck_send "GET /health HTTP/1.1" 'Host: www.ahost.com'
> 'Connection: close';

No, it wasn't. Missing header wouldn't yield timeout. But just for the sake of it, I tested it with the above line and it didn't help.

Like I said yesterday, I don't really have time to fully investigate this right now, but I'll try to narrow down / fix the problem over the weekend.

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >