Hello!
On Fri, Nov 16, 2012 at 10:54:51AM -0500, pliljenberg wrote:
server which was already considered down?
One server X receives a request which takes 300+ seconds to complete. That
request gets dropped by nginx due to the read timeout (as expected).
When this happens the server X is disabled and all upcoming requests are sent
to server Y instead.
My interpretation of the configuration was that the server X would still get
requests since it only had 1 failure (and not 3, as configured) during the
last 30 seconds?
The interesting part is what happens before “one server X receives
a request…”. Is it working normally and handling other requests?
Or was it already considered dead, and the request in question is
the one that checks whether it’s alive?
To illustrate, here is what happens in the normal case, starting
from a clean state (the server on port 9999 is dead, the one on
8080 is responding normally; fail_timeout=30s, max_fails=3,
ip_hash, nginx just started):
2012/11/16 20:23:29 [debug] 35083#0: *1 connect to 127.0.0.1:9999, fd:17 #2
2012/11/16 20:23:29 [debug] 35083#0: *1 connect to 127.0.0.1:8080, fd:17 #3
2012/11/16 20:23:29 [debug] 35083#0: *5 connect to 127.0.0.1:9999, fd:17 #6
2012/11/16 20:23:29 [debug] 35083#0: *5 connect to 127.0.0.1:8080, fd:17 #7
2012/11/16 20:23:30 [debug] 35083#0: *9 connect to 127.0.0.1:9999, fd:17 #10
2012/11/16 20:23:30 [debug] 35083#0: *9 connect to 127.0.0.1:8080, fd:17 #11
2012/11/16 20:23:31 [debug] 35083#0: *13 connect to 127.0.0.1:8080, fd:17 #14
2012/11/16 20:23:31 [debug] 35083#0: *16 connect to 127.0.0.1:8080, fd:17 #17
2012/11/16 20:23:32 [debug] 35083#0: *19 connect to 127.0.0.1:8080, fd:17 #20
2012/11/16 20:23:33 [debug] 35083#0: *22 connect to 127.0.0.1:8080, fd:17 #23
2012/11/16 20:23:34 [debug] 35083#0: *25 connect to 127.0.0.1:8080, fd:17 #26
2012/11/16 20:23:34 [debug] 35083#0: *28 connect to 127.0.0.1:8080, fd:17 #29
2012/11/16 20:23:35 [debug] 35083#0: *31 connect to 127.0.0.1:8080, fd:17 #32
As you can see, the first 3 requests try to reach port 9999 - because
of max_fails=3. After that, 9999 is skipped.
On the other hand, once fail_timeout=30s has passed, only one
request tries to reach 9999:
2012/11/16 20:24:37 [debug] 35083#0: *34 connect to 127.0.0.1:9999, fd:16 #35
2012/11/16 20:24:37 [debug] 35083#0: *34 connect to 127.0.0.1:8080, fd:16 #36
2012/11/16 20:24:38 [debug] 35083#0: *38 connect to 127.0.0.1:8080, fd:16 #39
2012/11/16 20:24:39 [debug] 35083#0: *41 connect to 127.0.0.1:8080, fd:16 #42
That’s because the “normally working server” and “dead server we
are trying to use again” situations are a bit different: a dead
server gets a single probe request per fail_timeout, not another
max_fails attempts.
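
If it helps, the two situations can be reproduced with a small toy model. This is only a sketch of the behaviour seen in the logs, not nginx's actual code: the Peer class, its fields and the try_request/simulate helpers are made up for illustration, and the rule that a success after the window has expired resets the failure counter is an assumption of the model. The peer list order stands in for what ip_hash would pick for this one client.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical stand-in for one upstream server (not an nginx structure)."""
    name: str
    alive: bool                 # would a connect attempt succeed?
    max_fails: int = 3
    fail_timeout: float = 30.0
    fails: int = 0              # failures counted so far
    checked: float = 0.0        # time of the last failed attempt


def try_request(peers, now):
    """Return the names of the peers tried, in order, for one request at time `now`."""
    tried = []
    for peer in peers:
        window_open = now - peer.checked <= peer.fail_timeout
        # A peer that used up max_fails is skipped until fail_timeout has passed.
        if peer.fails >= peer.max_fails and window_open:
            continue
        tried.append(peer.name)
        if peer.alive:
            if not window_open:
                peer.fails = 0  # assumption: success after the window clears the counter
            return tried
        # Failed attempt: count it and restart the fail_timeout window.
        peer.fails += 1
        peer.checked = now
    return tried


def simulate():
    """Replay a request pattern similar to the logs: a burst, then one after a pause."""
    peers = [Peer("9999", alive=False), Peer("8080", alive=True)]
    times = [0, 0, 1, 2, 2, 3, 68, 69]   # seconds since nginx start
    return [(t, try_request(peers, t)) for t in times]


if __name__ == "__main__":
    for t, tried in simulate():
        print(f"t={t:>2}s tried: {', '.join(tried)}")
```

In this model the first three requests try 9999 and fall back to 8080, the following ones go straight to 8080, and the request at t=68s (after fail_timeout) probes 9999 exactly once before it is skipped again.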
--
Maxim D.
http://nginx.com/support.html