Upstream Backup Issue

Hi,

We have the following configuration, running nginx 0.6.34:


upstream frontend {
    server host_a;
    server host_b;
    server host_c;
    server host_d;
    server failover_host backup;
}

server {
    listen 80;
    server_name _;
    error_page 502 503 504 /error.png;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://frontend;
        proxy_connect_timeout 5;
        proxy_send_timeout 30;
        proxy_read_timeout 30;
        charset utf-8;
    }
}

Everything had been working great for weeks, and then today we started
seeing the following upstream errors:

[error] 28016#0: *742426639 upstream sent invalid header while reading
response header from upstream, client: CLIENT, server: _, request:
"GET REQUEST HTTP/1.1", upstream: "ADDRESS", host: "HOST"

The errors were caused by a malformed HTTP redirect that we were
constructing upstream: the “Location” header of the response contained
embedded newline characters. It seemed that once I got 4 of those
errors, the backup would kick in for a few seconds, and then everything
would recover.

Is this correct behavior? The upstream hosts were technically not
down, and were all responding so there weren’t any timeout issues. Is
there a way to prevent this from happening unless all of the upstream
hosts are truly down or non-responsive, as opposed to simply returning
invalid responses? Could this be a bug?

Thanks in advance - love nginx! :)

- n

Hello!

On Thu, Feb 12, 2009 at 11:41:17PM -0500, Nathan Folkman wrote:

> server failover_host backup;
> proxy_pass http://frontend;
>
> Is this correct behavior? The upstream hosts were technically not down,
> and were all responding so there weren’t any timeout issues. Is there a
> way to prevent this from happening unless all of the upstream hosts are
> truly down or non-responsive, as opposed to simply returning invalid
> responses? Could this be a bug?

Technically, what your upstream hosts were sending wasn’t valid HTTP,
so this is perfectly correct behaviour.

As you can read in the docs, nginx treats an upstream server as down
according to the max_fails and fail_timeout defined for that server.
It considers requests as failed (and counts attempts against
max_fails/fail_timeout) according to the proxy_next_upstream
directive, “proxy_next_upstream error timeout” by default. In your
case “error” is triggered (an error occurred while reading the response
header from upstream).

Maxim D.
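
To make the directives mentioned in the reply concrete, here is a minimal sketch of where they go in the configuration from the original post. The values max_fails=3 and fail_timeout=10s are illustrative assumptions, not something recommended in the thread, and the exact behaviour should be checked against the documentation for the nginx version in use.

```nginx
upstream frontend {
    # max_fails: number of failed attempts within fail_timeout after which
    # the server is treated as down. fail_timeout is also how long it is
    # then considered down. Values here are illustrative assumptions.
    server host_a max_fails=3 fail_timeout=10s;
    server host_b max_fails=3 fail_timeout=10s;
    server host_c max_fails=3 fail_timeout=10s;
    server host_d max_fails=3 fail_timeout=10s;

    # Used only when the non-backup servers are unavailable.
    server failover_host backup;
}

server {
    listen 80;
    server_name _;

    location / {
        proxy_pass http://frontend;

        # The default named in the reply, written out explicitly: these
        # conditions decide when an attempt counts as failed.
        proxy_next_upstream error timeout;
    }
}
```

For context, current nginx documentation gives the defaults as max_fails=1 and fail_timeout=10s. If those defaults applied here, a single invalid response from each of the four primary hosts would be enough to mark all of them down for a short period, which would be consistent with the backup taking over after four errors and everything recovering a few seconds later.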