Bug Report: error_page causes hot looping under heavy load

I compiled stable release 1.0.14 with no 3rd-party modules, and under
heavy load gdb shows hot looping when the error_page directive is
used. The CPU (all 8 cores) reaches 100% usage and the server stops
responding to further requests.

I am happy to share my configuration, but it's a very simple arrangement
with a Varnish server as the backend for proxy_pass. I am using
ApacheBench (ab) to test it.
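For reference, an invocation along these lines reproduces it (the host
is a placeholder and the request/concurrency numbers are illustrative):

    # high-concurrency load against the proxied location
    ab -n 100000 -c 512 http://nginx-host/j.php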

Thanks
Sparsh G.

Upon further debugging, the possible issue is at the following line:

src/http/ngx_http_upstream_round_robin.c:433

    for ( ;; ) {
        rrp->current = ngx_http_upstream_get_peer(rrp->peers);

        ngx_log_debug2(NGX_LOG_DEBUG_HTTP, pc->log, 0,
                       "get rr peer, current: %ui %i",
                       rrp->current,
                       rrp->peers->peer[rrp->current].current_weight);

        n = rrp->current / (8 * sizeof(uintptr_t));
        m = (uintptr_t) 1 << rrp->current % (8 * sizeof(uintptr_t));
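For context, that loop keeps picking peers until it finds a usable one,
and is supposed to be bounded by pc->tries. A minimal sketch of the
structure, with hypothetical helper names (not nginx's actual API):

    /* Sketch only; select_next_peer() and peer_is_usable() are
     * illustrative stand-ins, not nginx functions.  If pc->tries is
     * seeded with a bogus, huge value, the NGX_BUSY exit below is
     * effectively unreachable and the worker spins at 100% CPU. */
    for ( ;; ) {
        peer = select_next_peer(rrp);     /* round-robin pick */

        if (peer_is_usable(peer)) {
            return NGX_OK;                /* normal path: connect */
        }

        if (pc->tries-- <= 1) {
            return NGX_BUSY;              /* give up: all peers failed */
        }
    }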

The pc->tries is a ridiculously large number, which causes the looping.
Here is the configuration used to reproduce the error:

worker_processes 1;
worker_rlimit_nofile 32768;
error_log /var/log/nginx/error.log warn;
events { worker_connections 16384; use epoll; }

http {
    access_log off;

    upstream a {
        server 10.56.140.8 backup;
        server 10.56.140.2 backup;
        server 10.56.140.4 backup;
        server 10.56.140.6 backup;
    }

    upstream b {
        server 10.56.140.16 max_fails=100 fail_timeout=5s;
        server 10.56.140.8 backup;
        server 10.56.140.2 backup;
        server 10.56.140.4 backup;
        server 10.56.140.6 backup;
    }

    server {
        listen 80;

        location = /j.php {
            # proxy.conf sets some timeout variables which are not
            # relevant here, I think
            include /etc/nginx/proxy.conf;
            proxy_pass http://b;
            error_page 403 404 500 502 503 504 = @fallback;
        }

        location @fallback {
            proxy_pass http://a;
        }
    }
}

Sparsh G.

Hello!

On Fri, Mar 30, 2012 at 07:06:17PM +0530, Sparsh G. wrote:


Upstream "a" has only backup servers, and this is what causes the
problems. This is in fact a configuration error (you have to define
at least one non-backup server, else the "backup" flag doesn't make
sense). But obviously such a configuration should be either rejected
during configuration parsing or handled somehow; the CPU hog is a
bug, thank you for the report.
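That is, for the "backup" flag to make sense, the upstream needs at
least one primary server to fail over from. Something like this (the
primary's address here is illustrative) would be a valid upstream "a":

    upstream a {
        server 10.56.140.10;        # at least one non-backup (primary)
        server 10.56.140.8 backup;
        server 10.56.140.2 backup;
        server 10.56.140.4 backup;
        server 10.56.140.6 backup;
    }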

With the following patch nginx will reject such configuration
during configuration parsing:

--- a/src/http/ngx_http_upstream_round_robin.c
+++ b/src/http/ngx_http_upstream_round_robin.c
@@ -49,6 +49,13 @@ ngx_http_upstream_init_round_robin(ngx_c
             n += server[i].naddrs;
         }
 
+        if (n == 0) {
+            ngx_log_error(NGX_LOG_EMERG, cf->log, 0,
+                          "no servers in upstream \"%V\" in %s:%ui",
+                          &us->host, us->file_name, us->line);
+            return NGX_ERROR;
+        }
+
         peers = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_rr_peers_t)
                               + sizeof(ngx_http_upstream_rr_peer_t) * (n - 1));
         if (peers == NULL) {
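With the patch applied, running "nginx -t" against a configuration like
the one in the report should fail with something along these lines (the
file name and line number will of course depend on your setup):

    nginx: [emerg] no servers in upstream "a" in /etc/nginx/nginx.conf:6
    nginx: configuration file /etc/nginx/nginx.conf test failed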

Maxim D.
