Strange nginx hang after ~18600 requests

I’m developing a module that consists of a handler module and a header+body filter module.
In my handler phase (registered in the nginx rewrite phase) I stop the request (return NGX_OK), then issue a subrequest to another server, get the result back (hook function), then continue to the header+body filter, change the headers and body, and call the next filters accordingly.
Also, the subrequest runs through a proxy. Here are the relevant conf lines:

(general:)
keepalive_timeout 65;
gzip on;
proxy_http_version 1.1;
worker_connections 1024;

location / { # this location is what ab requests (see the ab test below)
    include
    proxy_pass http://server;
}

location /def1 { # this is the subrequest URI
    proxy_buffers           8 128k;
    proxy_buffer_size       128k;
    proxy_busy_buffers_size 128k;

    proxy_pass              http://server2/page.php; # another nginx that runs fastcgi
}

All works fine until I run an “ab” test against my nginx (10/12 workers), which runs on a multi-core, heavy-duty Linux server. Here’s my ab line:
ab -c 50 -n 20000 http://…myserver…
The actual rate is ~1000 requests per second, very high…

But almost exactly after 18500-18600 requests (which ran smoothly without any errors, at log level INFO with printouts of my own), nginx hangs. It is not stuck, since at log level “debug” I can see its epoll loop running OK, but it neither receives nor processes any other requests, although they continue to come from my ab test.

Any suggestions, please?

thanks

Gad

Posted at Nginx Forum:

Addendum: the hang occurs right after the last (~18600th) subrequest has been sent to the server (def1).

Hello!

On Tue, Dec 18, 2012 at 10:24:59AM -0500, gadh wrote:

> I’m developing a module that consists of a handler module and a header+body
> filter module.
> In my handler phase (registered in the nginx rewrite phase) I stop the
> request (return NGX_OK), then issue a subrequest to another server, get the
> result back (hook function), then continue to the header+body filter, change
> the headers and body, and call the next filters accordingly.
> Also, the subrequest runs through a proxy. Here are the relevant conf lines:

[…]

> But almost exactly after 18500-18600 requests (which ran smoothly without any
> errors, at log level INFO with printouts of my own), nginx hangs. It is not
> stuck, since at log level “debug” I can see its epoll loop running OK, but it
> neither receives nor processes any other requests, although they continue to
> come from my ab test.
>
> Any suggestions, please?

Most likely your module causes some resource leak, which later
results in a hang. Hard to say more without seeing the code which
causes the problem.


Maxim D.

Hi Maxim,
since I cannot send you my code for now, could you point me to the possible reasons for the lack of resources, so I can search for a solution? Can you suggest a monitoring/debug tool that can help?
(valgrind could not find any specific problem other than the usual notes on the nginx core code, which BTW I suggest monitoring.)

Hello!

On Tue, Dec 18, 2012 at 11:32:32AM -0500, gadh wrote:

> Hi Maxim,
> since I cannot send you my code for now, could you point me to the possible
> reasons for the lack of resources, so I can search for a solution? Can you
> suggest a monitoring/debug tool that can help?

I would recommend the following, in no particular order:

  • Try looking at various trivial things like open files/sockets
    counters, nginx stub status output and so on.

  • Try looking through the debug log of a single request execution, and
    making sure you understand what goes on, there are no unexpected
    things and the request is properly finalized.

  • Try producing a reduced test case which is as simple as possible
    in contrast to your original code, but is enough to reproduce
    the problem.
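
As a rough illustration of the first suggestion (watching open-file counters and stub status output), here is a minimal Python sketch. It assumes a Linux host with /proc and that the stub_status module is compiled in and enabled in some location. The helper names (parse_stub_status, count_open_fds, looks_leaky) and the "Writing stays above 1 when idle" heuristic are assumptions made for this example, not anything taken from the module under discussion.

```python
import os
import re

# Sample text in the format documented for nginx's stub_status module.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse stub_status plain-text output into a dict of integer counters."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unrecognized stub_status output")
    keys = ("accepts", "handled", "requests", "reading", "writing", "waiting")
    values = totals.groups() + states.groups()
    result = {key: int(value) for key, value in zip(keys, values)}
    result["active"] = int(active.group(1))
    return result


def count_open_fds(pid):
    """Count open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def looks_leaky(idle_snapshot, baseline=1):
    """Heuristic (assumption): with no load, only the status request itself
    should be counted as Writing, so a higher value hints at requests that
    were never finalized."""
    return idle_snapshot["writing"] > baseline
```

One way to use it: take a snapshot after the ab run has ended and all clients have disconnected, then compare count_open_fds for each worker PID before and after the run. Counters that do not return to their idle values would be consistent with a leak, though they do not by themselves identify its source.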


Maxim D.

If I want to send you some source files, which email address should I send them to?

Could you tell me where in the nginx code I can find the table size of the open sockets/connections? Maybe it’s related.

Thanks for the fast reply, Maxim.
The code is complicated and I cannot send it all.
More info: when I set proxy_connect_timeout from the default 60s to 2s (the upstream server is close enough), I could see that in the hang state all workers were waiting for an answer from the upstream server (through the proxy), and after that nginx hung (so the lack of resources also occurs after the timeout).

Any suggestions?
Thanks,
Gad

Hello!

On Wed, Dec 19, 2012 at 04:00:27AM -0500, gadh wrote:

> If I want to send you some source files, which email address should I send them to?

If you want private communication, please consider commercial
support options, see Technical Support for NGINX and NGINX Plus Software for details.


Maxim D.