Maybe a bug in Nginx 0.6.30 x86_64

Hi All,

I’ve an issue with Nginx 0.6.30. So far it seems that the following
things are needed to reproduce this bug:

  • keepalives enabled,
  • Nginx running on Linux x86_64,
  • many concurrent connections (> 300).

Nginx crashes with a SIGSEGV; according to gdb, it looks like a stack
overflow caused by recursive calls.

Here is the setup:

Ubuntu 8.04 64-bit, kernel 2.6.24
AMD 64 3000+ (single core), 512 MB

Nginx 0.6.30 configuration:

1 worker, no master process, no daemon, epoll, sendfile, worker
connections: 8000, 1 server serving a 10 kB index.html static file.

Compilation:

./configure --prefix=.

Then using Apache Bench like this:

ab -k -c 500 -n 1000000 http://localhost:8000/index.html

will cause a stack overflow (after ~24000 requests).

Using ab without the keepalive option (-k) has worked on the same
computer without problems so far, even with a large number of
simultaneous connections, i.e.:

ab -c 4000 -n 1000000 http://localhost:8000/index.html

I have never been able to reproduce it on my main computer (Ubuntu 8.04
i386, 32-bit). I will try to run more tests on a Q6600 (quad core)
running Ubuntu 8.04 x86_64 and keep you informed.

A gdb session showing the first 500 stack backtraces is attached; I hope
it helps.

Best regards

François Battail <[email protected]…> writes:

I confirm the bug on a Q6600 running Ubuntu 8.04 x86_64 with 4 GB of
RAM. Same settings, except that I need to use a larger number of
concurrent connections and requests:

ab -k -c 5000 -n 10000000 http://localhost:8000/

Same bug: a stack overflow caused by recursive calls. Maybe it is linked
to the way ab works. I am unable to reproduce this bug on my main
computer because it’s an AMD X2 (working in 32 bits) with a file
descriptor limit (fileno) of 1024, so maybe there is not enough stress…

Very strange, since there’s only one worker. I have no further ideas
for the moment.

Best regards.

On Sun, May 11, 2008 at 11:04:02AM +0200, François Battail wrote:

I have never been able to reproduce it on my main computer (Ubuntu 8.04
i386, 32-bit). I will try to run more tests on a Q6600 (quad core)
running Ubuntu 8.04 x86_64 and keep you informed.

A gdb session showing the first 500 stack backtraces is attached; I hope
it helps.

It seems that you are able to run ab/nginx in resonance:

  1. ab issues request, nginx gets epoll notification and sends response,
  2. kernel schedules ab, it gets response, sends a new request,
  3. kernel schedules nginx, it reads the request, sends a response,
  4. goto #2.

So nginx processes all requests after a single epoll notification,
without ever blocking. This can happen only with greedy event
notification methods: epoll and rtsig. Also, I’m not sure that it can be
reproduced from a remote host.
I see two ways to fix it: the simplest is to limit the number of
keepalive requests.

BTW, please note, that running nginx without master process is not good
for production: it has problems with reconfiguration, etc.
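
The first fix mentioned above, limiting the number of keepalive
requests, can be sketched as a toy model. This is only an illustration
in Python, not nginx code: serve_keepalive, ready_requests and
max_keepalive_requests are hypothetical names, and the content of the
patch discussed in this thread is not shown here. The idea is that once
a connection has served its quota it is closed, so the chain of
back-to-back requests is broken and control returns to the event loop.

```python
def serve_keepalive(ready_requests, max_keepalive_requests):
    """Serve back-to-back requests on one keepalive connection, up to a cap.

    ready_requests: how many requests the client has ready without pause
    (the "resonance" case). Hypothetical parameter for this toy model.
    max_keepalive_requests: hypothetical per-connection quota.

    Returns (served, closed): the number of requests handled before control
    goes back to the event loop, and whether the connection was closed
    because the quota was reached.
    """
    served = 0
    while served < ready_requests:
        served += 1
        if served >= max_keepalive_requests:
            # Quota reached: close the connection, the client must reconnect.
            return served, True
    # Nothing more to read: keep the connection and wait for the next event.
    return served, False
```

For example, serve_keepalive(1_000_000, 100) returns (100, True): the
run stops after 100 requests and the connection is reported as closed.
Later nginx releases provide a keepalive_requests directive that caps
the number of requests per connection in a similar spirit.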

Igor S. <[email protected]…> writes:

It seems that you are able to run ab/nginx in resonance:

  1. ab issues request, nginx gets epoll notification and sends response,
  2. kernel schedules ab, it gets response, sends a new request,
  3. kernel schedules nginx, it reads the request, sends a response,
  4. goto #2.

Yes, running AB and Nginx on the same computer is insane since AB is
more CPU hungry than Nginx ;)

So nginx processes all requests after a single epoll notification,
without ever blocking. This can happen only with greedy event
notification methods: epoll and rtsig. Also, I’m not sure that it can be
reproduced from a remote host. I see two ways to fix it: the simplest is
to limit the number of keepalive requests.

Well, I have been testing a custom version of Nginx with keepalive
enabled and a lot of concurrent connections without any problem (content
is generated dynamically, without blocking calls for the moment, but
that is completely specific to my case). I understand your point about
resonance, but I still don’t understand why keepalive would trigger
recursive calls, since without keepalive it works fine.

BTW, please note, that running nginx without master process is not good
for production: it has problems with reconfiguration, etc.

Thank you, but it was just for debugging purposes. I will never run
Nginx without a master process in production; I just wanted to give you
a limited subset to help with the investigation.

Best regards.

Igor S. <[email protected]…> writes:

Because after sending a response and before going to the keepalive
state, nginx tests whether there is data to read. If there is, nginx
goes on to the next request: ngx_http_keepalive_handler() calls
ngx_http_init_request().

OK, thank you, it’s clear now. But I think there’s still an issue that
could allow remote attacks (once 10 Gb network cards are the norm). You
said that there were two solutions but only told us about limiting the
number of keepalive requests; may I ask what the second option is?

Thank you for your time and best regards.

On Sun, May 11, 2008 at 04:23:16PM +0000, François Battail wrote:

I understand your point when you write resonance but I still have
an issue understanding why keepalive could trigger recursive calls,
since without keepalive it works fine.

Because after sending a response and before going to the keepalive
state, nginx tests whether there is data to read. If there is, nginx
goes on to the next request: ngx_http_keepalive_handler() calls
ngx_http_init_request().
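
The call chain described above can be illustrated with a toy model of
why the stack keeps growing. This is a sketch in Python under stated
assumptions: the two function names only echo the nginx functions named
above, their bodies are hypothetical and do not reproduce nginx source,
and max_call_depth is an invented helper. Each request that is already
waiting adds two nested calls, because the handlers call each other
instead of returning to the event loop.

```python
# Toy model only: names echo the nginx functions mentioned in the thread,
# but the bodies are hypothetical and do not reproduce nginx source.

def init_request(pending, depth, stats):
    """Handle one request, then hand over to the keepalive handler."""
    stats["max_depth"] = max(stats["max_depth"], depth)
    keepalive_handler(pending - 1, depth + 1, stats)


def keepalive_handler(pending, depth, stats):
    """If the client already sent another request, process it immediately."""
    stats["max_depth"] = max(stats["max_depth"], depth)
    if pending > 0:
        # Data is ready to read: start the next request without returning.
        init_request(pending, depth + 1, stats)


def max_call_depth(ready_requests):
    """Return the deepest nesting reached for a run of back-to-back requests."""
    stats = {"max_depth": 0}
    keepalive_handler(ready_requests, 0, stats)
    return stats["max_depth"]
```

In this model max_call_depth(100) returns 200, so the depth grows
linearly with the number of requests that arrive without a pause. With a
long enough run the model exceeds Python's default recursion limit,
which plays the role of the stack overflow here.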

On Sun, May 11, 2008 at 02:24:56PM +0000, François Battail wrote:

Very strange since there’s only one worker, no more idea for the moment.

Try the attached patch.

Igor S. <[email protected]…> writes:

Try the attached patch.

Thank you, I will! Forget my last message. I will keep you informed.

Best regards.

Igor S. <[email protected]…> writes:

Try the attached patch.

It’s perfect :) even on my old AMD 64 3000+ with:

ab -k -c 4000 -n 10000000 http://localhost:8000/index.html (!)

Thank you so much, best regards.
