Any advice on tracking down the cause of “worker process XXXXX exited on
signal 11” errors? Thank you in advance for your help.
We have been running NGINX 0.7.65 on FreeBSD for a few months. Because
of issues getting the fail-over to work properly, 1 week ago we migrated
from FreeBSD to Linux (Ubuntu) running version 0.7.65. In this first
week, we saw perhaps 2 dozen “worker process XXXXX exited on signal 11”
errors per hour and have been experiencing dropped web connections at a
rate that seems to coincide with the “exited on signal 11” errors. We
tried many different configuration changes, and finally this afternoon
upgraded to 0.8.36. Unfortunately we continue to see the “worker process
XXXXX exited on signal 11” errors.
Other possible factors: We have been using keepalived to verify that
NGINX is accepting web traffic on port 80 every second.
This is our configuration file:
# Nginx configuration file
user nobody;
worker_processes 1;
events {
worker_connections 1024;
}
http {
gzip on;
gzip_types text/plain text/css application/x-javascript
application/javascript;
gzip_disable "MSIE [1-6]\.";
# This controls upload size
client_max_body_size 45M;
# How long to wait for upstream server response (seconds)
proxy_read_timeout 600;
upstream php5 {
server 192.168.1.121 weight=1;
server 192.168.1.123 weight=1;
}
# keeps connection to same web servers
upstream webserversessions {
ip_hash;
server 192.168.1.121;
server 192.168.1.123;
}
## Default for all sites
server {
listen 80;
location / {
proxy_pass http://php5$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-By $server_name;
}
# Add expires headers
location ~* ^.+\.(js)$ {
expires modified +24h;
proxy_pass http://php5$request_uri;
}
# Stats reporting
location /nginx_status {
stub_status on;
access_log off;
allow 192.168.234.0/24;
allow 192.168.1.0/24;
deny all;
}
}
## maintain session required for openID
server {
listen 80;
server_name community.llli.org;
location / {
proxy_pass http://webserversessions$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
}
}
## For site that needs sessions
server {
listen 80;
server_name files.golightly.com;
location / {
proxy_pass http://webserversessions$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-By $server_name;
}
}
## SSL sites ##
server { # Need to add this server section with unique IP for each SSL site we serve
listen 38.127.224.114:443;
ssl on;
ssl_certificate /etc/ssl/www.latchon.org.pem;
ssl_certificate_key
/etc/ssl/private/www.latchon.org.key;
ssl_protocols SSLv2 SSLv3 TLSv1;
ssl_ciphers
ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
ssl_prefer_server_ciphers on;
location / {
proxy_pass http://php5$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-By $server_name:443;
}
}
server {
listen 38.127.224.112:443;
ssl on;
ssl_certificate
/etc/ssl/networking.cccu.org.pem;
ssl_certificate_key
/etc/ssl/private/networking.cccu.org.key;
ssl_protocols SSLv2 SSLv3 TLSv1;
ssl_ciphers
ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
ssl_prefer_server_ciphers on;
location / {
proxy_pass http://php5$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-By $server_name:443;
}
}
server {
listen 38.127.224.113:443;
ssl on;
ssl_certificate
/etc/ssl/backstage.codenomicon.com.chain.pem;
ssl_certificate_key
/etc/ssl/private/backstage.codenomicon.com.key;
ssl_protocols SSLv2 SSLv3 TLSv1;
ssl_ciphers
ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
ssl_prefer_server_ciphers on;
location / {
proxy_pass http://php5$request_uri;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For
$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-By $server_name:443;
}
}
}
Posted at Nginx Forum:
On Fri, Apr 30, 2010 at 03:22:20AM -0400, DaleMcGrew wrote:
...
Could you provide the output of:
$ nginx -v
$ uname -a
--
Sergey A. Osokin,
[root@lb01 ~]# /usr/local/nginx/sbin/nginx -v
nginx version: nginx/0.8.36
[root@lb01 ~]# uname -a
Linux lb01.golightly.com 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Hello!
On Fri, Apr 30, 2010 at 03:22:20AM -0400, DaleMcGrew wrote:
different configuration changes, and finally this afternoon
upgraded to 0.8.36. Unfortunately we continue to see the “worker
process XXXXX exited on signal 11” errors.
Could you please show the nginx -V output, and obtain a core dump and
show the backtrace? Configuring something like this in nginx.conf
should be enough to obtain one, even on Linux:
working_directory /path/to/cores;
worker_rlimit_core 500M;
Note that the nginx workers should be able to write to the
/path/to/cores directory.
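A rough way to sanity-check that write-permission requirement before waiting for the next crash is sketched below in Python. The helper name can_write_dir is hypothetical and not part of nginx. It only inspects the classic owner/group/other mode bits of the directory itself, so it ignores ACLs, SELinux, and the permissions of parent directories. Treat a True result as a hint, not a guarantee.

```python
import os
import stat


def can_write_dir(path, uid, gids):
    """Rough check: could a process with this uid/gids create files in path?

    Assumption: only classic mode bits are considered (no ACLs, no SELinux,
    no check of parent directories).
    """
    if uid == 0:
        # root bypasses ordinary permission bits
        return True
    st = os.stat(path)
    if st.st_uid == uid:
        bits = stat.S_IWUSR | stat.S_IXUSR
    elif st.st_gid in gids:
        bits = stat.S_IWGRP | stat.S_IXGRP
    else:
        bits = stat.S_IWOTH | stat.S_IXOTH
    # Creating a file in a directory needs both write and search (execute) bits.
    return (st.st_mode & bits) == bits


# Example use (assumes the worker runs as "nobody", as in the posted config):
#   import pwd
#   pw = pwd.getpwnam("nobody")
#   print(can_write_dir("/path/to/cores", pw.pw_uid, {pw.pw_gid}))
```

If the check fails, changing the directory's owner or mode so the worker user can write there is the usual fix.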
Also, please make sure you have no third-party modules/patches
compiled in (and/or reproduce the problem without them, if any).
[…]
location / {
proxy_pass http://php5$request_uri;
Just curious: why do you use this form instead of
proxy_pass http://php5;
?
[…]
# Add expires headers
location ~* ^.+\.(js)$ {
Just a note: there is no need to use “^.+”. And the parentheses just
produce a capture which is not used. The following will do
the same with less CPU burn:
location ~* \.js$ {
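To see that the two patterns select the same requests, here is a small sketch using Python's re module as a stand-in for the PCRE engine nginx uses (an assumption; the two engines agree on these simple constructs). The sample URIs are made up for illustration. Request URIs always start with "/", so the "at least one character before .js" requirement of the original pattern never excludes anything in practice.

```python
import re

# "~*" in nginx means case-insensitive matching.
original = re.compile(r'^.+\.(js)$', re.IGNORECASE)
simplified = re.compile(r'\.js$', re.IGNORECASE)


def matches(uri):
    """Return (original_matches, simplified_matches) for a request URI."""
    return (bool(original.search(uri)), bool(simplified.search(uri)))


# Hypothetical sample URIs, not taken from the thread.
for uri in ["/app.js", "/static/lib/JQuery.JS", "/index.php",
            "/js/readme.txt", "/a.json"]:
    print(uri, matches(uri))
```

Both patterns agree on every sample; the simplified one just avoids the leading scan and the unused capture group.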
[…]
Maxim D.
Hello!
On Mon, May 03, 2010 at 01:37:20AM -0400, dflook wrote:
I’m working together with Dale on this issue. We turned on debug
logging, and each time the worker process segfaults, it seems to
be right after checking SSL handshake. Am I reading this
correctly? Here are two examples in the excerpt below:
[…]
2010/05/02 22:27:53 [debug] 18478#0: *17046 posix_memalign: 0000000009336C10:4096 @16
2010/05/02 22:27:53 [debug] 18478#0: *17046 http check ssl handshake
2010/05/02 22:27:53 [debug] 18478#0: *17046 https ssl handshake: 0x16
2010/05/02 22:27:53 [notice] 15605#0: signal 17 (SIGCHLD) received
Yes, it’s right after nginx got the first bytes of the SSL handshake and
passed them to OpenSSL. This may indicate either a bug in the OpenSSL
library (or some corruption in your particular installation) or
some OpenSSL-related bug in nginx.
Have you tried to obtain a backtrace as I previously suggested?
Maxim D.
I’m working together with Dale on this issue. We turned on debug
logging, and each time the worker process segfaults, it seems to be
right after checking SSL handshake. Am I reading this correctly? Here
are two examples in the excerpt below:
2010/05/02 22:27:52 [debug] 18478#0: worker cycle
2010/05/02 22:27:52 [debug] 18478#0: epoll timer: 59998
2010/05/02 22:27:53 [debug] 18478#0: epoll: fd:7 ev:0001 d:00000000092BFB20
2010/05/02 22:27:53 [debug] 18478#0: accept on 38.127.224.114:443, ready: 0
2010/05/02 22:27:53 [debug] 18478#0: posix_memalign: 00000000092724B0:256 @16
2010/05/02 22:27:53 [debug] 18478#0: *17046 accept: 66.249.68.173 fd:22
2010/05/02 22:27:53 [debug] 18478#0: *17046 event timer add: 22: 60000:1272864533191
2010/05/02 22:27:53 [debug] 18478#0: *17046 epoll add event: fd:22 op:1 ev:80000001
2010/05/02 22:27:53 [debug] 18478#0: timer delta: 201
2010/05/02 22:27:53 [debug] 18478#0: posted events 0000000000000000
2010/05/02 22:27:53 [debug] 18478#0: worker cycle
2010/05/02 22:27:53 [debug] 18478#0: epoll timer: 59797
2010/05/02 22:27:53 [debug] 18478#0: epoll: fd:22 ev:0001 d:00000000092BFDE1
2010/05/02 22:27:53 [debug] 18478#0: *17046 malloc: 0000000009257330:1248
2010/05/02 22:27:53 [debug] 18478#0: *17046 posix_memalign: 00000000092617F0:256 @16
2010/05/02 22:27:53 [debug] 18478#0: *17046 malloc: 000000000928EB00:1024
2010/05/02 22:27:53 [debug] 18478#0: *17046 posix_memalign: 0000000009336C10:4096 @16
2010/05/02 22:27:53 [debug] 18478#0: *17046 http check ssl handshake
2010/05/02 22:27:53 [debug] 18478#0: *17046 https ssl handshake: 0x16
2010/05/02 22:27:53 [notice] 15605#0: signal 17 (SIGCHLD) received
2010/05/02 22:27:53 [alert] 15605#0: worker process 18478 exited on signal 11
2010/05/02 22:27:53 [debug] 15605#0: wake up, sigio 0
2010/05/02 22:27:53 [debug] 15605#0: reap children
2010/05/02 22:27:53 [debug] 15605#0: child: 0 18478 e:0 t:1 d:0 r:1 j:0
2010/05/02 22:27:53 [debug] 15605#0: channel 3:4
2010/05/02 22:27:53 [debug] 18584#0: malloc: 00000000092A4CD0:6144
2010/05/02 22:27:53 [debug] 18584#0: malloc: 00000000092BFA70:180224
2010/05/02 22:27:53 [debug] 18584#0: malloc: 00000000092EBA80:106496
2010/05/02 22:27:53 [debug] 18584#0: malloc: 0000000009305A90:106496
2010/05/02 22:27:53 [debug] 18584#0: epoll add event: fd:6 op:1 ev:00000001
2010/05/02 22:27:53 [debug] 18584#0: epoll add event: fd:7 op:1 ev:00000001
2010/05/02 22:27:53 [debug] 18584#0: epoll add event: fd:8 op:1 ev:00000001
2010/05/02 22:27:53 [debug] 18584#0: epoll add event: fd:9 op:1 ev:00000001
2010/05/02 22:27:53 [debug] 18584#0: epoll add event: fd:4 op:1 ev:00000001
2010/05/02 22:27:53 [debug] 18584#0: setproctitle: "nginx: worker process"
2010/05/02 22:27:53 [debug] 18584#0: worker cycle
2010/05/02 22:27:53 [debug] 18584#0: epoll timer: -1
2010/05/02 22:27:53 [notice] 15605#0: start worker process 18584
2010/05/02 22:27:53 [debug] 15605#0: sigsuspend
2010/05/02 22:27:53 [debug] 18584#0: epoll: fd:7 ev:0001 d:00000000092BFB20
2010/05/02 22:27:53 [debug] 18584#0: accept on 38.127.224.114:443, ready: 0
2010/05/02 22:27:53 [debug] 18584#0: posix_memalign: 0000000009272CF0:256 @16
2010/05/02 22:27:53 [debug] 18584#0: *17047 accept: 66.249.68.173 fd:3
2010/05/02 22:27:53 [debug] 18584#0: *17047 event timer add: 3: 60000:1272864533371
2010/05/02 22:27:53 [debug] 18584#0: *17047 epoll add event: fd:3 op:1 ev:80000001
2010/05/02 22:27:53 [debug] 18584#0: timer delta: 40
2010/05/02 22:27:53 [debug] 18584#0: posted events 0000000000000000
2010/05/02 22:27:53 [debug] 18584#0: worker cycle
2010/05/02 22:27:53 [debug] 18584#0: epoll timer: 60000
2010/05/02 22:27:53 [debug] 18584#0: epoll: fd:3 ev:0001 d:00000000092BFDE0
2010/05/02 22:27:53 [debug] 18584#0: *17047 malloc: 00000000092ABA60:1248
2010/05/02 22:27:53 [debug] 18584#0: *17047 posix_memalign: 000000000924FDA0:256 @16
2010/05/02 22:27:53 [debug] 18584#0: *17047 malloc: 000000000923D700:1024
2010/05/02 22:27:53 [debug] 18584#0: *17047 posix_memalign: 00000000092A3940:4096 @16
2010/05/02 22:27:53 [debug] 18584#0: *17047 http check ssl handshake
2010/05/02 22:27:53 [debug] 18584#0: *17047 https ssl handshake: 0x80
2010/05/02 22:27:53 [notice] 15605#0: signal 17 (SIGCHLD) received
2010/05/02 22:27:53 [alert] 15605#0: worker process 18584 exited on signal 11
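With two dozen crashes an hour, reading the debug log by eye gets tedious. The sketch below, in Python, is one possible way to pull out the last message each worker logged before the master reported it dead. It assumes every log entry sits on a single line (not wrapped by a mail client) and that the line layout matches the excerpt above; the function name last_lines_before_crash is made up for this example.

```python
import re

# timestamp, [level], pid#tid:, message
LINE_RE = re.compile(r'^(\S+ \S+) \[(\w+)\] (\d+)#\d+: (.*)$')
EXIT_RE = re.compile(r'worker process (\d+) exited on signal (\d+)')


def last_lines_before_crash(log_text, signal=11):
    """Return [(worker_pid, last_message_from_that_worker), ...]."""
    last_seen = {}  # pid -> most recent message logged by that pid
    crashes = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like log entries
        _ts, _level, pid, msg = m.groups()
        exit_m = EXIT_RE.search(msg)
        if exit_m and int(exit_m.group(2)) == signal:
            worker = exit_m.group(1)
            crashes.append((worker, last_seen.get(worker)))
        else:
            last_seen[pid] = msg
    return crashes


sample = """\
2010/05/02 22:27:53 [debug] 18478#0: *17046 http check ssl handshake
2010/05/02 22:27:53 [debug] 18478#0: *17046 https ssl handshake: 0x16
2010/05/02 22:27:53 [notice] 15605#0: signal 17 (SIGCHLD) received
2010/05/02 22:27:53 [alert] 15605#0: worker process 18478 exited on signal 11
"""
for pid, msg in last_lines_before_crash(sample):
    print(pid, msg)
```

If the last message is an "https ssl handshake" line for every crash in the full log, that supports the reading that the fault occurs once the handshake bytes are handed to the SSL layer.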
Hi Maxim, thank you so much for your feedback. Last I talked to Dflook
he was having trouble getting the backtrace and core dump due to
compilation problems. He is going to keep trying.
I just ran this per your request:
[root@lb01 ~]# /usr/local/nginx/sbin/nginx -V
nginx version: nginx/0.8.36
built by gcc 4.1.2 20080704 (Red Hat 4.1.2-46)
TLS SNI support disabled
configure arguments: --with-http_ssl_module
--with-http_stub_status_module
--http-log-path=/usr/local/www/logs/access.log
--error-log-path=/usr/local/www/logs/error.log --with-debug