Weird overnight crash with nginx

I found some weird nginx crashes in the kernel logs, and I can't figure
out why or what happened. I suppose it happened overnight.

This is from the kernel log:

nginx[31486]: segfault at c4 ip 080aacb5 sp bfd79b60 error 4 in
nginx[8048000+73000]
nginx[31484]: segfault at c4 ip 080aacb5 sp bfd79b60 error 4 in
nginx[8048000+73000]
nginx[1528]: segfault at c4 ip 080aacb5 sp bfd79bc0 error 4 in
nginx[8048000+73000]
nginx[1568]: segfault at c4 ip 080aacb5 sp bfd79bc0 error 4 in
nginx[8048000+73000]

and from error_log:
2010/03/21 09:42:35 [notice] 31483#0: signal 17 (SIGCHLD) received
2010/03/21 09:42:35 [alert] 31483#0: worker process 31486 exited on
signal 11
2010/03/21 09:42:35 [notice] 31483#0: start worker process 1528
2010/03/21 09:42:35 [notice] 31483#0: signal 29 (SIGIO) received
2010/03/21 09:43:03 [notice] 31483#0: signal 17 (SIGCHLD) received
2010/03/21 09:43:03 [alert] 31483#0: worker process 31484 exited on
signal 11
2010/03/21 09:43:03 [notice] 31483#0: start worker process 1568
2010/03/21 09:43:03 [notice] 31483#0: signal 29 (SIGIO) received
2010/03/21 09:43:09 [notice] 31483#0: signal 17 (SIGCHLD) received
2010/03/21 09:43:09 [alert] 31483#0: worker process 1528 exited on
signal 11
2010/03/21 09:43:09 [notice] 31483#0: start worker process 1582
2010/03/21 09:43:09 [notice] 31483#0: signal 29 (SIGIO) received
2010/03/21 09:45:09 [notice] 31483#0: signal 17 (SIGCHLD) received
2010/03/21 09:45:09 [alert] 31483#0: worker process 1568 exited on
signal 11
2010/03/21 09:45:09 [notice] 31483#0: start worker process 1757
2010/03/21 09:45:09 [notice] 31483#0: signal 29 (SIGIO) received
2010/03/21 12:00:13 [notice] 31483#0: signal 15 (SIGTERM) received,
exiting

Does anyone know what the issue could be?

And how am I supposed to produce the backtrace, given that I can't
reproduce the error? Or do you just need a simple backtrace?

Hello!

On Sun, Mar 21, 2010 at 12:00:39PM +0100, Robert G. wrote:

I found some weird nginx crashes in the kernel logs, and I can't figure
out why or what happened. I suppose it happened overnight.

[…]

Does anyone know what the issue could be?

If you want this to be investigated, you may want to provide the
following (a rough shell sketch for the first two items appears after
the list):

  1. nginx -V output

  2. config

  3. backtrace as obtained from coredump via gdb
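
For the first two items, a minimal shell sketch might look like this
(the binary path and the output file names are assumptions; adjust them
to your install):

/path/to/nginx -V 2>&1 | tee nginx-V.txt    # -V prints to stderr
cp /etc/nginx/nginx.conf nginx-conf-for-report.conf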

Maxim D.

Sorry, I can't reproduce the crash to get a backtrace, but here are the
config and nginx -V output:

System: Ubuntu Linux 9.10 Server with kernel: 2.6.33-server (custom
kernel build)

nginx version: nginx/0.8.34
TLS SNI support enabled
configure arguments: --prefix=/applications/nginx
--conf-path=/etc/nginx/nginx.conf --with-http_ssl_module
--with-http_realip_module --with-http_addition_module
--with-http_flv_module --with-http_gzip_static_module
--with-http_sub_module --http-log-path=/var/log/nginx/access_log
--with-http_perl_module --user=www-data --group=www-data
--http-fastcgi-temp-path=/applications/nginx/tmp/fastcgi
--http-client-body-temp-path=/applications/nginx/tmp/client
--http-proxy-temp-path=/applications/nginx/tmp/proxy
--pid-path=/var/run/nginx.pid --error-log-path=/var/log/nginx/error_log
--with-sha1=/usr/lib --with-md5=/usr/lib --with-file-aio

nginx config:

user www-data;
worker_processes 2;
worker_cpu_affinity 0101 1010;
pid /var/run/nginx.pid;
error_log /var/log/nginx/error_log info;

events {
worker_connections 1024;
use epoll;
}

http {
include mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
client_body_timeout 60;
client_header_timeout 60;
send_timeout 60;
server_tokens off;
server_names_hash_bucket_size 256;
server_names_hash_max_size 512;
aio on;
directio 1m;
output_buffers 1 128k;
sendfile off;
tcp_nopush off;
tcp_nodelay off;
keepalive_timeout 0;
gzip on;
gzip_comp_level 9;
gzip_buffers 16 8k;
gzip_http_version 1.0;
gzip_min_length 1024;
gzip_vary on;
gzip_proxied off;
gzip_disable msie6;
gzip_types text/plain text/css text/xml text/javascript
application/x-javascript application/xml application/xml+rss;

include /etc/nginx/site-hosts/*;
include /etc/nginx/site-users/*;
include /etc/nginx/site-virtual/*;

}

Hello!

On Thu, Mar 25, 2010 at 08:51:17AM +0100, Robert G. wrote:

--with-http_realip_module --with-http_addition_module

[…]

include /etc/nginx/site-hosts/*;
include /etc/nginx/site-users/*;
include /etc/nginx/site-virtual/*;

You may want to provide the included files as well; they are part of
the config. Or at least samples, if they are identical.

Maxim D.

There are over 300 of them; are you sure?

Hello!

On Mon, Mar 22, 2010 at 04:14:33PM +0100, Robert G. wrote:

And how am I supposed to produce the backtrace, given that I can't
reproduce the error? Or do you just need a simple backtrace?

You have to configure your system to dump cores, and once you have a
coredump, run

gdb /path/to/nginx /path/to/nginx.core

then in gdb:

bt

The procedure to enable core dumps differs depending on your OS, but it
may be simplified using nginx's own global directives worker_rlimit_core
and working_directory, e.g.:

worker_rlimit_core 500m;
working_directory /path/to/corefiles;

nginx must have write access to the '/path/to/corefiles' directory.
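
If you go the OS route instead of (or in addition to) the directives
above, on Linux the steps might look roughly like this (a sketch only;
the core_pattern path is an assumption, and the ulimit usually has to
be set in whatever init script starts nginx):

ulimit -c unlimited
echo '/path/to/corefiles/core.%e.%p' > /proc/sys/kernel/core_pattern
# %e is the executable name and %p the pid; see core(5) for all specifiers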

Note well: it's a good idea to make sure your nginx binary isn't
stripped (e.g. via the file(1) command).
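
As a quick check (the exact wording of file(1) output varies between
versions):

file /path/to/nginx
# an unstripped ELF binary is reported with a trailing "not stripped";
# if it says "stripped" instead, use a binary with symbols before
# collecting the backtrace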

If you are unable to reproduce the segmentation fault and therefore
unable to obtain a coredump, it's still a good idea to provide nginx
-V output and your config. There is a chance that the segmentation
fault you've seen was already fixed, or was caused by a known bad
configuration.

In nginx 0.8.34 I'm currently aware of at least 4 possible
segfaults: 2 caused by bugs (in fastcgi stderr handling and in
subrequest loop handling; patches are available) and 2 caused by
known bad configurations (error_page 400 redirected to a named
location, and "if" usage as outlined on the IfIsEvil wiki page).
That's not even counting older versions.
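
Given the several hundred included files mentioned elsewhere in this
thread, one rough way to look for those two configuration patterns is
to grep the include directories from the main config (a sketch only; a
match here does not by itself prove the cause):

grep -rn 'error_page 400' /etc/nginx/site-hosts /etc/nginx/site-users /etc/nginx/site-virtual
grep -rn 'if (' /etc/nginx/site-hosts /etc/nginx/site-users /etc/nginx/site-virtual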

Maxim D.