104: Connection reset by peer

ron_ramos · May 8, 2013, 1:30pm

Hi all,

I understand that this is a generic error, but it has been frustrating trying to solve this issue, and I have not been able to find an answer anywhere. Basically, I have an application that runs fine under Apache, but we wanted to try nginx/php5-fpm.

Some parts of my application hit this connection reset issue. Sometimes they work, but not consistently. When it stops working, I restart php5-fpm and it works again, but after some time the same issue comes back.

I’m using:

nginx -v

nginx version: nginx/1.4.0

php5-fpm -v

PHP 5.4.14-1~precise+1 (fpm-fcgi) (built: Apr 11 2013 17:18:51)
Copyright © 1997-2013 The PHP Group
Zend Engine v2.4.0, Copyright © 1998-2013 Zend Technologies

When I enable debug logging, this is the only thing I can see:

[08-May-2013 18:08:32.545758] DEBUG: pid 358359, fpm_got_signal(), line 72: received SIGCHLD
[08-May-2013 18:08:32.545884] WARNING: pid 358359, fpm_children_bury(), line 252: [pool legacy] child 358423 exited with code 3 after 342.958315 seconds from start
[08-May-2013 18:08:32.548943] NOTICE: pid 358359, fpm_children_make(), line 421: [pool legacy] child 359398 started
[08-May-2013 18:08:32.549023] DEBUG: pid 358359, fpm_event_loop(), line 411: event module triggered 1 events
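
For anyone hitting the same symptom, here is a minimal Python sketch for pulling child-exit events out of a php-fpm master log like the excerpt above. It assumes the messages follow the "child N exited with code C" wording shown there, or the "exited on signal N" wording that, as far as I know, php-fpm uses when a worker is killed by a signal. The function name and the returned keys are made up for illustration.

```python
import re

# Pattern for php-fpm "child exited" messages (wording assumed from the log above).
CHILD_EXIT = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] child (?P<pid>\d+) exited "
    r"(?:with code (?P<code>\d+)|on signal (?P<signal>\d+))"
    r".*?after (?P<uptime>[\d.]+) seconds"
)


def find_child_exits(log_text):
    """Return one dict per child-exit message found in log_text."""
    # Collapse all whitespace first, so hard-wrapped log lines still match.
    flat = " ".join(log_text.split())
    exits = []
    for m in CHILD_EXIT.finditer(flat):
        exits.append({
            "pool": m.group("pool"),
            "pid": int(m.group("pid")),
            "code": int(m.group("code")) if m.group("code") else None,
            "signal": int(m.group("signal")) if m.group("signal") else None,
            "uptime": float(m.group("uptime")),
        })
    return exits
```

A worker that keeps exiting with a non-zero code or on a signal, well before it could have reached its request limit, hints at a crash rather than a routine respawn.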

I have tried different config changes, like using static instead of dynamic, increasing max_requests, increasing the number of children, increasing the number of servers, etc.

The one thing I really need is to identify what is causing the connection reset by peer, but the logs are not really helping. I tried strace and it still did not show me anything.

Is there any other way to debug or identify what is causing this issue? I am totally clueless right now.

Thank you in advance.

Regards,
Ron

ron_ramos · May 8, 2013, 1:54pm

After a very long search on Google (almost 15s, including keyboard input), I found astonishing help, based on the information you provided.

About the FPM children burying, I found a resource on StackOverflow linking back to the Nginx forum (ML archive):

It seems, at first glance, that burying a child and respawning it works as intended when the request limit is reached. I don't know how to check whether that is the case here, though. Your log entries seem to be silent about that.

My 2 cents,

B. R.

ron_ramos · May 8, 2013, 1:58pm

hi,

I’ve seen that info as well (yes, I tried searching for answers, as mentioned) and unfortunately it did not help me.
I’ve increased the limit from 500 to 1000 to 10000, and increased children, servers, etc.

regards,
ron

ron_ramos · May 8, 2013, 2:12pm

Do you have some information on the number of concurrent connections? Since you already played with the ‘max_requests’ parameter too, it does not seem to be the cause of the trouble. But better safe than sorry.

If your actual number of connections per second is greater than what the configuration expects, then you’ll have your answer. But you have not provided information that allows deciding on this, and it seems you tried blind changes only.
Could you provide your incoming request rate?
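
One rough way to get that number is to count requests per second from the nginx access log. The Python sketch below assumes the default "combined" log format, where each line carries a timestamp like [08/May/2013:18:08:32 +0000]; the function name is hypothetical, and a custom log_format would need a different pattern.

```python
import re
from collections import Counter

# Timestamp as written by nginx's default "combined" access log format (assumed).
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def peak_second(lines):
    """Return (timestamp, request_count) for the busiest second, or None."""
    counts = Counter()
    for line in lines:
        m = TIMESTAMP.search(line)
        if m:
            # The timestamp has one-second resolution, so it doubles as the bucket key.
            counts[m.group(1)] += 1
    if not counts:
        return None
    return counts.most_common(1)[0]
```

Comparing the busiest second against the number of workers the pool is allowed to run gives a first idea of whether the pool is simply too small for the load.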

B. R.

ron_ramos · May 8, 2013, 2:29pm

Hi all,

My apologies, but I think the issue is that PHP is crashing. I was not looking at the syslog:

[2617248.127349] php5-fpm[504727] general protection ip:6787a9 sp:7fff22c5c6f0 error:0 in php5-fpm[400000+700000]

The gdb backtrace shows:

#0 0x00000000006787a9 in ?? ()
#1 0x0000000000678940 in ?? ()
#2 0x00000000006ad82e in zend_hash_destroy ()
#3 0x000000000069e4ab in _zval_dtor_func ()
#4 0x000000000069031a in _zval_ptr_dtor ()
#5 0x00000000006ad808 in zend_hash_destroy ()
#6 0x000000000069e4ab in _zval_dtor_func ()
#7 0x000000000069031a in _zval_ptr_dtor ()
#8 0x00007fe9a25c056f in apc_free_class_entry_after_execution (src=0x215e228) at /tmp/pear/temp/APC/apc_compile.c:1992
#9 0x00007fe9a25c3ad6 in apc_deactivate () at /tmp/pear/temp/APC/apc_main.c:948
#10 apc_request_shutdown () at /tmp/pear/temp/APC/apc_main.c:1042
#11 0x00007fe9a25b85b5 in zm_deactivate_apc (type=, module_number=) at /tmp/pear/temp/APC/php_apc.c:407
#12 0x00000000006a6d94 in ?? ()
#13 0x000000000063efd5 in php_request_shutdown ()
#14 0x000000000042d5b9 in ?? ()
#15 0x00007fe9a2dbe76d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x000000000042e231 in _start ()
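
A small Python sketch of the arithmetic behind the kernel line above: the bracketed pair after the binary name is the mapping's start address and size, so subtracting the start from ip gives the offset of the faulting instruction inside that mapping. The regular expression only targets the "general protection" wording shown here, and the function and key names are hypothetical.

```python
import re

# Matches kernel lines of the form seen above (other fault messages use other wording).
FAULT = re.compile(
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_fault(line):
    """Return details of a 'general protection' kernel log line, or None."""
    # Collapse whitespace so a hard-wrapped line still matches.
    m = FAULT.search(" ".join(line.split()))
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "object": m.group("obj"),
        "ip": ip,
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # True when ip really lies inside the reported mapping.
        "inside_mapping": base <= ip < base + size,
    }
```

For the line in this thread the offset works out to 0x2787a9, and the ip itself (0x6787a9) is the address gdb reports for frame #0. Whether a symbolizer wants the absolute address or the offset depends on how the binary was built, so treat that part as something to check.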

APC seems to be dying. I will try to check what’s wrong. Again, my apologies for disturbing the list.

regards,
Ron

ron_ramos · May 8, 2013, 2:36pm

You were right to look for answers somewhere other than the configuration, then… ;o)

Glad you found your answer.
I hope you’ll find your way around that crash.

B. R.

ron_ramos · May 8, 2013, 2:32pm

Hi B.R.

To answer your question: I’m only sending a maximum of 10 connections to nginx from my load balancer. Even if I take it out of the load balancer, so that I am the only one accessing the server, it still happens. Thanks!

Regards,
Ron
