1.3.11 Issues?

aris · January 14, 2013, 11:50pm

Has anyone else seen problems with Nginx 1.3.11? Everything works
perfectly fine with the same compile options under 1.3.10… It’s a pretty
vanilla compile as far as modules go… proxy, SSI, SSL and status modules.
Using the SPDY patch (the correct patch for each version of Nginx, and
the correct SPDY module compile option on 1.3.11)…

Right when 1.3.11 starts being used, we get this stuff in the error log:

*** Error in `nginx: worker process': malloc(): memory corruption: 0x00000000009084f0 ***
2013/01/14 14:34:16 [alert] 16599#0: *137 getsockname() failed (9: Bad file descriptor) while SPDY processing, client: 174.70.187.121, server: dpstatic.com, request: "GET /j/tinymce/tiny_mce.js?_v=bba17b4a HTTP/1.1", host: "x.dpstatic.com", referrer: "https://dev.digitalpoint.com/threads/anybody-has-experience-with-ordering-at-answerserp-com-for-yahoo-backlinks.2624660/"
2013/01/14 14:34:16 [alert] 16599#0: *137 getsockname() failed (9: Bad file descriptor) while SPDY processing, client: 174.70.187.121, server: dpstatic.com, request: "GET /j/tinymce/tiny_mce.js?_v=bba17b4a HTTP/1.1", host: "x.dpstatic.com", referrer: "https://dev.digitalpoint.com/threads/anybody-has-experience-with-ordering-at-answerserp-com-for-yahoo-backlinks.2624660/"
*** Error in `nginx: worker process': malloc(): memory corruption: 0x000000000088cc00 ***

Ended up going back to 1.3.10, and everything back to normal…

Posted at Nginx Forum:

digitalpoint · January 15, 2013, 12:01am

On Tuesday 15 January 2013 02:50:14 digitalpoint wrote:

2013/01/14 14:34:16 [alert] 16599#0: *137 getsockname() failed (9: Bad file
at-answerserp-com-for-yahoo-backlinks.2624660/" *** Error in `nginx: worker
process’: malloc(): memory corruption: 0x000000000088cc00 ***

Ended up going back to 1.3.10, and everything back to normal…

Could you show the output of nginx -V?

wbr, Valentin V. Bartenev

–

http://nginx.org/en/donation.html

digitalpoint · January 15, 2013, 12:16am

On Tuesday 15 January 2013 03:05:26 digitalpoint wrote:

--pid-path=/var/run/nginx.pid --error-log-path=/usr/log/ngnix/error.log
--http-log-path=/usr/log/ngnix/access.log
--with-openssl=/home/software_source/openssl-1.0.1c --with-cc-opt='-I
/usr/local/ssl/include' --with-ld-opt='-L /usr/local/ssl/lib'
--without-http_proxy_module --without-http_ssi_module
--with-http_ssl_module --with-http_stub_status_module

The only difference with the compile options when I did it for 1.3.11 was I
added “--with-http_spdy_module” to the end since it needs it for 1.3.11+

Ok, thank you. It seems I have already figured out where the problem is.

Please, try the new patch:
http://nginx.org/patches/spdy/patch.spdy-56_1.3.11.txt

wbr, Valentin V. Bartenev

digitalpoint · January 15, 2013, 12:47am

Nope… still same issue.

nginx version: nginx/1.3.11
built by gcc 4.5.1 20101208 [gcc-4_5-branch revision 167585] (SUSE Linux)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx
--pid-path=/var/run/nginx.pid --error-log-path=/usr/log/ngnix/error.log
--http-log-path=/usr/log/ngnix/access.log
--with-openssl=/home/software_source/openssl-1.0.1c --with-cc-opt='-I
/usr/local/ssl/include' --with-ld-opt='-L /usr/local/ssl/lib'
--without-http_proxy_module --without-http_ssi_module
--with-http_ssl_module
--with-http_stub_status_module --with-http_spdy_module

End of error.log:

*** Error in `nginx: worker process': malloc(): memory corruption: 0x00000000009947f0 ***

digitalpoint · January 15, 2013, 1:07am

On Tuesday 15 January 2013 03:46:43 digitalpoint wrote:

--without-http_proxy_module --without-http_ssi_module
--with-http_ssl_module --with-http_stub_status_module
--with-http_spdy_module

End of error.log:

*** Error in `nginx: worker process': malloc(): memory corruption: 0x00000000009947f0 ***

Thank you for testing. Could you create a debug log for the issue?
See this link for instructions:
http://nginx.org/en/docs/debugging_log.html

wbr, Valentin V. Bartenev

digitalpoint · January 15, 2013, 1:22am

Here ya go:

http://www.shawnhogan.com/error.log.gz

digitalpoint · January 15, 2013, 2:23am

On Tuesday 15 January 2013 04:21:43 digitalpoint wrote:

Here ya go:

http://www.shawnhogan.com/error.log.gz

Thanks a lot. It was really helpful. I believe the problem is fixed now:
http://nginx.org/patches/spdy/patch.spdy-57_1.3.11.txt

wbr, Valentin V. Bartenev

digitalpoint · January 15, 2013, 12:05am

ya sorry, was just coming back to follow up with that… after I posted
that, I was wondering why in the hell I was compiling with SSI module…
it’s actually WITHOUT SSI.

This is back on 1.3.10 after I rolled it back after the issues:

nginx version: nginx/1.3.10
built by gcc 4.5.1 20101208 [gcc-4_5-branch revision 167585] (SUSE Linux)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx
--pid-path=/var/run/nginx.pid --error-log-path=/usr/log/ngnix/error.log
--http-log-path=/usr/log/ngnix/access.log
--with-openssl=/home/software_source/openssl-1.0.1c --with-cc-opt='-I
/usr/local/ssl/include' --with-ld-opt='-L /usr/local/ssl/lib'
--without-http_proxy_module --without-http_ssi_module
--with-http_ssl_module
--with-http_stub_status_module

The only difference with the compile options when I did it for 1.3.11 was I
added “--with-http_spdy_module” to the end since it needs it for 1.3.11+

digitalpoint · January 15, 2013, 2:56am

Well… the underlying errors went away, but it seems the new SPDY patch
broke being able to handle multiple hosts on the same SPDY connection
(it worked under 1.3.10 just fine).

For example, we have an SSL cert for both digitalpoint.com and dpstatic.com
(dpstatic.com is a cookieless domain for serving static content), so SPDY
attempts to use the same connection for multiple hosts. See the SPDY session
list here:
http://f.cl.ly/items/0T1u3g0h0e1A0D1g2N0s/Image%202013.01.08%2011:59:48%20AM.png

With the SPDY patch for 1.3.11, requests to *.dpstatic.com are now
actually being sent to digitalpoint.com (and getting a file not found).
So somehow during a SPDY connection, the host for an individual request is
being ignored somewhere along the way.

Top browser is Chrome (SPDY connection), bottom browser is Safari (no SPDY
support)… the end result is that a SPDY connection yields different results
vs the “traditional” SSL connection:
http://f.cl.ly/items/3K1Q2N1I3B000c0b0614/Image%202013.01.14%205:52:31%20PM.png

Again, this worked as expected (ability for SPDY to properly share a
connection across multiple hosts) with 1.3.10.

digitalpoint · January 15, 2013, 4:18am

On Tuesday 15 January 2013 05:55:43 digitalpoint wrote:

Again, this worked as expected (ability for SPDY to properly share a
connection across multiple hosts) with 1.3.10.

There is no difference between 1.3.10 and 1.3.11 in terms of SPDY.
In fact, 1.3.10 has serious bugs (see: http://nginx.org/en/CHANGES),
and you should use 1.3.11 instead.

The big difference is between the spdy54 and spdy55+ patches. A large part of
the SPDY implementation was rewritten in spdy55, and some relevant parts
of nginx also got new code.

One of those changes makes nginx more RFC 6066 compliant. Here are some
quotes:

Server Name Indication

[…]

If an application negotiates a server name using an application
protocol and then upgrades to TLS, and if a server_name extension is
sent, then the extension SHOULD contain the same name that was
negotiated in the application protocol. If the server_name is
established in the TLS session handshake, the client SHOULD NOT
attempt to request a different server name at the application layer.

11.1. Security Considerations for server_name

[…]

Since it is possible for a client to present a different server_name
in the application protocol, application server implementations that
rely upon these names being the same MUST check to make sure the
client did not present a different name in the application protocol.

from RFC 6066 - Transport Layer Security (TLS) Extensions: Extension Definitions

And you will not find any information in the SPDY draft 2 specification about
the “ability for SPDY to properly share a connection across multiple hosts”:
http://dev.chromium.org/spdy/spdy-protocol/spdy-protocol-draft2

Apparently, by making TLS SNI in nginx more RFC-compliant, I unintentionally
broke SPDY.
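
To make the check described in the RFC quote concrete, here is a minimal Python sketch. It is not nginx's actual code; the function names `normalize_host` and `host_matches_sni` are illustrative assumptions. It compares the server name from the TLS handshake with the host requested at the application layer, ignoring case, a trailing dot, and any port suffix.

```python
def normalize_host(host: str) -> str:
    """Lower-case a host name and drop an optional :port and trailing dot."""
    host = host.strip().lower()
    # Bracketed IPv6 literals: keep the bracketed part, drop any port.
    if host.startswith("["):
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    name, _, _port = host.partition(":")
    return name.rstrip(".")


def host_matches_sni(sni_name: str, request_host: str) -> bool:
    """Return True when the application-layer host equals the SNI name."""
    return normalize_host(sni_name) == normalize_host(request_host)


if __name__ == "__main__":
    # Hostnames taken from this thread, used purely as sample input.
    print(host_matches_sni("digitalpoint.com", "DigitalPoint.com:443"))
    print(host_matches_sni("digitalpoint.com", "x.dpstatic.com"))
```

Under a strict comparison like this, a request for x.dpstatic.com arriving on a connection whose SNI name was digitalpoint.com would be flagged as a mismatch. What a server then does with such a request is a separate policy decision.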

Well, it’s safe to use spdy54 with 1.3.11:
http://nginx.org/patches/spdy/patch.spdy-54.txt
and I recommend you use it while I think about a solution.

Thanks again for testing. I hope to fix the issue soon.

wbr, Valentin V. Bartenev

digitalpoint · January 27, 2013, 4:22pm

On Tuesday 15 January 2013 07:50:30 digitalpoint wrote:

Yeah… the problem is that while it might not be part of the SPDY 2 draft
to share connections across multiple hosts, Chrome most certainly is doing
it (and probably other browsers) as you can see from the previous
screenshot.

Either way, you guys are doing a crazy awesome job… keep it up.

Please, try the new patch:
http://nginx.org/patches/spdy/patch.spdy-59_1.3.11.txt

The problem should be fixed now.

wbr, Valentin V. Bartenev

digitalpoint · January 27, 2013, 11:01pm

So far so good… seems to be working fine.

digitalpoint · January 28, 2013, 8:21pm

Valentin V. Bartenev Wrote:

Please, try the new patch:
http://nginx.org/patches/spdy/patch.spdy-59_1.3.11.txt

The problem should be fixed now.

wbr, Valentin V. Bartenev

Is there a possibility the patch introduced an issue where connections don’t
expire (like ever)?

Our load balancer in front of web server shows 1,511 connections, Nginx is
reporting 10,810 connections, and the number of connections as reported by
Nginx is just growing and growing and does not even remotely coincide with
actual traffic/users/connections. This only seemed to start after the
patch.

http://f.cl.ly/items/0u2E2U0P3L280l3X033s/Image%202013.01.28%2011:19:46%20AM.png

digitalpoint · January 15, 2013, 4:51am

Yeah… the problem is that while it might not be part of the SPDY 2 draft
to share connections across multiple hosts, Chrome most certainly is doing
it (and probably other browsers) as you can see from the previous
screenshot.

Either way, you guys are doing a crazy awesome job… keep it up.

digitalpoint · January 28, 2013, 9:33pm

Valentin V. Bartenev Wrote:

actual traffic/users/connections. This only seemed to start after

That does look like it’s the case…

2013/01/28 00:38:19 [alert] 30235#0: worker process 12408 exited on signal 11
2013/01/28 02:00:24 [alert] 30235#0: worker process 15737 exited on signal 11
2013/01/28 02:24:31 [alert] 30235#0: worker process 14897 exited on signal 11
2013/01/28 02:25:19 [alert] 30235#0: worker process 22628 exited on signal 11
2013/01/28 03:03:59 [alert] 30235#0: worker process 23528 exited on signal 11
2013/01/28 03:17:58 [alert] 30235#0: worker process 7916 exited on signal 11
2013/01/28 03:37:03 [alert] 30235#0: worker process 24767 exited on signal 11
2013/01/28 04:02:07 [alert] 30235#0: worker process 23483 exited on signal 11
2013/01/28 04:27:26 [alert] 30235#0: worker process 25164 exited on signal 11
2013/01/28 05:41:34 [alert] 30235#0: worker process 19980 exited on signal 11
2013/01/28 05:47:18 [alert] 30235#0: worker process 26566 exited on signal 11
2013/01/28 08:39:35 [alert] 30235#0: worker process 30184 exited on signal 11
2013/01/28 08:46:40 [alert] 30235#0: worker process 3482 exited on signal 11
2013/01/28 09:42:16 [alert] 30235#0: worker process 27395 exited on signal 11
2013/01/28 10:14:01 [alert] 30235#0: worker process 5439 exited on signal 11
2013/01/28 10:28:36 [alert] 30235#0: worker process 29917 exited on signal 11
2013/01/28 11:17:04 [alert] 30235#0: worker process 6869 exited on signal 11

I guess I should roll back to the old SPDY patch…
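
For anyone who wants to tally crashes like these, a rough Python sketch follows. It assumes each alert sits on a single line in the error log and contains the text "worker process PID exited on signal N", as in the entries above. The function name `count_signal_exits` and the unrelated third sample line are only for illustration.

```python
import re
from collections import Counter

# Matches the alert nginx's master process logs when a worker dies on a signal.
EXIT_RE = re.compile(r"worker process (\d+) exited on signal (\d+)")


def count_signal_exits(lines):
    """Return a Counter mapping signal number -> number of worker exits."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2013/01/28 00:38:19 [alert] 30235#0: worker process 12408 exited on signal 11",
        "2013/01/28 02:00:24 [alert] 30235#0: worker process 15737 exited on signal 11",
        "an unrelated log line (hypothetical)",
    ]
    for signal_number, total in count_signal_exits(sample).items():
        print(f"signal {signal_number}: {total} worker exit(s)")
```

To run it against a real log, pass an open file object instead of the sample list, since iterating a file yields one line at a time.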

digitalpoint · January 28, 2013, 8:35pm

On Monday 28 January 2013 23:20:27 digitalpoint wrote:

Is there a possibility the patch introduced an issue where connections
don’t expire (like ever)?

Our load balancer in front of web server shows 1,511 connections, Nginx is
reporting 10,810 connections, and the number of connections as reported by
Nginx is just growing and growing and does not even remotely coincide with
actual traffic/users/connections. This only seemed to start after the
patch.

Do you mean numbers from stub_status? Then please check the error log.
It may also indicate periodically crashing worker processes.
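
To compare the stub_status numbers with the load balancer over time, something like the following Python sketch could be used. It assumes the plain-text layout that the stub_status module prints (an "Active connections" line, three totals, then Reading/Writing/Waiting). The function name `parse_stub_status` and the sample values are illustrative assumptions, not data from this server.

```python
import re


def parse_stub_status(text):
    """Parse stub_status plain-text output into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unrecognized stub_status output")
    accepts, handled, requests = (int(v) for v in totals.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


if __name__ == "__main__":
    # Sample values are made up for illustration.
    sample = (
        "Active connections: 291 \n"
        "server accepts handled requests\n"
        " 16630948 16630948 31070465 \n"
        "Reading: 6 Writing: 179 Waiting: 106 \n"
    )
    print(parse_stub_status(sample))
```

Logging the "active" value at intervals next to the load balancer's own count would show whether the gap keeps widening.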

wbr, Valentin V. Bartenev
