Dropped https client connection doesn't drop backend proxy_pass connection

Hi

I’m trying to set up nginx to proxy a Server-Sent Events connection
(http://dev.w3.org/html5/eventsource/) to a backend server.

The approach is that the browser connects to a particular path, which
checks the cookies to see whether the connection is authorised, and then
returns an X-Accel-Redirect header pointing at a separate internal path
that does the proxying. That separate internal path is configured like
this:

location ^~ /pushevents/ {
  internal;

  # Immediately send backend responses back to the client
  proxy_buffering off;
  # Disable keepalive to the browser
  keepalive_timeout 0;
  # It's a long-lived backend connection with potentially a long time
  # between push events, so make sure the proxy doesn't time out
  proxy_read_timeout 7200;

  proxy_pass http://pushbackend;
}
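
For reference, the front-end location that does the cookie check looks
something like this (a rough sketch only; the /events/ path matches the
logs below, but the authbackend upstream name is illustrative, not our
exact config):

location = /events/ {
  # The auth backend validates the session cookie; on success it
  # replies with "X-Accel-Redirect: /pushevents/..." and nginx
  # re-dispatches the request to the internal location above.
  proxy_pass http://authbackend;
}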

The backend service listening on pushbackend then holds the connection
open, and returns data down the connection whenever a push event occurs,
which nginx proxies straight back to the browser because buffering is
turned off.

One important thing here is that if the client drops the connection to
nginx, nginx should also drop the connection to the backend server. This
is done by making sure we do not change proxy_ignore_client_abort, which
defaults to off.

Testing this out, it all works fine for http browser connections, but
appears to be broken for https browser connections.

When using http with my test client, I see this in the nginx log when
the client disconnects:

2013/03/14 23:27:27 [info] 27717#0: *1 client prematurely closed connection, so upstream connection is closed too while reading upstream, client: 10.50.1.14, server: www.…, request: "GET /events/ HTTP/1.0", upstream: "http://127.0.0.4:80/.../", host: "…"

And when I check netstat, I see that the connection from nginx →
pushbackend has been dropped as well.

However for https connections, this is not what happens. Instead, it
appears nginx fails to detect that the client has closed the connection,
and leaves the nginx → pushbackend connection open (confirmed with
netstat). And because we set proxy_read_timeout to 2 hours, it takes 2
hours for that connection to be closed! That’s bad.

This is nginx 1.2.7 compiled with openssl 1.0.1e.

With debug logging enabled, when I close the client connection over
http I see this in the log:

2013/03/15 00:11:33 [debug] 28285#0: epoll: fd:3 ev:0005 d:00007F6B414AA150
2013/03/15 00:11:33 [debug] 28285#0: *1 http run request: "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http upstream check client, write event:0, "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http upstream recv(): 0 (11: Resource temporarily unavailable)
2013/03/15 00:11:33 [info] 28285#0: *1 client prematurely closed connection, so upstream connection is closed too while reading upstream, client: 10.50.1.14, server: insecure.…, request: "GET /events/ HTTP/1.0", upstream: "http://127.0.0.4/.../", host: "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 finalize http upstream request: 499
2013/03/15 00:11:33 [debug] 28285#0: *1 finalize http proxy request
2013/03/15 00:11:33 [debug] 28285#0: *1 free rr peer 1 0
2013/03/15 00:11:33 [debug] 28285#0: *1 close http upstream connection: 32
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001847BC0, unused: 48
2013/03/15 00:11:33 [debug] 28285#0: *1 event timer del: 32: 1363327885296
2013/03/15 00:11:33 [debug] 28285#0: *1 reusable connection: 0
2013/03/15 00:11:33 [debug] 28285#0: *1 http output filter "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http copy filter: "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http postpone filter "…" 00007FFF614BAFE0
2013/03/15 00:11:33 [debug] 28285#0: *1 http copy filter: -1 "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http finalize request: -1, "…" a:1, c:1
2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate request count:1
2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate cleanup count:1 blk:0
2013/03/15 00:11:33 [debug] 28285#0: *1 http posted request: "…"
2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate handler count:1
2013/03/15 00:11:33 [debug] 28285#0: *1 http request count:1 blk:0
2013/03/15 00:11:33 [debug] 28285#0: *1 http close request
2013/03/15 00:11:33 [debug] 28285#0: *1 http log handler
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001952800
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001895EA0
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001846BB0, unused: 0
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001894E90, unused: 3
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001896EB0, unused: 1771
2013/03/15 00:11:33 [debug] 28285#0: *1 close http connection: 3
2013/03/15 00:11:33 [debug] 28285#0: *1 reusable connection: 0
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 00000000018467A0
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 00000000018461A0
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 000000000183B210, unused: 8
2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001846690, unused: 128
2013/03/15 00:11:33 [debug] 28285#0: timer delta: 3247
2013/03/15 00:11:33 [debug] 28285#0: posted events 0000000000000000
2013/03/15 00:11:33 [debug] 28285#0: worker cycle
2013/03/15 00:11:33 [debug] 28285#0: epoll timer: -1

When doing the same thing with an https connection (openssl s_client)
and closing the connection I see this in the log:

2013/03/15 00:10:23 [debug] 28237#0: epoll: fd:3 ev:0005 d:00007F6ECDCAE150
2013/03/15 00:10:23 [debug] 28237#0: *1 http run request: "…"
2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream check client, write event:0, "…"
2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream recv(): 1 (11: Resource temporarily unavailable)
2013/03/15 00:10:23 [debug] 28237#0: *1 http run request: "…"
2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream process non buffered downstream
2013/03/15 00:10:23 [debug] 28237#0: *1 event timer del: 32: 1363327814331
2013/03/15 00:10:23 [debug] 28237#0: *1 event timer add: 32: 7200000:1363327823956
2013/03/15 00:10:23 [debug] 28237#0: timer delta: 4626
2013/03/15 00:10:23 [debug] 28237#0: posted events 0000000000000000
2013/03/15 00:10:23 [debug] 28237#0: worker cycle
2013/03/15 00:10:23 [debug] 28237#0: epoll timer: 7200000

Is this an nginx bug?


Rob M.

Hello!

On Fri, Mar 15, 2013 at 03:22:31PM +1100, Robert Mueller wrote:

[…]

However for https connections, this is not what happens. Instead, it
appears nginx fails to detect that the client has closed the connection,
and leaves the nginx → pushbackend connection open (confirmed with
netstat). And because we set proxy_read_timeout to 2 hours, it takes 2
hours for that connection to be closed! That’s bad.

You shouldn’t rely on connection close being detected by nginx.
It’s a generally useful optimization, but it’s not something that
is guaranteed to work.

It is generally not possible to check whether the connection was
closed if there is pending data in it. One has to read all pending
data before the connection close can be detected, but this doesn’t
work as a generic detection mechanism, as one would have to buffer
the read data somewhere.

As of now, nginx uses:

  1. The EV_EOF flag as reported by kqueue. This only works if you
     use kqueue, i.e. on FreeBSD and friends.

  2. A recv(MSG_PEEK) call to test whether the connection was
     closed. This works on all platforms, but only if there is no
     pending data.
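
In other words, the recv(MSG_PEEK) check is roughly the following (a
simplified sketch of the technique in C, not nginx's actual code):

#include <errno.h>
#include <sys/socket.h>

/* Probe a non-blocking socket for EOF without consuming data.
 * Returns 1 if an orderly close has been seen, 0 if the connection
 * still looks alive (or is undecidable), -1 on a hard error. */
static int
peek_for_eof(int fd)
{
    char     buf[1];
    ssize_t  n;

    n = recv(fd, buf, 1, MSG_PEEK);

    if (n == 0) {
        return 1;       /* EOF: peer closed and no data is pending */
    }

    if (n == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return 0;   /* nothing to read; assume still open */
        }
        return -1;      /* connection reset or other hard error */
    }

    /* n == 1: there is unread pending data. For https this is the
     * usual case on close (SSL close_notify records), so EOF cannot
     * be distinguished from a live connection by this test alone. */
    return 0;
}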

In the case of https, in many (most) cases there is pending data,
due to the various SSL packets sent during connection close. This
means connection close detection with https doesn’t work unless
you use kqueue.
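
With kqueue the close is reported out-of-band, which is why it keeps
working despite pending data; roughly (again a simplified sketch,
assuming an already registered read filter):

#include <sys/types.h>
#include <sys/event.h>

/* kqueue sets EV_EOF on the read filter as soon as the peer has
 * shut down its side, even if unread data (e.g. SSL close_notify
 * records) is still queued on the socket. */
static int
peer_closed(const struct kevent *kev)
{
    return kev->filter == EVFILT_READ && (kev->flags & EV_EOF);
}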

Further reading:

http://mailman.nginx.org/pipermail/nginx/2011-June/027672.html
http://mailman.nginx.org/pipermail/nginx/2011-November/030630.html


Maxim D.
http://nginx.org/en/donation.html

In the case of https, in many (most) cases there is pending data,
due to the various SSL packets sent during connection close. This
means connection close detection with https doesn’t work unless
you use kqueue.

Further reading:

http://mailman.nginx.org/pipermail/nginx/2011-June/027672.html
http://mailman.nginx.org/pipermail/nginx/2011-November/030630.html

These reports appear to relate to SSL upstream connections (both refer
to ngx_http_upstream_check_broken_connection). I’m talking about an SSL
client connection, with a plain http upstream connection.

When an https client drops its connection, the upstream http proxy
connection is not dropped. If nginx can’t detect an https client
disconnect properly, that must mean it’s leaking connection information
internally, doesn’t it?

Rob

Hello!

On Sat, Mar 16, 2013 at 09:32:27AM +1100, Robert Mueller wrote:

These reports appear to relate to SSL upstream connections (both refer
to ngx_http_upstream_check_broken_connection). I’m talking about an SSL
client connection, with a plain http upstream connection.

Both are about client connections. The
ngx_http_upstream_check_broken_connection() function is there to
check whether the client has disconnected or not.

When an https client drops its connection, the upstream http proxy
connection is not dropped. If nginx can’t detect an https client
disconnect properly, that must mean it’s leaking connection information
internally, doesn’t it?

No. It just can’t tell whether a connection was closed or not, as
there is pending data in the connection, and it can’t read the data
(there may be a pipelined request). Therefore in this case, to be on
the safe side, it assumes the connection isn’t closed and doesn’t try
to abort the upstream request.


Maxim D.
http://nginx.org/en/donation.html

When an https client drops its connection, the upstream http proxy
connection is not dropped. If nginx can’t detect an https client
disconnect properly, that must mean it’s leaking connection information
internally, doesn’t it?

No. It just can’t tell whether a connection was closed or not, as
there is pending data in the connection, and it can’t read the data
(there may be a pipelined request). Therefore in this case, to be on
the safe side, it assumes the connection isn’t closed and doesn’t try
to abort the upstream request.

Oh right, I see now.

So the underlying problem is that the nginx stream layer abstraction
isn’t clean enough to handle low-level OS events and then map them
through the SSL layer to read/write/eof conceptual events as needed.
Instead you need an OS-level “eof” event, which you then assume maps
through the SSL abstraction layer to an SSL stream eof event.

Ok, so I had a look at the kqueue eof handling, and what’s needed for
epoll eof handling, and created a quick patch that seems to work.

Can you check this out and see if it looks right? If so, any chance you
can incorporate it upstream?

http://robm.fastmail.fm/downloads/nginx-epoll-eof.patch

If there’s anything you want changed, let me know and I’ll try and fix
it up.

Rob

Hello!

On Tue, Mar 19, 2013 at 03:45:10PM +1100, Robert Mueller wrote:

[…]

Oh right, I see now.

So the underlying problem is that the nginx stream layer abstraction
isn’t clean enough to handle low-level OS events and then map them
through the SSL layer to read/write/eof conceptual events as needed.
Instead you need an OS-level “eof” event, which you then assume maps
through the SSL abstraction layer to an SSL stream eof event.

Not exactly. The underlying problem is that the BSD sockets API
doesn’t provide a standard means to detect EOF without reading all
pending data, and hence OS-specific extensions have to be used to
reliably detect a pending EOF.
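
For epoll, the extension in question is EPOLLRDHUP; the idea, in a
simplified C sketch (not the patch verbatim):

#include <string.h>
#include <sys/epoll.h>

/* Ask the kernel to report peer shutdown via EPOLLRDHUP
 * (Linux 2.6.17+), which fires even with unread data pending. */
static int
watch_client(int epfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof(struct epoll_event));
    ev.events = EPOLLIN | EPOLLRDHUP;
    ev.data.fd = fd;

    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* In the event loop, EPOLLRDHUP set on a returned event means the
 * client has closed its end, analogous to kqueue's EV_EOF. */
static int
client_closed(const struct epoll_event *ev)
{
    return (ev->events & EPOLLRDHUP) != 0;
}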

Ok, so I had a look at the kqueue eof handling, and what’s needed for
epoll eof handling, and created a quick patch that seems to work.

Can you check this out and see if it looks right? If so, any chance you
can incorporate it upstream?

http://robm.fastmail.fm/downloads/nginx-epoll-eof.patch

If there’s anything you want changed, let me know and I’ll try and fix
it up.

I don’t really like what you did in ngx_http_upstream.c. And
there are more places where ev->pending_eof is used, and probably
at least some of these places should be adapted, too.

Additionally, the epoll_ctl(2) manpage claims EPOLLRDHUP is only
available since Linux 2.6.17, and this suggests it needs some
configure tests / conditional compilation.
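
That is, something like the following kind of guard (a sketch of the
idea only, not the eventual implementation):

#include <sys/epoll.h>

/* Request EPOLLRDHUP only where the build headers define it
 * (Linux 2.6.17+); elsewhere fall back to plain EPOLLIN, leaving
 * close detection to the recv(MSG_PEEK) check. */
static unsigned int
client_event_mask(void)
{
#ifdef EPOLLRDHUP
    return EPOLLIN | EPOLLRDHUP;
#else
    return EPOLLIN;
#endif
}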

Valentin is already working on this, and I believe he’ll be able to
provide a slightly more generic patch.


Maxim D.
http://nginx.org/en/donation.html

Hi everyone,

I have the same problem, and I really need this feature. How is this going?

Maxim D.:
Valentin is already working on this, and I believe he’ll be able to
provide a slightly more generic patch.

Does this mean that in the future we can use epoll to detect the client
connection’s abort?

Best Regards,


Lintao LI

I have the same problem, and I really need this feature. How is this going?

Maxim D.:
Valentin is already working on this, and I believe he’ll be able to
provide a slightly more generic patch.

Does this mean that in the future we can use epoll to detect the client
connection’s abort?

Yes, I haven’t heard in a while what the status of this is. I’m
currently using our existing patch, but would love to drop it and
upgrade when it’s included in nginx core…

Rob

Hello!

On Tue, Jul 09, 2013 at 12:52:24PM +1000, Robert Mueller wrote:

Yes, I haven’t heard in a while what the status of this is. I’m
currently using our existing patch, but would love to drop it and
upgrade when it’s included in nginx core…

As far as I can tell the state is roughly the same (though the patch
in question has improved a bit). Valentin?


Maxim D.
http://nginx.org/en/donation.html

Yes, I haven’t heard in a while what the status of this is. I’m
currently using our existing patch, but would love to drop it and
upgrade when it’s included in nginx core…

As far as I can tell the state is roughly the same (though the patch
in question has improved a bit). Valentin?

Still no update?

Any chance we can include the patch I developed in the short term, until
Valentin’s patch is ready? It definitely solves a real-world issue for
us and, by the sounds of it, it would for some other people as well…

Rob

Valentin is already working on this, and I believe he’ll be able to
provide a slightly more generic patch.

Ok, well I might just use ours for now, but I won’t develop it any
further.

Any idea on a time frame for the more official patch?

Rob

We’re affected by this issue as well. Is there an open ticket where we
can track progress at http://trac.nginx.org/?

Hello!

On Tue, Jul 23, 2013 at 10:06:09AM +1000, Robert Mueller wrote:

Yes, I haven’t heard in a while what the status of this is. I’m
currently using our existing patch, but would love to drop it and
upgrade when it’s included in nginx core…

As far as I can tell the state is roughly the same (though the patch
in question has improved a bit). Valentin?

Still no update?

Valentin worked on this during his vacation, and submitted a patch
series for an internal review. It didn’t pass review though,
and needs more work.

Any chance we can include the patch I developed in the short term, until
Valentin’s patch is ready? It definitely solves a real-world issue for
us and, by the sounds of it, it would for some other people as well…

I don’t think it’s a good idea. Your patch definitely has
interoperability problems and will break epoll for kernels before
2.6.17.


Maxim D.
http://nginx.org/en/donation.html

Ubuntu 12.04 + Nginx 1.4.7

I’m not 100% sure this is the same issue, but the symptoms are the same:
nginx as SSL termination in front of an eventsource backend upstream, and
nginx active connections continuously go up until a restart.

I just upgraded to 1.6.0 to see if it resolves the issue.

Thanks for your response!

Does this also affect a “Connection: close” header coming from the
upstream service being sent as “Connection: keep-alive” to the
requesting client through my nginx SSL reverse proxy?

(I am trying to set up nginx as an SSL proxy; when the upstream service
sends a connection close, I want this to be passed on by nginx to the
client requesting pages in the frontend.)

web browser → 443-nginx-ssl-offload → http-connection-upstream_service.
I am trying to use 1.2.6 with a few patches.

I would truly appreciate your help/advice. I have an upstream_service on
the same host that is sending a Connection: close, and I think nginx is
sending back a Connection: keep-alive to the client web browser. Does
this issue correlate with the fix in 1.5.5?


On Friday 16 May 2014 18:32:32 newnovice wrote:

I would truly appreciate your help/advice. I have an upstream_service on
the same host that is sending a Connection: close, and I think nginx is
sending back a Connection: keep-alive to the client web browser. Does
this issue correlate with the fix in 1.5.5?

No, it’s not related, and there’s nothing to fix. The Connection header
is hop-by-hop per RFC.

wbr, Valentin V. Bartenev

Can I proxy_pass on the way back also, and preserve all headers coming
from the upstream service?


On Tuesday 13 May 2014 00:46:37 Peter B. wrote:

We’re affected by this issue as well. Is there an open ticket where we
can track progress at http://trac.nginx.org/?

What version of nginx and what operating system do you use?

nginx 1.5.5+ is able to reliably detect connection close on
Linux 2.6.17+; see ticket #320 (“nginx should reliably check client
connection close with pending data”) on trac.nginx.org.

wbr, Valentin V. Bartenev

On Saturday 17 May 2014 11:28:33 newnovice wrote:

Can I proxy_pass on the way back also, and preserve all headers coming
from the upstream service?

You can try using a combination of add_header and the $upstream_http_*
variables.

Also, some headers can be preserved by the proxy_pass_header directive.
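
For example, something along these lines (an illustrative sketch only;
X-Some-Header is a made-up header name):

location / {
  proxy_pass http://upstream_service;

  # $upstream_http_* exposes upstream response headers by name;
  # add_header re-emits the value to the client (the header is
  # simply omitted when the variable is empty).
  add_header X-Some-Header $upstream_http_x_some_header;
}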

wbr, Valentin V. Bartenev

I tried with “proxy_pass_header Connection;”, which according to the
documentation should do this: “Permits passing otherwise disabled
header fields from a proxied server to a client.”
(from upstream_server to client)

But this did not pass through the Connection header coming from the
upstream_server to the client of this proxy.

Valentin, can you please elaborate on how you suggest doing this: “You
can try using a combination of add_header and the $upstream_http_*
variables.”?

Are you saying these variables can be used to pass on the upstream
Connection header in the response?

Thanks for your help…



