Hi I'm trying to setup nginx to proxy a server sent events connection (http://dev.w3.org/html5/eventsource/) to a backend server. The approach is the browser connects to a particular path, which then checks the cookies to see the connection is authorised, and then returns an X-Accel-Redirect header to connect up to a separate internal path to do the proxying. That separate internal path is configured like this: location ^~ /pushevents/ { internal; # Immediately send backend responses back to client proxy_buffering off; # Disable keepalive to browser keepalive_timeout 0; # It's a long lived backend connection with potentially a long time between # push events, make sure proxy doesn't timeout proxy_read_timeout 7200; proxy_pass http://pushbackend; } The backend service listening on pushbackend then holds the connection open, and returns data down the connection whenever a push event occurs, which nginx proxies straight back to the browser because buffering is turned off. One important thing here is that if the client drops the connection to nginx, nginx should also drop the connection to the backend server. This is done by making sure we do not change proxy_ignore_client_abort, which defaults to off. Testing this out, it all works fine for http browser connections, but appears to be broken for https browser connections. When using http with my test client, I see this in the nginx log when the client disconnects: 2013/03/14 23:27:27 [info] 27717#0: *1 client prematurely closed connection, so upstream connection is closed too while reading upstream, client: 10.50.1.14, server: www.*, request: "GET /events/ HTTP/1.0", upstream: "http://127.0.0.4:80/.../", host: "..." And when I check netstat, I see that the connection from nginx -> pushbackend has been dropped as well. However for https connections, this is not what happens. Instead, it appears nginx fails to detect that the client has closed the connection, and leaves the nginx -> pushbackend connection open (confirmed with netstat). And because we set proxy_read_timeout to 2 hours, it takes 2 hours for that connection to be closed! That's bad. This is nginx 1.2.7 compiled with openssl 1.0.1e. With an http connection with debugging enabled, when I close the client connection I see this in the log: 2013/03/15 00:11:33 [debug] 28285#0: epoll: fd:3 ev:0005 d:00007F6B414AA150 2013/03/15 00:11:33 [debug] 28285#0: *1 http run request: "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http upstream check client, write event:0, "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http upstream recv(): 0 (11: Resource temporarily unavailable) 2013/03/15 00:11:33 [info] 28285#0: *1 client prematurely closed connection, so upstream connection is closed too while reading upstream, client: 10.50.1.14, server: insecure.*, request: "GET /events/ HTTP/1.0", upstream: "http://127.0.0.4/.../", host: "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 finalize http upstream request: 499 2013/03/15 00:11:33 [debug] 28285#0: *1 finalize http proxy request 2013/03/15 00:11:33 [debug] 28285#0: *1 free rr peer 1 0 2013/03/15 00:11:33 [debug] 28285#0: *1 close http upstream connection: 32 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001847BC0, unused: 48 2013/03/15 00:11:33 [debug] 28285#0: *1 event timer del: 32: 1363327885296 2013/03/15 00:11:33 [debug] 28285#0: *1 reusable connection: 0 2013/03/15 00:11:33 [debug] 28285#0: *1 http output filter "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http copy filter: "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http postpone filter "..." 00007FFF614BAFE0 2013/03/15 00:11:33 [debug] 28285#0: *1 http copy filter: -1 "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http finalize request: -1, "..." a:1, c:1 2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate request count:1 2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate cleanup count:1 blk:0 2013/03/15 00:11:33 [debug] 28285#0: *1 http posted request: "..." 2013/03/15 00:11:33 [debug] 28285#0: *1 http terminate handler count:1 2013/03/15 00:11:33 [debug] 28285#0: *1 http request count:1 blk:0 2013/03/15 00:11:33 [debug] 28285#0: *1 http close request 2013/03/15 00:11:33 [debug] 28285#0: *1 http log handler 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001952800 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001895EA0 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001846BB0, unused: 0 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001894E90, unused: 3 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001896EB0, unused: 1771 2013/03/15 00:11:33 [debug] 28285#0: *1 close http connection: 3 2013/03/15 00:11:33 [debug] 28285#0: *1 reusable connection: 0 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 00000000018467A0 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 00000000018461A0 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 000000000183B210, unused: 8 2013/03/15 00:11:33 [debug] 28285#0: *1 free: 0000000001846690, unused: 128 2013/03/15 00:11:33 [debug] 28285#0: timer delta: 3247 2013/03/15 00:11:33 [debug] 28285#0: posted events 0000000000000000 2013/03/15 00:11:33 [debug] 28285#0: worker cycle 2013/03/15 00:11:33 [debug] 28285#0: epoll timer: -1 When doing the same thing with an https connection (openssl s_client) and closing the connection I see this in the log: 2013/03/15 00:10:23 [debug] 28237#0: epoll: fd:3 ev:0005 d:00007F6ECDCAE150 2013/03/15 00:10:23 [debug] 28237#0: *1 http run request: "..." 2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream check client, write event:0, "..." 2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream recv(): 1 (11: Resource temporarily unavailable) 2013/03/15 00:10:23 [debug] 28237#0: *1 http run request: "..." 2013/03/15 00:10:23 [debug] 28237#0: *1 http upstream process non buffered downstream 2013/03/15 00:10:23 [debug] 28237#0: *1 event timer del: 32: 1363327814331 2013/03/15 00:10:23 [debug] 28237#0: *1 event timer add: 32: 7200000:1363327823956 2013/03/15 00:10:23 [debug] 28237#0: timer delta: 4626 2013/03/15 00:10:23 [debug] 28237#0: posted events 0000000000000000 2013/03/15 00:10:23 [debug] 28237#0: worker cycle 2013/03/15 00:10:23 [debug] 28237#0: epoll timer: 7200000 Is this an nginx bug? -- Rob Mueller robm@fastmail.fm
on 2013-03-15 05:23
on 2013-03-15 09:11
Hello! On Fri, Mar 15, 2013 at 03:22:31PM +1100, Robert Mueller wrote: [...] > > However for https connections, this is not what happens. Instead, it > appears nginx fails to detect that the client has closed the connection, > and leaves the nginx -> pushbackend connection open (confirmed with > netstat). And because we set proxy_read_timeout to 2 hours, it takes 2 > hours for that connection to be closed! That's bad. You shouldn't rely on connection close being detected by nginx. It's generally useful optimization, but it's not something which is guaranteed to work. It is generally not possible to check if the connection was closed if there a pending data in it. One have to read all pending data before connection close can be detected, but it doesn't work as a generic detection mechanism as one have to buffer read data somewhere. As of now, nginx uses: 1) A EV_EOF flag as reported by kqueue. This only works if you use kqueue, i.e. on FreeBSD and friends. 2) The recv(MSG_PEEK) call to test a case when connection was closed. This works on all platforms, but only if there are no pending data. In case of https, in many (most) cases there are pending data - due to various SSL packets send during connection close. This means connection close detection with https doesn't work unless you use kqueue. Further reading: http://mailman.nginx.org/pipermail/nginx/2011-June... http://mailman.nginx.org/pipermail/nginx/2011-Nove... -- Maxim Dounin http://nginx.org/en/donation.html
on 2013-03-15 23:33
> In case of https, in many (most) cases there are pending data - > due to various SSL packets send during connection close. This > means connection close detection with https doesn't work unless > you use kqueue. > > Further reading: > > http://mailman.nginx.org/pipermail/nginx/2011-June... > http://mailman.nginx.org/pipermail/nginx/2011-Nove... These reports appear to relate to SSL upstream connections (both refer to ngx_http_upstream_check_broken_connection). I'm talking about an SSL client connection, with a plain http upstream connection. When an https client drops it's connection, the upstream http proxy connection is not dropped. If nginx can't detect an https client disconnect properly, that must mean it's leaking connection information internally doesn't it? Rob
on 2013-03-17 00:49
Hello! On Sat, Mar 16, 2013 at 09:32:27AM +1100, Robert Mueller wrote: > > These reports appear to relate to SSL upstream connections (both refer > to ngx_http_upstream_check_broken_connection). I'm talking about an SSL > client connection, with a plain http upstream connection. Both are about client connections. The ngx_http_upstream_check_broken_connection() function is here to check if client is disconnected or not. > When an https client drops it's connection, the upstream http proxy > connection is not dropped. If nginx can't detect an https client > disconnect properly, that must mean it's leaking connection information > internally doesn't it? No. It just can't say if a connection was closed or not as there are pending data in the connection, and it can't read data (there may be a pipelined request). Therefore in this case, being on the safe side, it assumes the connection isn't closed and doesn't try to abort upstream request. -- Maxim Dounin http://nginx.org/en/donation.html
on 2013-03-19 05:45
> > When an https client drops it's connection, the upstream http proxy > > connection is not dropped. If nginx can't detect an https client > > disconnect properly, that must mean it's leaking connection information > > internally doesn't it? > > No. It just can't say if a connection was closed or not as there > are pending data in the connection, and it can't read data (there > may be a pipelined request). Therefore in this case, being on the > safe side, it assumes the connection isn't closed and doesn't try > to abort upstream request. Oh right I see now. So the underlying problem is that the nginx stream layer abstraction isn't clean enough to handle low level OS events and then map them through the SSL layer to read/write/eof conceptual events as needed. Instead you need an OS level "eof" event, which you then assume maps through the SSL abstraction layer to a SSL stream eof event. Ok, so I had a look at the kqueue eof handling, and what's needed for epoll eof handling, and created a quick patch that seems to work. Can you check this out, and see if it looks right. If so, any chance you can incorporate it upstream? http://robm.fastmail.fm/downloads/nginx-epoll-eof.patch If there's anything you want changed, let me know and I'll try and fix it up. Rob
on 2013-03-19 15:55
Hello! On Tue, Mar 19, 2013 at 03:45:10PM +1100, Robert Mueller wrote: > > to abort upstream request. > > Oh right I see now. > > So the underlying problem is that the nginx stream layer abstraction > isn't clean enough to handle low level OS events and then map them > through the SSL layer to read/write/eof conceptual events as needed. > Instead you need an OS level "eof" event, which you then assume maps > through the SSL abstraction layer to a SSL stream eof event. Not exactly. The underlying problem is that BSD sockets API doesn't provide standard means to detect EOF without reading all pending data, and hence OS-specific extensions have to be used to reliably detect pending EOFs. > Ok, so I had a look at the kqueue eof handling, and what's needed for > epoll eof handling, and created a quick patch that seems to work. > > Can you check this out, and see if it looks right. If so, any chance you > can incorporate it upstream? > > http://robm.fastmail.fm/downloads/nginx-epoll-eof.patch > > If there's anything you want changed, let me know and I'll try and fix > it up. I don't really like what you did in ngx_http_upstream.c. And there are more places where ev->pending_eof is used, and probably at least some of these places should be adapted, too. Additionally, poll_ctl(2) manpage claims the EPOLLRDHUP only available since Linux 2.6.17, and this suggests it needs some configure tests/conditional compilation. Valentin is already worked on this, and I believe he'll be able to provide a little bit more generic patch. -- Maxim Dounin http://nginx.org/en/donation.html
on 2013-03-20 02:08
> Valentin is already worked on this, and I believe he'll be able to > provide a little bit more generic patch. Ok, well I might just use ours for now, but won't develop it any further. Any idea on a time frame for this more official patch? Rob
Please log in before posting. Registration is free and takes only a minute.
Existing account
(Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
Log in with Google account | Log in with Yahoo account
No account? Register here.