Whenever I've tried using rack-timeout with JRuby I've run into hard-to-diagnose problems. I'm in the process of compiling my observations and theories, but first I just wanted to ask if anyone out there has been using rack-timeout with JRuby with good or bad results? Thanks! John
on 2015-01-21 18:17
on 2015-01-21 18:31
(replies inline)

On Wed, 21 Jan 2015, John Joseph Bachir wrote:

> Whenever I've tried using rack-timeout with JRuby I've run into
> hard-to-diagnose problems.
>
> I'm in the process of compiling my observations and theories, but first I
> just wanted to ask if anyone out there has been using rack-timeout with
> JRuby with good or bad results?

We've been using it at Lookout for a couple of years now without noticing any issues. However, we rarely have misbehaving apps or downstream services that would cause us to invoke the rack-timeout behavior.
on 2015-01-22 19:13
On 21-01-2015 15:16, John Joseph Bachir wrote:

> Whenever I've tried using rack-timeout with JRuby I've run into
> hard-to-diagnose problems.
>
> I'm in the process of compiling my observations and theories, but
> first I just wanted to ask if anyone out there has been using
> rack-timeout with JRuby with good or bad results?
>
> Thanks!
> John

I once experienced an issue where Puma would stop responding to new requests. The Java tools seemed to indicate the application was working normally; it simply wouldn't accept new requests for some reason.

It is a side application that generates XLS files, mostly through Apache POI, while the main application runs on MRI. I guess exporting to Excel is not something our users do often, so I was curious why, every few months, our monitoring service would tell us that the application had stopped responding and we had to restart it. There was no CPU usage for that application and I couldn't detect any active network connection to it either.

Since I had no clue what was going on, I decided to give rack-timeout a try. The problem didn't go away, and I was still clueless about why it was happening.

After a lot of investigation I could finally reproduce the problem and start debugging it. It happened when a client requested an XLS file but disconnected before receiving the full output. That connection was then never returned to the Puma connection pool. The cause turned out to be related to ActionController::Live, which we use in our Rails application to stream the XLS. The implementation in Rails at that time would spawn a new thread and return immediately, so the rack-timeout middleware would only monitor the main request thread and not the spawned one.
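The thread hand-off described above can be sketched in plain Ruby, without Rails, Puma, or rack-timeout itself. `SketchTimeout` and `SketchStreamingApp` are hypothetical names made up for this illustration, and the wrapper uses the standard library's `Timeout.timeout` as a stand-in; it is not meant to mirror how rack-timeout or ActionController::Live are implemented internally, only the general shape of the problem.

```ruby
require "timeout"

# Hypothetical stand-in for a timeout middleware. It only guards the
# thread that executes app.call (an assumption made for this sketch).
class SketchTimeout
  def initialize(app, seconds)
    @app = app
    @seconds = seconds
  end

  def call(env)
    Timeout.timeout(@seconds) { @app.call(env) }
  end
end

# Hypothetical stand-in for a streaming action: the slow work is handed
# to a new thread and call returns immediately.
class SketchStreamingApp
  attr_reader :worker

  def call(_env)
    @worker = Thread.new do
      sleep 0.5 # runs far longer than the 0.1s limit used below
      :finished
    end
    [200, { "content-type" => "text/plain" }, []]
  end
end

# A blocking app does its slow work in the request thread, so the
# wrapper can interrupt it.
blocking = ->(_env) { sleep 0.5; [200, {}, []] }
timed_out = begin
  SketchTimeout.new(blocking, 0.1).call({})
  false
rescue Timeout::Error
  true
end

# The streaming app returns within the limit; its worker is never watched.
app = SketchStreamingApp.new
status, _headers, _body = SketchTimeout.new(app, 0.1).call({})
worker_alive_after_return = app.worker.alive?
app.worker.join

puts "blocking app timed out: #{timed_out}"
puts "streaming status: #{status}, worker outlived the call: #{worker_alive_after_return}"
```

In this sketch the blocking app is interrupted after 0.1 seconds, while the streaming app's `call` returns within the limit and its worker thread keeps running, unobserved, for the full 0.5 seconds.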
Here are some references in case this might be happening to you:

https://github.com/rails/rails/issues/16878 (explains the Rails bug, which seems to be fixed in 4.2.0)
https://github.com/puma/puma/issues/576 (explains how to reproduce it on Puma)

The Puma issue is still open because it's tricky: Rack doesn't provide a good enough API for streaming, so it's hard to detect such a situation from a Rack server implementation's perspective.

If your application serves streamed responses, you might want to dig into those issues and see whether they help you solve your real problem.