I've read that "threading is considered harmful" for Ruby web apps.
Well, I'm writing a Sinatra app which will build a page based on the
responses of several servers (Net::HTTP.get). I want to do these .gets
in parallel, as doing them synchronously would obviously mean the users
would wait for a long time.
Would it be "considered harmful" to do:
resp_a, resp_b, resp_c = nil
thread_a = Thread.new { resp_a = Net::HTTP.get site_a }
thread_b = Thread.new { resp_b = Net::HTTP.get site_b }
thread_c = Thread.new { resp_c = Net::HTTP.get site_c }
thread_a.join
thread_b.join
thread_c.join
Is there any possible harm that could come from this? Can threading
interfere with Rack in some way? I haven't done much previous
development of threaded apps, so I would appreciate any tips.
on 2010-03-08 23:50
on 2010-03-09 00:14
I've used threads to fetch several web pages at once for a long time, and I've never had any problem, so long as you catch the errors raised when a specific web page can't be obtained.
Tom Reilly
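To illustrate the "catch errors" advice, here is a minimal sketch, not anyone's actual code from this thread. The helper name `fetch_all` and the injectable `fetcher` argument are hypothetical, added so the pattern can be tried without touching the network. Each thread records either the response body or the exception it hit, so one unreachable site does not take the others down.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: fetch every URL in its own thread.
# `fetcher` is an assumption for illustration; by default it calls Net::HTTP.get.
def fetch_all(urls, fetcher = ->(url) { Net::HTTP.get(URI(url)) })
  results = {}
  lock = Mutex.new # guards the shared hash across threads

  threads = urls.map do |url|
    Thread.new do
      # Store the body on success, or the exception object on failure.
      outcome = begin
        fetcher.call(url)
      rescue StandardError => e
        e
      end
      lock.synchronize { results[url] = outcome }
    end
  end

  threads.each(&:join) # wait for every fetch to finish
  results
end
```

The caller can then check each entry with `is_a?(StandardError)` and render a fallback for the sites that failed.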
on 2010-03-09 00:53
On Mar 8, 2010, at 3:13 PM, Tom Reilly wrote:
>> Would it be "considered harmful" to do:
>> interfere with Rack in some way? I haven't done much previous
>> development of threaded apps, so I would appreciate any tips.

IMHO you would be better served using the 'thin' Rack-compatible web server and using its async mode along with EM::HTTP::Request. This way you could use an event-driven style to have zero threads and basically pause the client's request connection while you make async calls to all the other web services. Once they all return, you fire the async callback for thin to resume the client's connection and return the results.

Doing it this way will require a bit more mental twisting to get all the async stuff correct, but it will be far more scalable and will serve you much better in the end.

Cheers-
Ezra Zygmuntowicz
ez@engineyard.com
on 2010-03-09 01:03
On Mon, Mar 8, 2010 at 10:50 PM, Nick Brown <nick@nick-brown.com> wrote:
> I've read that "threading is considered harmful" for Ruby web apps.
> Well, I'm writing a Sinatra app which will build a page based on the
> responses of several servers (Net::HTTP.get). I want to do these .gets
> in parallel, as doing them synchronously would obviously mean the users
> would wait for a long time.

There are some historical reasons behind threading == harmful (defaults for Rails, GIL & native gems, and a general lack of robustness in older Ruby thread implementations).

> Is there any possible harm that could come from this? Can threading
> interfere with Rack in some way? I haven't done much previous
> development of threaded apps, so I would appreciate any tips.

I believe Sinatra/Rack is thread safe, so you should be fine on that count.

What's more important is that this model isn't exactly a good architecture. You are spawning a lot of threads per request, and you have no real external oversight into how they are working. You can't send back your response until you have received all your outbound responses, and you are particularly vulnerable to timeouts: the client browser can time out your request while you are still waiting on responses to outbound connections.

You see a lot of solutions that use process-level concurrency (BackgroundRb, DelayedJob etc.), but most web solutions that aggregate content from multiple sites (i.e. mashups) do it all in the browser, with some cross-site scripting & JavaScript.

Technically I don't see too many issues with the multi-threaded approach you propose for smaller requests, but you will want to set an aggressive timeout on the outbound requests.
on 2010-03-09 05:34
Tom Reilly:
> I've never had any problem...

Awesome! Good to hear :-)

Ezra Zygmuntowicz:
> you would be better served using the 'thin' rack comat web
> server and using its async mode along with EM::HTTP::Request.

Thanks. I've been using Apache+Passenger because that's what I know, but I will investigate Thin if it is indeed more scalable. Are you referring to RAM usage when you say it's more scalable?

Richard Conroy:
> javascript ... you will want to set an aggressive timeout

This must happen server-side. But you're right about the timeouts. And some searching has revealed Timeout::timeout() to me! It would appear that:

resp_a = nil
thread_a = Thread.new{ Timeout::timeout(4){ resp_a = Net::HTTP.get site_a }}
...
thread_a.join

will do what I need, so long as I catch exceptions, too.

And again, I'm still open to other suggestions if anyone else has any!
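A rough sketch of the timeout-plus-exception-handling idea above, under stated assumptions: the helper `fetch_in_thread` and its `fetcher` parameter are hypothetical names introduced only so the example can run without real servers, and the 4-second default simply mirrors the snippet in the post. A fetch that exceeds the limit yields nil instead of an exception.

```ruby
require 'net/http'
require 'uri'
require 'timeout'

# Hypothetical helper: start one fetch in its own thread with a time limit.
# `fetcher` is an assumption for illustration; by default it calls Net::HTTP.get.
def fetch_in_thread(url, seconds = 4, fetcher = ->(u) { Net::HTTP.get(URI(u)) })
  Thread.new do
    begin
      # Raises Timeout::Error if the block runs longer than `seconds`.
      Timeout.timeout(seconds) { fetcher.call(url) }
    rescue Timeout::Error
      nil # treat a slow site as "no response"
    end
  end
end
```

Start all the threads first, then call `.value` on each one to wait for it and read its result. Only `Timeout::Error` is rescued here, so connection failures and other errors still need their own handling. Net::HTTP also has its own `open_timeout` and `read_timeout` settings, which may be worth a look as an alternative to wrapping the call.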
on 2010-03-09 10:46
Or this slightly shorter version:
thread_a = Thread.new { Net::HTTP.get site_a }
thread_b = Thread.new { Net::HTTP.get site_b }
thread_c = Thread.new { Net::HTTP.get site_c }
val1 = thread_a.value
val2 = thread_b.value
val3 = thread_c.value
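One thing to keep in mind with the shorter version: Thread#value waits for the thread and returns the block's result, but if the block raised, the exception is re-raised in the caller at that point. A small sketch of guarding against that follows; `value_or_default` is a hypothetical helper name, not part of Ruby.

```ruby
# Hypothetical helper: read a thread's result, falling back to `default`
# when the thread terminated with an exception (Thread#value re-raises it).
def value_or_default(thread, default = nil)
  thread.value
rescue StandardError
  default
end
```

With this, `val1 = value_or_default(thread_a, "")` would give an empty string for a site that failed, while the other values are unaffected.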