As the author of SCGI (http://www.zedshaw.com/projects/scgi_rails/), I
can tell you that this is possible, but you have to be careful not to
step on anyone’s toes. As David V. mentioned previously, it’s
pretty pointless under plain CGI. In FastCGI, SCGI, or mod_ruby it’s
viable but tricky.

Ruby threads are implemented using a select IO loop, so they’re
fairly lightweight. You can also use Ruby’s fork to give yourself
N:M threading if you want multiple processors to handle more requests,
or if you want more than select can handle. Generally, the Ruby
threading model is good, but it’s a little weird in places.
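
As a rough illustration of mixing fork with threads, here is a minimal sketch. The method name run_workers and the pipe used to pass results back are assumptions made up for this example, not something SCGI does, and it assumes a platform where fork is available.

```ruby
# Hypothetical helper: start a few processes with fork, each running a few
# Ruby threads, and collect one line of output per thread through a pipe.
def run_workers(process_count, threads_per_process)
  children = (1..process_count).map do |p|
    reader, writer = IO.pipe
    pid = fork do
      reader.close                      # the child only writes
      threads = (1..threads_per_process).map do |t|
        Thread.new { "process #{p} thread #{t}" }
      end
      # Thread#value joins the thread and returns the block's result.
      threads.each { |th| writer.puts(th.value) }
      writer.close
      exit!(0)                          # leave without running at_exit hooks
    end
    writer.close                        # the parent only reads
    [pid, reader]
  end

  children.flat_map do |pid, reader|
    lines = reader.read.split("\n")
    reader.close                        # keep your house clean
    Process.wait(pid)
    lines
  end
end
```

Calling run_workers(2, 3) should hand back six strings, one per thread, grouped by the process that produced them.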
From the SCGI perspective you want to avoid the following:
- Leaving IO objects open after you’re done with your request. They
will eventually get cleaned up (and I’m working on cleaning more),
but it’s a total waste. Be good and keep your house clean.
- Putting stuff in the thread local store. This is generally OK in
your own threads, but again this stuff doesn’t go away until the GC
collects it, which in Ruby can take quite a while. If you seem to get
weird leaks, investigate this.
- Wildly making calls to Rails functions inside the threads. Rails
isn’t very thread safe; in fact, I’d say it’s not thread safe at all.
If you’re running multiple threads that try to hit AR, for example,
then you’ll get tons of DB connections, since AR by default uses
thread local storage for its DB connection pooling. Another example
is how modules get loaded: if a model or controller isn’t loaded
when your thread fires off and you try to use it, Ruby throws a fit.
- Expecting threads to solve problems in your workflow design.
Typically I see people who want to use threads to offload some
processing in the background so that the user can continue using the
web app while their big work is being done. What invariably ends up
happening is the background thread doesn’t run reliably and causes
major havoc with the rest of the application’s processing. This is
especially true if you use fork or exec. A better approach is to
either redesign the workflow to not need this or write a separate DRb
server that runs stand-alone and processes these requests. The DRb
solution is also really powerful since you can then offload the
processing to tons of other machines if you need to.
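
To make the DRb suggestion concrete, here is a minimal sketch of a stand-alone job server. The class JobServer, its submit and result methods, the start_job_server helper, and the upcase call standing in for the "big work" are all hypothetical names invented for this example. Only DRb.start_service, DRb.uri, and DRbObject.new_with_uri come from Ruby's DRb library.

```ruby
require 'drb/drb'
require 'thread'

# Hypothetical job server: the web app submits work and gets an id back
# right away, while a single worker thread does the processing.
class JobServer
  def initialize
    @jobs = Queue.new
    @results = {}
    @lock = Mutex.new
    @next_id = 0
    @worker = Thread.new { work_loop }
  end

  # Called remotely by the web app; returns immediately with a job id.
  def submit(text)
    id = @lock.synchronize { @next_id += 1 }
    @jobs << [id, text]
    id
  end

  # Returns the finished result, or nil if the job is still pending.
  def result(id)
    @lock.synchronize { @results[id] }
  end

  private

  def work_loop
    loop do
      id, text = @jobs.pop              # blocks until a job arrives
      output = text.upcase              # placeholder for the real work
      @lock.synchronize { @results[id] = output }
    end
  end
end

# Hypothetical helper: port 0 asks the OS for any free port, and the
# return value is the URI that clients should connect to.
def start_job_server(uri = 'druby://127.0.0.1:0')
  DRb.start_service(uri, JobServer.new)
  DRb.uri
end

# A stand-alone server script would then block with:
#   start_job_server('druby://127.0.0.1:9000')   # port chosen arbitrarily
#   DRb.thread.join
```

On the Rails side, you would connect with DRbObject.new_with_uri(uri), call submit during the request, and poll result(id) from a later request. That keeps the background work out of the web process entirely, and because clients only need the URI, the server can live on another machine.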
Anyway, good luck on it. Feel free to hit me up if you have thread
Zed A. Shaw