Ruby Thread in CGI application?


#1

Hello,

I heard that Ruby’s Thread is not a native system thread but is
implemented in the interpreter itself. Is that true? Does this mean
that it is lighter than native threads? Does anyone know if calling
Thread.start on every web request in my CGI application would severely
reduce my web server’s scalability?

thanks
konstantin


#2

ako… wrote:

Err. CGI already starts a new process to handle each request, so
spinning off a worker thread is superfluous.

David V.


#3

Right, I meant FastCGI. A separate thread might be needed to execute a
code fragment at a safety level that is different from the safety
level of the rest of the CGI script.

konstantin


#4

Hi konstantin,

As the author of SCGI (http://www.zedshaw.com/projects/scgi_rails/) I
can tell you that this is possible, but you have to be careful not to
step on anyone’s toes. As David V. mentioned previously it’s
pretty pointless in a plain CGI. In FastCGI, SCGI, or mod_ruby it’s
viable but tricky.

Ruby threads are implemented using a select IO loop, so they’re
fairly lightweight. You can also use Ruby’s fork to give yourself
N:M threading if you want multiple processors to handle more requests
or if you want more than select can handle. Generally the Ruby
threading model is good but is a little weird in places.
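
To illustrate the fork approach mentioned above, here is a minimal sketch that runs each job in its own child process and reads the results back over pipes. The helper name run_in_workers is hypothetical, it assumes a Unix-like system where Kernel#fork is available, and it is not meant to show how SCGI itself works.

```ruby
# Hypothetical helper: run each job (a callable) in a forked child process
# and collect the results in the parent through a pipe per child.
def run_in_workers(jobs)
  children = jobs.map do |job|
    reader, writer = IO.pipe
    pid = fork do
      reader.close                           # child only writes
      writer.write(Marshal.dump(job.call))   # send the result to the parent
      writer.close
      exit!(0)                               # skip at_exit handlers in the child
    end
    writer.close                             # parent only reads
    [pid, reader]
  end

  children.map do |pid, reader|
    result = Marshal.load(reader.read)       # read until the child closes its end
    reader.close
    Process.wait(pid)                        # reap the child
    result
  end
end
```

Because each child is a separate process, the operating system can schedule the jobs on different processors. The results must be serializable with Marshal for this sketch to work.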

From the SCGI perspective you want to avoid the following:

  1. Leaving IO objects open after you’re done with your request. It
    will eventually get cleaned out (and I’m working on cleaning more),
    but it’s a total waste. Be good and keep your house clean.
  2. Putting stuff in the thread local store. This is generally OK in
    your own threads, but again this stuff doesn’t go away until the GC
    collects it which in Ruby means quite a while. If you seem to get
    weird leaks then investigate this.
  3. Wildly making calls to Rails functions inside the threads. Rails
    isn’t very thread safe, in fact I’d say it’s not thread safe at all.
    If you’re running multiple threads that try to hit AR for example,
    then you’ll get tons of DB connections since AR by default uses
    thread local storage for its DB connection pooling. Another example
    is how modules get loaded: if a model or controller isn’t loaded
    when your thread fires off and you try to use it, then Ruby throws a fit.
  4. Expecting threads to solve problems in your workflow design.
    Typically I see people who want to use threads to offload some
    processing in the background so that the user can continue using the
    web app while their big work is being done. What invariably ends up
    happening is the background thread doesn’t run reliably and causes
    major havoc with the rest of the application’s processing. This is
    especially true if you use fork or exec. A better approach is to
    either redesign the workflow to not need this or write a separate DRb
    server that runs stand-alone and processes these requests. The DRb
    solution is also really powerful since you can then offload the
    processing to tons of other machines if you need.
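
Points 1 and 2 in the list above can be sketched as follows. This is only an illustration under assumptions: handle_request and the :request_id key are hypothetical names, and a StringIO stands in for whatever IO object the request really uses.

```ruby
require "stringio"

# Hypothetical request handler: whatever happens in the body, the IO object
# is closed and the thread-local entry is cleared before returning.
def handle_request(io, request_id)
  Thread.current[:request_id] = request_id   # per-thread state for this request
  io.read                                    # stand-in for the real request work
ensure
  io.close unless io.closed?                 # point 1: don't leave IO objects open
  Thread.current[:request_id] = nil          # point 2: don't wait for the GC
end

# Usage: run the handler in its own thread and wait for its return value.
worker = Thread.new { handle_request(StringIO.new("GET /"), 1) }
worker.value
```

The ensure clause runs even when the body raises, so the cleanup does not depend on the request succeeding.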

Anyway, good luck on it. Feel free to hit me up if you have thread
problems.

Zed A. Shaw


#5

David V. wrote:

  1. On a multiple CPU system, where algorithms used in processing the
    request might benefit from parallelism across multiple processors.
    Though then again, you probably wouldn’t do CPU-intensive operations
    as these algorithms tend to be in Ruby anyway. As more realistic
    examples, asynchronous database / remote calls come to mind if you
    want to implement your own timeout policy for those, to handle badly
    written third-party libraries gracefully and prevent your
    application from hanging.

Currently you won’t benefit from multiple CPUs, as Ruby doesn’t use
native threads.

– stefan


#6

On Sat, 14 Jan 2006 06:02:13 +0100, Stefan K. wrote:

Currently you won’t benefit from multiple CPUs as Ruby doesn’t use
native threads.

– stefan

Slipped my mind. I was talking in general anyway. And doesn’t the
pthread binding let you use native threads? (I haven’t really done
anything with it myself.)

David V.


#7

David V. wrote:

from hanging up.

David V.

I don’t know enough about Ruby’s pthread support to have an informed
opinion on this question :(

– stefan


#8

2006/1/14, Stefan K.:

David V. wrote:

Slipped my mind. I was talking in general anyway. And doesn’t the
pthread binding let you use native threads? (I haven’t really done
anything with it myself.)

I don’t know enough about Ruby’s pthread support to have an informed
opinion on this question :(

AFAIK there is no support for any native threading in standard Ruby
1.x. The interpreter just isn’t designed for this. For web apps
Ruby’s threading might be sufficient in the general case because there
is a lot of IO, and multiplexing on IO should work fine.
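
A small sketch of why threads can still help when most of the time is spent waiting. It assumes that sleep is a fair stand-in for a blocking IO call, and wait_concurrently is a hypothetical name.

```ruby
# Hypothetical helper: start one thread per simulated IO wait and measure
# how long it takes until all of them have finished.
def wait_concurrently(delays)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  threads = delays.map do |delay|
    Thread.new do
      sleep(delay)   # stand-in for a blocking read or remote call
      delay
    end
  end

  results = threads.map(&:value)   # wait for every thread and collect results
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [results, elapsed]
end

results, elapsed = wait_concurrently([0.2, 0.2, 0.2])
```

The waits overlap, so the elapsed time should be close to the longest single delay rather than the sum of all delays, even though only one thread executes Ruby code at any moment.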

Btw, native thread support is for me one of the key improvements of
Ruby 2.

Kind regards

robert


#9

That said, it might also be questionable just how useful it is to create
request processor threads even in FastCGI / SCGI, because that’s
precisely the purpose of those. You’re very likely better off tweaking
the settings of those to get what you might want done.

Now, a longer rant for the sake of completeness: using worker threads is
more viable, but that is only useful in certain scenarios, for example:

  1. In interactive long-running programs, where you want to maintain a
    responsive user interface while doing background work. This is very
    rarely the case with a web application, because you usually need to
    process each request in finite (and rather short) time.

  2. On a multiple CPU system, where algorithms used in processing the
    request might benefit from parallelism across multiple processors.
    Though then again, you probably wouldn’t do CPU-intensive operations
    as these algorithms tend to be in Ruby anyway. As more realistic
    examples, asynchronous database / remote calls come to mind if you
    want to implement your own timeout policy for those, to handle badly
    written third-party libraries gracefully and prevent your
    application from hanging.
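
The timeout policy from the second scenario could look roughly like this. It is a sketch only: call_with_timeout and the :timed_out marker are hypothetical, and a real policy would need to decide what to do with the abandoned call.

```ruby
# Hypothetical helper: run the block in a worker thread and give up waiting
# after the given number of seconds.
def call_with_timeout(seconds, &block)
  worker = Thread.new(&block)

  if worker.join(seconds)   # join returns nil if the limit expires first
    worker.value            # finished in time: hand back the block's result
  else
    worker.kill             # stop waiting on the slow call
    :timed_out
  end
end

# Usage: a slow third-party call is abandoned after 50 milliseconds.
status = call_with_timeout(0.05) { sleep 1; :done }
```

Killing a thread can leave whatever it was using (sockets, locks, half-written data) in an undefined state, so this only suits calls that are safe to abandon.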

Unless you have due cause to do so, avoid multithreaded programming or
keep it very simple. There are just way too many ways to shoot yourself
in the foot.

David V.