Threads vs. Processes and Rails

Hey all,

I need to launch a rather lengthy process from a small Rails
application. It’s the sort of thing (sending out an email newsletter)
that would need to be launched from a browser window, but the browser
doesn’t need to stick around to see it through.

I played around with Thread.new, but I think, based on limited testing,
that the threads need to complete before the view gets rendered. So now
I’m playing with the system() method. I want to maintain the
ActiveRecord goodness of the Rails app for building the list of
recipients out of our database.

So, does anyone have any thoughts on the best practice for this sort of
thing (leaving aside whether or not sending email newsletters is a good
idea anyway - not my choice)?

Thanks in advance,

-Ron
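
One possible way to get the fire-and-forget behaviour asked about above: system() waits for the command to finish, whereas Process.spawn (Ruby 1.9 and later) returns as soon as the child has started. This is only a minimal sketch; launch_detached is a hypothetical helper name, not part of Ruby or Rails.

```ruby
# Start a command in a child process and return without waiting for it.
# launch_detached is a hypothetical helper name used for this sketch.
def launch_detached(*command)
  # Send the child's output to the null device so it does not hold on
  # to the web server's stdout/stderr.
  pid = Process.spawn(*command, out: File::NULL, err: File::NULL)

  # Process.detach starts a thread that reaps the child when it exits,
  # so a finished child does not linger as a zombie. The thread is
  # returned in case the caller wants to check on the child later.
  Process.detach(pid)
end
```

From a controller action, the command could be Rails' own runner script (script/runner in Rails of this era) followed by a line of Ruby, which would keep the ActiveRecord models available in the child process. The exact command is an assumption about the app, not something taken from this thread.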

I’d consider making an entry in a table during the request and having a
second, asynchronous task sweep the table to send out the newsletters.

This is the approach we use for all our long-running requests, such as
generating a personalized version of Getting Real when a purchase has
been made.

David Heinemeier H.
http://www.loudthinking.com – Broadcasting Brain
http://www.basecamphq.com – Online project management
http://www.backpackit.com – Personal information manager
http://www.rubyonrails.com – Web-application framework
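
The queue-table approach described above can be sketched roughly as follows. To stay self-contained, this uses an in-memory stand-in for the table; in a Rails app it would be an ActiveRecord model. QueuedJob, JobTable, sweep and the field names are all hypothetical and not taken from anyone's actual code.

```ruby
# In-memory stand-in for a database table of pending jobs. QueuedJob and
# its fields are hypothetical names used only for this sketch.
QueuedJob = Struct.new(:id, :kind, :payload, :status)

class JobTable
  def initialize
    @rows = []
    @next_id = 1
  end

  # Called during the web request: record the work and return at once.
  def enqueue(kind, payload)
    job = QueuedJob.new(@next_id, kind, payload, :pending)
    @next_id += 1
    @rows << job
    job
  end

  def pending
    @rows.select { |job| job.status == :pending }
  end
end

# Called by the separate, asynchronous task (for example from cron):
# hand every pending row to the block and mark it done or failed.
def sweep(table)
  table.pending.each do |job|
    begin
      yield job
      job.status = :done
    rescue StandardError
      job.status = :failed   # leave the row behind for inspection
    end
  end
end

table = JobTable.new
table.enqueue(:newsletter, { subject: "May issue" })
sweep(table) { |job| puts "sending #{job.payload[:subject]}" }
```

The request only pays for one insert, and a failed job stays in the table with its status instead of taking a web process down with it.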

On May 9, 2006, at 1:38 PM, Sean B. wrote:

The users don’t need to work with the generated child data straight
away so it’s OK for the application to chug away generating these. If
it didn’t go against my personal belief system I’d advocate a database
trigger.

Any thoughts on this?

Hey Sean-

I have been working on this same problem for a few apps I am building.
I have come up with a little thing I call BackgrounDRb. It’s a small
DRb server with start and stop scripts that you can offload
long-running tasks to from your Rails app. It even has support for
progress bars via Ajax if you wanted to show something like that. I
haven’t released it yet, but if you want to test it out, give me a
shout off list and I will send it to you.

The current way this works is you have a backgroundrb folder in your
RAILS_ROOT/scripts dir. In this dir there are start and stop scripts
for starting and stopping your DRb server. The main DRb object is a
front proxy type of class. It basically holds a hash of keys pointing
to long-running tasks, like { job_key => running_worker_task }. Then
you need a folder in RAILS_ROOT/lib/workers. Any classes you drop in
this dir are available to your DRb server for running long tasks
without tying up Rails at all.

Here are a few screencasts of the proof of concept:

http://brainspl.at/drb_progress.mov
http://brainspl.at/drb_ajax_tail.mov

Here is an example worker class I used to make the progressbar demo:

require 'drb'

class Worker
  # Hand out a remote reference over DRb instead of a marshalled copy,
  # so callers always talk to the live object.
  include DRb::DRbUndumped

  attr_accessor :text

  def initialize(options)
    @progress = 0
    @text = options[:text]
    start_working
  end

  def start_working
    # Fake work loop here to demo progress bar.
    Thread.new do
      while @progress < 100
        sleep rand / 2
        a = [1, 3, 5, 7]
        # rand(a.length) picks any index, so every step size can occur.
        @progress += a[rand(a.length)]
        @progress = 100 if @progress > 100
      end
    end
  end

  def progress
    puts "Rails is fetching progress: #{@progress}"
    Integer(@progress)
  end
end

Then you can use this to make a progress bar from Rails like this.

You first set up the Proxy object in your environment.rb file so it’s
available everywhere in Rails. Then you can start your worker class
with this syntax:

   session[:job_key] = Proxy.new_worker(:worker,
     :text => 'this text has been sent to the worker.')

What that does is take the :worker symbol and look for a class in
lib/workers/ called Worker. It will then instantiate it with the :text
hash as args, like this:

   Worker.new(:text => 'this text has been sent to the worker.')
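
The symbol-to-class lookup and the hash of job keys described above might look something like the sketch below. This is a guess at the shape, not the actual BackgrounDRb code: JobProxy, class_for, delete_worker and the random key format are all hypothetical, and the methods are instance methods here, whereas the Proxy object in this thread is called directly.

```ruby
require 'securerandom'

# Hypothetical front object mapping job keys to running workers.
class JobProxy
  # namespace is where worker classes are looked up (top level by default).
  def initialize(namespace = Object)
    @namespace = namespace
    @jobs = {}   # job_key => running worker object
  end

  # :worker -> Worker, :mail_sender -> MailSender
  def class_for(name)
    class_name = name.to_s.split('_').map(&:capitalize).join
    @namespace.const_get(class_name)
  end

  # Instantiate the worker with the given options and return its key.
  def new_worker(name, options = {})
    key = SecureRandom.hex(8)
    @jobs[key] = class_for(name).new(options)
    key
  end

  # Raises KeyError for an unknown key rather than returning nil.
  def get_worker(key)
    @jobs.fetch(key)
  end

  # Forget a finished worker so the hash does not grow forever.
  def delete_worker(key)
    @jobs.delete(key)
  end
end
```

To share one such object between several processes, an instance could be passed to DRb.start_service as the front object, with each Rails process connecting to it by URI; that wiring is left out here.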

So you can drop any worker class you have made in lib/workers and it
will be available to you through the DRb Proxy class. So to come back
later and retrieve the progress of our Worker class we use this syntax:

   progress = Proxy.get_worker(session[:job_key]).progress
   render :update do |page|
     page.replace_html('progress',
                       "<h3>#{progress}% done</h3>" +
                       "<img src='/images/progress.gif' " +
                       "width='#{progress * 2}' height='15' />")
     if progress == 100
       page.redirect_to :action => 'done'
     end
   end

When we call Proxy.get_worker(session[:job_key]) we get a reference to
the remote worker object we instantiated previously. Then it’s just as
easy as calling the progress method of that object.

This technique has many uses. You can use it as an application-wide
storage or context. Or you could use it to cache compute-heavy items
for use in other pages without recomputing. Or you can make
long-running background tasks with or without progress bars.

The cool thing is that since it’s one DRb server, it can be used from
as many boxes or FCGI processes as you may have. So you don’t need to
worry if you request your object back from a different FCGI process
than the one that created it.

I am almost done with it so I will release it soon. Maybe it can do
what you need.

Cheers-
-Ezra

When you say ‘asynchronous task’ do you mean a controller action called
by AJAX from the browser or something server side? In Java we would
have a timer class to monitor the queue table … is there anything
similar in Ruby on Rails?

We are about to kick start a project in which we might conceivably need
to generate 8,500 rows of data from one user-driven event. I like the
idea of limiting this to one process that monitors a queued_actions
table for work to do rather than bombarding all our SCGI processes with
these long-running jobs. The users don’t need to work with the
generated child data straight away so it’s OK for the application to
chug away generating these. If it didn’t go against my personal belief
system I’d advocate a database trigger.

Any thoughts on this?
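
On the timer question, one plain-Ruby way to get similar behaviour is a thread that sleeps between polls. A sketch only: PollingTimer is a hypothetical name, and it is meant to run in a single standalone process next to the app, not inside a request.

```ruby
# Run the given block every `interval` seconds in a background thread
# until stop is called. PollingTimer is a hypothetical name.
class PollingTimer
  def initialize(interval, &task)
    @interval = interval
    @task = task
    @running = false
  end

  def start
    @running = true
    @thread = Thread.new do
      while @running
        begin
          @task.call
        rescue StandardError => e
          warn "poll failed: #{e.message}"   # keep polling after an error
        end
        sleep @interval
      end
    end
    self
  end

  # Ask the loop to finish and wait for the current pass to end.
  def stop
    @running = false
    @thread.join if @thread
  end
end
```

The block passed to PollingTimer.new would be whatever checks the queued_actions table for new rows; keeping it in one process means only that process ever does the heavy work.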