[ANN] BackgrounDRb background task runner and Application Wide Context Store


#1

Friends-

I'm happy to announce the first alpha release of BackgrounDRb. This is
a small framework for managing long-running background tasks that
allows for ajax progress bars and more. It also serves as an
application-wide cache and context store for when you need something
like sessions but shared between users and multiple backend processes
such as FCGI processes or Mongrels. The MiddleMan runs in a DRb
(distributed Ruby) process that is separate from the Rails process, so
all backends have one place to store data or launch jobs from.

You can see the proof of concept screencasts here:


And you can download the proof of concept rails app and run it
yourself from here:

http://opensvn.csie.org/ezra/rails/plugins/backgroundrb/

I'm looking for some folks to play with this and give feedback so I
can improve on it. I appreciate any feedback from folks who try this
out.

Cheers-
-Ezra

Here is the README:

BackgrounDRb is a small framework for divorcing long-running tasks from
the Rails request/response cycle. With HTTP it is usually not a good
idea to keep a request waiting for a response from a long-running
action. BackgrounDRb also allows for status updates that, in
combination with ajax, can render live progress bars in the browser
while the background worker task gets completed. The MiddleMan can also
be used as a cache. You can store rendered templates or
compute-intensive results in the MiddleMan for use later.

The MiddleMan drb server is a front controller or factory/delegate type
of object. It takes instructions from you and instantiates your worker
objects with the args you send from rails. It uses a hash to keep a key
pointing to each running worker. That key is also put into the session
in rails, so rails can find the same worker object that is running its
job on subsequent requests. The MiddleMan front object has a method
that takes a class type as a symbol and instantiates a new instance of
said class type. Then it returns a job_key to the client (rails). Rails
can then call a method on that object through the MiddleMan class.
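
To make the key-to-worker bookkeeping concrete, here is a minimal
sketch of a front object of this kind. It is not the plugin's actual
code: SketchMiddleMan, EchoWorker and the "job_N" key format are made
up for illustration, and :class takes a real class here instead of a
symbol to keep the sketch short.

```ruby
require 'thread'

# Hypothetical stand-in for the MiddleMan front object (not the plugin's code).
class SketchMiddleMan
  def initialize
    @jobs    = {}         # job_key => worker instance
    @mutex   = Mutex.new  # several backends may call in at the same time
    @counter = 0
  end

  # Instantiates the worker class with :args, stores it, returns its key.
  def new_worker(opts = {})
    worker = opts.fetch(:class).new(opts[:args])
    @mutex.synchronize do
      key = opts[:job_key] || generate_key
      @jobs[key] = worker
      key
    end
  end

  # Looks up a running worker by the key handed out earlier.
  def get_worker(key)
    @mutex.synchronize { @jobs[key] }
  end

  # Drops the reference so the worker can be garbage collected.
  def delete_worker(key)
    @mutex.synchronize { @jobs.delete(key) }
  end

  private

  # Assumed key format; only ever called while holding the mutex.
  def generate_key
    @counter += 1
    "job_#{@counter}"
  end
end

# Trivial worker used only to exercise the sketch.
class EchoWorker
  attr_reader :args

  def initialize(args)
    @args = args
  end
end

middle_man = SketchMiddleMan.new
job_key = middle_man.new_worker(:class => EchoWorker, :args => 'Bar')
puts middle_man.get_worker(job_key).args   # prints Bar
middle_man.delete_worker(job_key)
```

The caller only ever holds on to the job_key, which is why it can live
in the rails session.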

There are many possible use cases for this system. More will be
implemented soon. This is an open request for comments and feature
requests.

The great thing about this framework is the fact that it creates a
shared resource that is accessible from multiple backends like fcgi’s
or mongrel processes. They each get a connection to the same MiddleMan
object, so this can be used as an application-wide context as well as
a background process runner that is shared across all users and all
backends.

Let’s look at how this system works in detail.

Look at INSTALL for instructions on how to get everything set up and
working.

Let’s look at a simple worker class.

    class FooWorker
      # DRbUndumped makes drb hand out a reference to this object
      # instead of a marshalled copy.
      include DRbUndumped

      def initialize(options={})
        @progress = 0
        @results = []
        @options = options
        start_working
      end

      def start_working
        # Work loop goes inside a new thread so it doesn't block
        # rails while it works. A neat way to do progress bars in
        # the browser is to have a @progress instance var that is
        # initialized to 0 and then gets bumped up by your long
        # running task. This way you can poll for the progress
        # of your job via ajax and update a client side progress bar.
        Thread.new do
          # main work loop goes here. do work and update the
          # progress bar instance var. `something` and `foo` are
          # placeholders for your own loop condition and unit of work.
          while something
            @results << foo(@options)
            @progress += 1
            break if @progress > 99
          end
        end
      end

      def results
        @results
      end

      def progress
        puts "Rails is fetching progress: #{@progress}"
        @progress
      end
    end
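
If you want to try the thread-plus-counter pattern outside rails and
drb first, the following sketch runs on its own. CountingWorker, its
:steps option and the wait method are invented for this example;
squaring a number stands in for the real unit of work.

```ruby
# Stand-in worker with the same shape as FooWorker, minus drb.
class CountingWorker
  attr_reader :progress, :results

  def initialize(options = {})
    @progress = 0
    @results  = []
    @steps    = options.fetch(:steps, 100)
    @thread   = start_working
  end

  # Blocks until the background thread has finished, then returns self.
  def wait
    @thread.join
    self
  end

  private

  # Runs the work loop in its own thread so the caller is not blocked.
  def start_working
    Thread.new do
      @steps.times do |i|
        @results << i * i   # placeholder for real work
        @progress += 1      # bumped only after the result is stored
      end
    end
  end
end

worker = CountingWorker.new(:steps => 100)
# A rails action would read worker.progress once per ajax poll;
# here we simply loop until the job reports completion.
sleep 0.01 until worker.progress >= 100
puts "#{worker.progress}% done, #{worker.results.size} results"
```

Because @progress is bumped after each result is appended, a poller
that sees 100 can rely on all 100 results being present.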

Your worker classes go into the RAILS_ROOT/lib/workers/ directory.
You can then use your worker class in rails like this:

    # in a controller

    # start new worker and put the job_key into the session so you can
    # get the status of your job later.
    def background_task
      session[:job_key] = MiddleMan.new_worker(:class => :foo_worker,
                                               :args => {:baz => 'hello!',
                                                         :qux => 'another arg!'})
    end

    def task_progress
      if request.xhr?
        progress = MiddleMan.get_worker(session[:job_key]).progress
        render :update do |page|
          page.replace_html('progress', "#{progress}% done")
          if progress == 100
            page.redirect_to :action => 'results'
          end
        end
      else
        redirect_to :action => 'index'
      end
    end

    def results
      @results = MiddleMan.get_worker(session[:job_key]).results
      MiddleMan.delete_worker(session[:job_key])
    end

Please note that new_worker takes a hash as its argument. The :class
part of the hash is required so MiddleMan knows which worker class to
instantiate. You can give it either an underscored version like
:foo_worker or the normal class name like :FooWorker. The :args key
points to a value that will be given to your worker class when it is
initialized. The following will start a FooWorker with a text argument
of "Bar":

    session[:job_key] = MiddleMan.new_worker(:class => :foo_worker,
                                             :args => "Bar")
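
One way a front object could accept both :foo_worker and :FooWorker is
sketched below. The helper name worker_class_for is made up, and this
is only a guess at the approach, not the plugin's code (inside rails
you could lean on ActiveSupport's camelize and constantize instead).

```ruby
# Hypothetical helper: turns :foo_worker or :FooWorker into the
# FooWorker constant.
def worker_class_for(name)
  camelized = name.to_s.split('_').map do |part|
    # Upcase only a leading lowercase letter so "FooWorker" is untouched.
    part.sub(/\A[a-z]/) { |first| first.upcase }
  end.join
  Object.const_get(camelized)
end

class FooWorker; end

puts worker_class_for(:foo_worker)   # prints FooWorker
puts worker_class_for(:FooWorker)    # prints FooWorker
```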

In the background_task view you can use periodically_call_remote
to ping the task_progress method to get the progress of your job and
update the progress bar. Once progress is equal to 100 (or whatever
you want) you redirect to the results page to display the results of
the worker task.

There are a few simple examples in the workers dir. These are the
worker classes I show being used here for proof of concept:


If you want to play with the demo app that implements those two
movies, you can check out the rails app here:

http://opensvn.csie.org/ezra/rails/plugins/backgroundrb/

If you want to have a named key instead of a generated key you can
specify the key yourself. This is handy for creating shared resources
that more than one user will access, so that multiple users and
backends can get the same object by name.

    MiddleMan.new_worker(:class => :foo_worker,
                         :args => "Bar",
                         :job_key => 'shared_resource')

For caching text, simple hashes, arrays, or even rendered views you
can use a hash-like syntax on MiddleMan:

    MiddleMan[:cached_view] = render_to_string(:action => 'complex_view')

Then you can retrieve the cached rendered view just like a hash with:

    MiddleMan[:cached_view]

You could create this cache and then have an ActiveRecord observer
expire the cache and create a new one when the data changes. Delete
the cached view with:

    MiddleMan.delete_worker(:cached_view)
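
A common way to use a hash-like cache is "return it if it is there,
otherwise compute and store it". The sketch below shows that pattern
against a plain Hash standing in for MiddleMan; fetch_cached is a
made-up helper, not part of BackgrounDRb.

```ruby
# Plain Hash standing in for MiddleMan. With the real thing the same
# [] and []= calls would travel over drb to the shared server.
cache = {}

# Hypothetical helper: returns the cached value, or runs the block,
# stores its result under key, and returns it.
def fetch_cached(store, key)
  cached = store[key]
  return cached unless cached.nil?
  store[key] = yield
end

renders = 0
2.times do
  fetch_cached(cache, :cached_view) do
    renders += 1                # counts how often we really "render"
    '<h1>complex view</h1>'     # stand-in for render_to_string
  end
end
puts renders   # prints 1
```

Note that two backends can still both miss at the same moment and
both compute the value; the sketch makes no attempt to prevent that.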

Best practice is to delete your job from the MiddleMan when you are
done with it so it can be garbage collected. If you don’t want to do
this, there is another option for cleaning up old jobs. Using cron or
maybe RailsCron on a schedule, you can call the gc! method to delete
jobs older than a certain time. Here is a ruby script you could run
from cron that will delete all workers older than 24 hours.

    #!/usr/bin/env ruby
    require "drb"
    DRb.start_service
    MiddleMan = DRbObject.new(nil, "druby://localhost:22222")
    # 60*60*24 seconds = 24 hours
    MiddleMan.gc!(Time.now - 60*60*24)
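
For a feel of what a time-based gc! has to do on the server side,
here is a small sketch that records a timestamp per job and drops
everything stored before a cutoff. TimestampedJobs and its methods are
invented for the example; the plugin's internals may look different.

```ruby
# Hypothetical job store that remembers when each job was added.
class TimestampedJobs
  def initialize
    @jobs       = {}  # key => worker
    @timestamps = {}  # key => Time the job was stored
  end

  def store(key, worker, now = Time.now)
    @jobs[key] = worker
    @timestamps[key] = now
  end

  def keys
    @jobs.keys
  end

  # Deletes every job stored before cutoff and returns the deleted keys.
  def gc!(cutoff)
    stale = @timestamps.keys.select { |key| @timestamps[key] < cutoff }
    stale.each do |key|
      @jobs.delete(key)
      @timestamps.delete(key)
    end
    stale
  end
end

one_day = 60 * 60 * 24
now = Time.now
jobs = TimestampedJobs.new
jobs.store('old', :worker_a, now - 2 * one_day)
jobs.store('new', :worker_b, now)
jobs.gc!(now - one_day)
p jobs.keys   # prints ["new"]
```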

** ROADMAP **

  1. Add better ActiveRecord caching facilities. Right now you can
    cache text, hashes, arrays and many other object types. But I am
    still working on the best way to cache ActiveRecord objects. I
    will probably use Marshal or YAML to do the right thing here.
  2. More examples. A chat room is forthcoming as well as an email queue.
  3. More documentation.
  4. Detail how to set this up to work across physical servers. DNS
    must be good and have reverse dns as well for drb to work properly
    across machines.
  5. Profit… ?

#2

Fabulous work Ezra - I read your pre-announcement recently and you
didn’t disappoint!

When I read this I wonder if you’ve considered wrapping method calls
in transactions? (This might be an interesting way to do distributed
workflow, maybe integrating with acts_as_state_machine or some such.)
I did something similar a couple of years back with JavaSpaces and
grid-type processing.

very exciting!
Jodi

On 15-May-06, at 2:18 PM, Ezra Z. wrote:



#3

On May 15, 2006, at 12:07 PM, Jodi S. wrote:

Jodi-

Thanks for the kind words. I think there are many ways to use this
plugin that will open new ways of creating different workflows
outside the typical request/response cycle. I wanted to get it out
there so I can get feedback and improve the functionality of this
micro-framework.

As far as wrapping method calls in transactions goes... I made the
requirements for writing your own worker classes as transparent as
possible. So you could implement the transactions in your worker
classes if you wanted to. I think this could also be used as an
approach to messaging queues and a bunch of other stuff.

Anyway, I am totally open to ideas for where to go next with this. I
appreciate any and all feedback.

Cheers-
-Ezra


#4

Nice toy… I’ve been thinking about how to report the progress of long
running tasks, and this is the solution ;)

One nice addition would be for rails to automatically start the drb
server if it isn’t running yet, or restart it if it crashes for
whatever reason.


#5

Ezra,

this is really cool. In one of our projects we have implemented a
second process that does the “dirty work” (network monitoring) and
reports back to the Rails application. We use XML-RPC to communicate
between the two processes, but your approach seems (from the brief
look I gave it) to be much simpler.

I’ll take a deeper look soon

thanks
jc


#6

On May 15, 2006, at 1:18 PM, Ezra Z. wrote:


Okay, maybe I’m a dope but when I try this it fails. I’m running on
OSX 10.4.6, ruby 1.8.4 (from darwinports), rails 1.1.2.

When I load up the page (http://localhost:3000/drb) and click on the
link in the window, I get this output in log/development.log.

    Processing DrbController#index (for 127.0.0.1 at 2006-05-15 16:39:14) [POST]
      Session ID: 2ad9d651642aecd661b2ee0dd30a7830
      Parameters: {"action"=>"index", "controller"=>"drb"}

    DRb::DRbConnError (druby://localhost:22222 - #<Errno::ECONNREFUSED: Connection refused - connect(2)>):
        /opt/local/lib/ruby/1.8/drb/drb.rb:733:in `open'
        /opt/local/lib/ruby/1.8/drb/drb.rb:726:in `open'
        /opt/local/lib/ruby/1.8/drb/drb.rb:1186:in `initialize'
        /opt/local/lib/ruby/1.8/drb/drb.rb:1166:in `open'
        /opt/local/lib/ruby/1.8/drb/drb.rb:1082:in `method_missing'
        /opt/local/lib/ruby/1.8/drb/drb.rb:1100:in `with_friend'
        /opt/local/lib/ruby/1.8/drb/drb.rb:1081:in `method_missing'
        /app/controllers/drb_controller.rb:5:in `index'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/base.rb:910:in `perform_action_without_filters'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/filters.rb:368:in `perform_action_without_benchmark'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue'
        /opt/local/lib/ruby/1.8/benchmark.rb:293:in `measure'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/benchmarking.rb:69:in `perform_action_without_rescue'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/rescue.rb:82:in `perform_action'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/base.rb:381:in `process_without_filters'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/filters.rb:377:in `process_without_session_management_support'
        /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/session_management.rb:117:in `process'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/dispatcher.rb:38:in `dispatch'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/webrick_server.rb:115:in `handle_dispatch'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/webrick_server.rb:81:in `service'
        /opt/local/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
        /opt/local/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
        /opt/local/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
        /opt/local/lib/ruby/1.8/webrick/server.rb:162:in `start_thread'
        /opt/local/lib/ruby/1.8/webrick/server.rb:95:in `start'
        /opt/local/lib/ruby/1.8/webrick/server.rb:92:in `start'
        /opt/local/lib/ruby/1.8/webrick/server.rb:23:in `start'
        /opt/local/lib/ruby/1.8/webrick/server.rb:82:in `start'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/webrick_server.rb:67:in `dispatch'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/commands/servers/webrick.rb:59
        /opt/local/lib/ruby/vendor_ruby/1.8/rubygems/custom_require.rb:21:in `require'
        /opt/local/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:147:in `require'
        /opt/local/lib/ruby/gems/1.8/gems/rails-1.1.2/lib/commands/server.rb:30
        /opt/local/lib/ruby/vendor_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        script/server:3

    Rendering /opt/local/lib/ruby/gems/1.8/gems/actionpack-1.12.1/lib/action_controller/templates/rescues/layout.rhtml (500 Internal Error)


I’m pretty sure this is because there isn’t anything listening on
port 22222. But I don’t see anything in the directions/instructions
on how or when to start a DRb server up on that port. Help?

cr


#7

On May 15, 2006, at 2:43 PM, removed_email_address@domain.invalid wrote:

Hey-

You need to start the drb server before you start rails. So in your
rails project run this command before you run script/server:

$ script/backgroundrb/start -d

Then you can start rails with whatever server you want to use. You
actually caught a mistake I made in the README. I forgot to mention
how to start the drb server in the README and only had it in the
INSTALL doc. I will fix this right away.

To stop the drb server:

$ script/backgroundrb/stop

Cheers-
-Ezra


#8

On May 15, 2006, at 2:43 PM, removed_email_address@domain.invalid wrote:

Here is the missing section on setting up the server. It's all on my
blog also:

http://brainspl.at/articles/2006/05/15/backgoundrb-initial-release

To install BackgrounDRb you need to follow these steps:

  1. Copy the backgroundrb folder to RAILS_ROOT/script/

  2. Make sure RAILS_ROOT/script/backgroundrb/start and
    RAILS_ROOT/script/backgroundrb/stop are executable and have the
    correct #! lines that will work with your system.

  3. Make a workers directory in RAILS_ROOT/lib/workers. This is where
    you will put your custom worker classes.

  4. Once BackgrounDRb is installed in your rails app you will have a
    start and stop script to use. You must start the drb server before
    you start rails. To start the drb server, use this command from
    your RAILS_ROOT:

    to start

    $ script/backgroundrb/start -p 22222 -d

    That will start the drb server on port 22222 in the background
    and give you back control of the shell. Without the -d flag the
    drb server will run in the foreground, which is helpful for
    debugging and experimentation.

    to stop

    $ script/backgroundrb/stop

    You will need to add a few lines to your
    RAILS_ROOT/config/environment.rb file.

    require "drb"
    DRb.start_service
    MiddleMan = DRbObject.new(nil, "druby://localhost:22222")

    Make sure to set the port number to the one you started the drb
    server with. The port number defaults to 22222.
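
If you want to see the drb plumbing on its own before wiring it into
rails, the following self-contained sketch starts a tiny drb server and
talks to it through a DRbObject, the same way the environment.rb lines
above do. TinyFront and TinyWorker are made-up stand-ins for the
MiddleMan and a worker. Both ends live in one process here only so
that the example runs by itself; in a real setup the server is the
separate backgroundrb process. Port 0 (pick any free port) is used
instead of 22222 so the sketch cannot collide with a running server.

```ruby
require 'drb'

# Made-up worker; DRbUndumped means clients get a remote reference
# to the live object instead of a marshalled copy.
class TinyWorker
  include DRbUndumped
  attr_reader :progress

  def initialize(args)
    @args = args
    @progress = 0
  end

  def step
    @progress += 1
  end
end

# Made-up front object playing the MiddleMan role.
class TinyFront
  def initialize
    @jobs = {}
  end

  def new_worker(args)
    key = "job_#{@jobs.size + 1}"
    @jobs[key] = TinyWorker.new(args)
    key
  end

  def get_worker(key)
    @jobs[key]
  end
end

# Server side: port 0 lets the OS choose a free port.
server = DRb.start_service('druby://127.0.0.1:0', TinyFront.new)

# Client side: same call as in environment.rb, pointed at the
# URI the server actually bound to.
middle_man = DRbObject.new(nil, server.uri)
job_key = middle_man.new_worker('Bar')
middle_man.get_worker(job_key).step
progress = middle_man.get_worker(job_key).progress
puts progress   # prints 1

DRb.stop_service
```

The second get_worker call sees the step made through the first one,
which is the point: every client gets the same worker object back for
a given key.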

Cheers-
-Ezra


#9

On May 15, 2006, at 5:29 PM, Ezra Z. wrote:

tasks that allows for ajax progress bars and more. It also serves
http://brainspl.at/drb_ajax_tail.mov

Ezra,

these install instructions were not part of the distro that I
downloaded earlier today.

I did a
“svn co http://opensvn.csie.org/ezra/rails/plugins/backgroundrb/”.
There isn’t any README or INSTALL to describe the next actions to
take.

I thought I was going insane when I saw other people posting to the
list with (apparently) successful attempts at running it. :)

Thanks for creating such a nice product. I don’t have an immediate
use for it, but I will.

cr


#10

Sorry about the confusion. The README and INSTALL text were in the
backgroundrb distribution but not in the sample app. I have fixed
this now and the README is in both the app and the normal install.

So to avoid confusion, here is the link to the standalone
backgroundrb files you can install in your app including the README:

http://opensvn.csie.org/ezra/rails/backgroundrb/

And here is a proof of concept standalone app that has the
progressbar example and the log file tail example.

http://opensvn.csie.org/ezra/rails/plugins/backgroundrb/

I will have a rubyforge project for this setup soon and will start a
small mailing list for discussion.

-Ezra


#11

Fantastic timing, Ezra!

I’m just at the point of having to consider building a non-browser UI
for one of my apps because of the time it takes to perform two
specific tasks (both high CPU & disc IO on the server); the browser
session times out too often.

This looks like it might be a life saver.

I’ve been following your postings on it for a while now, and I’m
really happy it’s come to fruition at this exact moment.

Now, about Rubuntu… ;->

Thanks and regards

Dave M.


#12

Dave T. has just announced the final version of Rails Recipes.
It is now at the printer.
Distribution of the paper edition should therefore start in about a
month.


#13

Hello everyone!

That friendly little dinner last night was really nice; let's do it
again.

To celebrate, I'm forwarding (for those who missed it on the rails
list) the announcement of the release of BackgrounDRb, which came up
several times last night: it's a project by Ezra that lets you manage
long-running tasks (e.g. a batch mailing, a database analysis process,
or similar) in a rather elegant and comfortable way. Clearly worth
digging into.

See you,

Thibaut

[blog] http://www.dotnetguru2.org/tbarrere

---------- Forwarded message ----------
From: Ezra Z. removed_email_address@domain.invalid
Date: May 15, 2006 8:18 PM
Subject: [Rails] [ANN] BackgrounDRb background task runner and
Application Wide Context Store
To: removed_email_address@domain.invalid


#14

My favorite:

Deployment sucks!
… Capistrano swallows

:o)


#15

On 16/05/06, jmb removed_email_address@domain.invalid wrote:

Dave T. has just announced the final version of Rails Recipes.
It is now at the printer.

Go DHH yourself!
http://www.spreadshirt.com/shop.php?sid=31753&op=articles

(yes, this is completely unrelated)

-- Jean-François.