Rails and background tasks/threads

ericrchr · February 14, 2006, 9:55pm

I am just getting into web servers/web applications, and Rails as well,
so bear with me. I am trying to write a web app that, based on a
user’s input from the browser, will perform some task and update the
browser (Ajax-style) as needed and/or provide a way for the user to
control the background task.

Now, I have more experience with Java servlets, which make this easy
enough for a beginner. The user inputs some info and the server spawns
a new thread to perform the task in the background while the user
continues using the app, until it is updated with the results. (In my
case the background task is putting or getting messages on/from an MQ
Series queue.) I keep a reference to the background task in the
session object for access as needed.

What I am wondering is if the same concepts can be applied to a rails
application. I started researching rails off and on recently and it
looks like this is mostly possible, but the vague area I am wondering
about is spawning a background thread and then maintaining a reference
to it so the user can get the data back and/or manipulate it in various
ways (start/stop/put to a different queue, etc.).

I saw RailsCron mentioned in another thread, but am not sure that is
exactly what I am looking for (maybe I’m wrong). Anyone with some
experience with this sort of scenario that can provide some guidance
would be greatly appreciated. I’ve asked the question a few times in
various forums with little to no response.

Thanks in advance,
Eric

ericrchr · February 14, 2006, 10:32pm

link_to_remote ?

ericrchr · February 14, 2006, 11:30pm

You could start a new thread and put it in the session, which they could
then stop or start.

Like:
session[:thread] = Thread.new{process}

Look at the RDoc documentation for class Thread and
http://www.ruby-doc.org/stdlib/libdoc/thread/rdoc/index.html
or at threads in the Ruby Pickaxe book.

Joey

ericrchr · February 14, 2006, 11:42pm

On Feb 14, 2006, at 2:30 PM, joey__ wrote:

Joey

You will find that you cannot put a thread in the session. To put an
object in the session it needs to be serializable. This means no
threads, singletons or procs/lambdas can go in the session.
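
A quick way to see this for yourself, as a minimal sketch: it assumes the session store serializes with Ruby's Marshal and reads "singletons" as objects that carry singleton methods. The helper name marshalable? is made up for illustration.

```ruby
# Returns true if Marshal can serialize the object, false otherwise.
# (marshalable? is a hypothetical helper, not part of Ruby or Rails.)
def marshalable?(obj)
  Marshal.dump(obj)
  true
rescue TypeError
  false
end

worker = Thread.new { :work }
worker.join

with_singleton = Object.new
def with_singleton.greet; "hi"; end # a per-object (singleton) method

checks = {
  "plain hash"       => marshalable?({ "user_id" => 42 }),
  "thread"           => marshalable?(worker),
  "proc"             => marshalable?(proc { 1 }),
  "lambda"           => marshalable?(-> { 1 }),
  "singleton object" => marshalable?(with_singleton)
}

checks.each do |name, ok|
  puts "#{name}: #{ok ? 'can' : 'cannot'} be marshaled"
end
```

Only the plain hash should pass; the other four raise TypeError inside Marshal.dump.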

Cheers-
-Ezra Z.
Yakima Herald-Republic
WebMaster

509-577-7732
[email protected]

ericrchr · February 15, 2006, 1:59am

Bummer, didn’t see the last post.

So what’s a good solution to this problem? Seems like it would be a
common need, but then I’m kind of green to web programming.

Any suggestions?

ericrchr · February 14, 2006, 11:53pm

If it’s as simple as that I’ll be very happy.

Initially I’d just be using WEBrick (for internal purposes), does that
pose any problems?

For some reason I thought this was not possible, or practical. I’ll
have to try and remember why I thought so.

ericrchr · February 15, 2006, 5:17am

Ezra Z. wrote:

You will find that you cannot put a thread in the session. To put an
object in the session it needs to be serializable. This means no
threads, singletons or procs/lambdas can go in the session.

So you know, I’ve seen this information (and I think I’ve even told
someone this information), but it’s just now occurred to me that I
should ask if there’s an equivalent of Java’s “transient” in Ruby? In
Java, making a variable transient means that it won’t be serialized.

If Ruby’s serialization pulls the whole graph, then there must be some
way to tell it to skip certain children… or be very careful what you
create references to in your objects.

As for the OP’s question though, I would think you could store your
threads or procs in a hash and put the key in the session.

b

ericrchr · February 15, 2006, 5:44am

Ben M. wrote:

As for the OP’s question though, I would think you could store your
threads or procs in a hash and put the key in the session.

Given that the session is a hash, how does that help? It just won’t
work. Without being able to serialize the threads, what does the hash
key reference? Bear in mind that the lookup might not happen on the
same machine, let alone in the same process…

ericrchr · February 15, 2006, 8:00am

On Feb 14, 2006, at 8:40 PM, Alex Y. wrote:

Alex

Rails mailing list
[email protected]
http://lists.rubyonrails.org/mailman/listinfo/rails

One way I handle longer-running background processes in my Rails
apps is to use distributed Ruby (DRb). DRb is dead simple to use, so
it’s easy to make a DRb server that contains the classes needed to do
the long-running work. Then you can kick off the long-running task on
the DRb server from an Ajax request, and use periodically_call_remote
to poll the DRb server for the percent done on the job you submitted.
Once it’s at 100% you can return the results. Doing it this way
divorces the long-running task from the Rails request/response cycle
altogether and makes it easy to add progress bars to the view for
these long tasks. I have this functionality almost in a state to
refactor it into a plugin if anyone is interested in it.
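
A rough sketch of the shape of this, not the plugin code mentioned above: the JobServer class, its submit and percent_done methods, and the fake "work" are all assumptions for illustration. Server and client run in one process here to keep the example self-contained; in practice the server would be its own long-lived process and the Rails action would only hold the URI.

```ruby
require "drb/drb"

# Hypothetical DRb front object that runs jobs and reports progress.
class JobServer
  def initialize
    @progress = {}
    @lock = Mutex.new
    @next_id = 0
  end

  # Starts a job in a background thread and returns its id immediately.
  def submit(steps)
    id = @lock.synchronize do
      @next_id += 1
      @progress[@next_id] = 0
      @next_id
    end
    Thread.new do
      steps.times do |i|
        sleep 0.01 # stand-in for a slice of real work
        @lock.synchronize { @progress[id] = ((i + 1) * 100) / steps }
      end
    end
    id
  end

  # What a polling Ajax action would call.
  def percent_done(id)
    @lock.synchronize { @progress[id] }
  end
end

# Port 0 asks the OS for any free port; server.uri reports the real one.
server = DRb.start_service("druby://127.0.0.1:0", JobServer.new)

# Client side: only the URI and the job id need to be kept between requests.
jobs = DRbObject.new_with_uri(server.uri)
job_id = jobs.submit(5)
sleep 0.02 until jobs.percent_done(job_id) == 100
final = jobs.percent_done(job_id)
puts "job #{job_id} finished at #{final}%"

DRb.stop_service
```

The job id is a plain integer, so unlike a thread it can go into the session without trouble.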

Cheers-
-Ezra Z.

ericrchr · February 15, 2006, 8:30am

I’ve been solving this by using Rinda (it’s in the Pickaxe book and is
the Ruby version of Linda, which is a robust, scalable method) so that
RoR posts messages for things it wants done, and then “worker agents”
process the messages and post back another one giving updates or an
“I’m done” message. If you investigate Rinda and are interested in
taking this approach, then I can share some code that will save you a
lot of time (the documentation for getting started on Rinda is a
little sparse).
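
To give a feel for the message-posting pattern, here is a minimal in-process sketch. It is not Greg's code; the tuple layouts [:job, id, text] and [:done, id, result] are made up, and a real setup would share the tuple space between processes over DRb instead of using a thread.

```ruby
require "rinda/tuplespace"

space = Rinda::TupleSpace.new

# "Worker agent": blocks until a :job tuple appears, then posts a :done tuple.
worker = Thread.new do
  _tag, id, text = space.take([:job, nil, nil]) # nil matches any value
  space.write([:done, id, text.upcase])         # stand-in for real work
end

# Web-app side: post the request, then wait for the matching reply.
space.write([:job, 1, "hello"])
_tag, _id, result = space.take([:done, 1, nil])
worker.join

puts "worker replied: #{result}"
```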

I’d love to hear how other people are solving the problem.
-Greg

ericrchr · February 15, 2006, 5:29am

From my experience, at least when running on Apache, putting work into
a background thread did not give me the results that I wanted. It
appears that the thread will suspend itself unless the server gets
activity. I ended up having to use a timer() and Ajax.Request to keep
polling for whether the thread was done to, among other things, keep
the thread running.

I had better luck forking off a heavyweight process to do the
background work. I’ve heard that background processes work better
under lighttpd.
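
A small sketch of the separate-process approach. It uses Process.spawn instead of a literal fork so that it also runs where fork is unavailable, and the result file path and the one-line child script are placeholders for real work.

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical location where the child leaves its result.
result_path = File.join(Dir.tmpdir, "background_job_#{Process.pid}.txt")

# The child is a separate Ruby process; this script stands in for real work.
script = "File.write(ARGV[0], (1..10).sum.to_s)"
pid = Process.spawn(RbConfig.ruby, "-e", script, result_path)

# Process.detach returns a thread that reaps the child so it cannot become
# a zombie; the request that started the job does not have to wait on it.
watcher = Process.detach(pid)

# A later request would only check for the file. Here we wait so the
# example finishes deterministically.
watcher.join
output = File.read(result_path)
File.delete(result_path)
puts "child #{pid} wrote: #{output}"
```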

ericrchr · February 15, 2006, 9:43am

Ezra Z. wrote:

refactor it into a plugin if any one was interested in it.

Ezra, you’re just a fount of useful information! I should have read this
before I replied to Alex. And I would definitely like to hear about that
plugin!

b

PS: I reaaaallllly think it’s important that Mongrel (or something like
it) turn into a servlet-container-style daemonized process into which one
can hook and unhook apps while running, spin off background threads, pool
db connections, etc. I don’t think the Java world will take Rails
seriously without that. Then again, maybe DRb can grow into that, or hook
into Mongrel, etc. etc.

ericrchr · February 15, 2006, 9:43am

Greg E. (other box) wrote:

It’d be cool to see that code… of course, I should probably learn
some more Ruby and get my Rails training app into a respectable state
first… but this would be more fun!

b

ericrchr · February 15, 2006, 9:34am

Alex Y. wrote:

The key can just be a string, so the session is once again serializable.
You keep your threads in a hash in memory. When the thread is needed
again, the code that handles this looks for the appropriate session
value to get its key, and then uses that to get the thread from the hash.

It’s basically just keeping the non-serializable stuff in your own hash
that won’t be serialized.
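
As a sketch of the idea, with the caveat that it only holds while every request is served by the same long-lived process: TaskRegistry and its methods are invented names, and a plain hash stands in for the Rails session.

```ruby
require "securerandom"

# Hypothetical in-memory registry; lives only as long as this process does.
module TaskRegistry
  @tasks = {}
  @lock = Mutex.new

  # Starts the work in a thread and returns a string key for it.
  def self.start(&work)
    id = SecureRandom.hex(8)
    thread = Thread.new(&work)
    @lock.synchronize { @tasks[id] = thread }
    id
  end

  def self.fetch(id)
    @lock.synchronize { @tasks[id] }
  end

  # Kills and forgets the task; returns true if it was known.
  def self.stop(id)
    thread = @lock.synchronize { @tasks.delete(id) }
    thread&.kill
    !thread.nil?
  end
end

session = {} # stand-in for the Rails session
session[:task_id] = TaskRegistry.start { sleep 0.05; 42 }

Marshal.dump(session) # fine: the session holds only a string key

result = TaskRegistry.fetch(session[:task_id]).value # waits for the thread
puts "task #{session[:task_id]} returned #{result}"
```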

Hmm, although, I guess this wouldn’t work without a daemon process in
which to keep the hash… there is a way to do that in Rails, right?
Well, still… one could start a separate Ruby process for this
storage… would need to hook into that from the Rails code somehow.

b

ericrchr · February 15, 2006, 10:19am

Ezra Z. wrote:

into a plugin if any one was interested in it.

I’d love to see that. I’m bodging that approach together piecewise at
the moment, and a plugin would make it much simpler. Rather than
long-running tasks, though, I’m actually talking to permanent daemons
that happen to live in the same object space, so the concept of ‘percent
done’ doesn’t really apply.

I’ve spotted where I would pluginize this, but I haven’t had the impetus
to fully work it through. If you’re closer to it than I am, please,
please release!

ericrchr · February 15, 2006, 10:13am

Ben M. wrote:

same machine, let alone in the same process…

The key can just be a string, so the session is once again serializable.
You keep your threads in a hash in memory. When the thread is needed
again, the code that handles this looks for the appropriate session
value to get its key, and then uses that to get the thread from the hash.

The session’s serializable, but that’s got you nothing because when you
unserialize it on the next request, what your key points to may well not
exist any more, or exist on another machine, or just generally not be
visible.

It’s basically just keeping the non-serializable stuff in your own hash
that won’t be serialized.

If that worked, you wouldn’t need a session at all, you could just
keep everything in memory all the time.

Hmm, although, I guess this wouldn’t work without a daemon process in
which to keep the hash… there is a way to do that in rails, right?
Well, still… one could start a separate ruby process for this
storage… would need to hook into that from the rails code somehow.

Yay! DRb!

ericrchr · February 15, 2006, 2:36pm

On Feb 15, 2006, at 2:29 AM, Greg E. (other box) wrote:

can share some code that will save you a lot of time (the
documentation for
getting started on Rinda is a little sparse).

I’d thank you if you did share. I’d like to use Rinda but keep
putting it off until I have more time (as if that’ll ever happen).

If you have a reliable way of starting/restarting the Rinda servers,
or any ideas, I think that’d be very well received too

Cheers,
Bob


Bob H. – blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. – http://www.recursive.ca/
Raconteur – http://www.raconteur.info/
xampl for Ruby – http://rubyforge.org/projects/xampl/

ericrchr · February 15, 2006, 3:32pm

Wow, thanks for all the great responses. I think I need to plug away at
some more Rails and give the suggestions a try (Rinda/DRb). Any examples
would be greatly appreciated.

I want to use Rails because there is so much about it I do like, but
without being able to tightly control background threads/tasks I really
cannot jump into it 100% - that’s the make or break for me.

ericrchr · February 15, 2006, 5:12pm

Greg-

I would love to see your Rinda code. I have a great grasp of DRb and
have been using it heavily, but I haven’t really dug into Rinda yet
and the docs are sparse. So please do share.

Thanks
-Ezra

On Feb 14, 2006, at 11:29 PM, Greg E. (other box) wrote:

can share some code that will save you a lot of time (the
[mailto:[email protected]] On Behalf Of Eric Ching



ericrchr · February 15, 2006, 6:04pm

It sounds like RailsCron might be a good fit, as it’s a good way to
run rails background tasks in general. You might be able to get away
with a simple daemonized script/runner.

Assuming that you have figured out the best way to have a background
task running satisfactorily, I would write it as a method that traps a
signal, and whenever that signal is sent, it looks in the database (or
a flat file) to see what has changed. Alternately, you could poll the
database.
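
A rough, Unix-only sketch of the signal idea. The choice of USR1, the temp file standing in for the database or flat file, and the "run"/"stop" commands are all assumptions for illustration; here the process signals itself so the example is self-contained, whereas in practice the web app would signal the worker's pid.

```ruby
require "tempfile"

# Flat file standing in for the shared state the web app would update.
control = Tempfile.new("task_control")
File.write(control.path, "run")

command = File.read(control.path)
reload_requested = false

# Keep the handler tiny: just set a flag and let the main loop do the work.
Signal.trap("USR1") { reload_requested = true }

# Web-app side: change the stored state, then nudge the worker.
File.write(control.path, "stop")
Process.kill("USR1", Process.pid)

# Worker side: wait (with a deadline) for the flag, then re-read the state.
deadline = Time.now + 2
sleep 0.01 until reload_requested || Time.now > deadline
command = File.read(control.path) if reload_requested

puts "worker now sees command: #{command}"
control.close!
```

Polling the file or database on a timer, the alternative mentioned above, avoids signals entirely at the cost of some latency.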

-Kyle