Efficient background process

I have a general question about a web service.

My app will (hopefully) parse RSS (XML) so that it returns newly updated
feeds.

The parsing job is not the problem. The problem is how to let the
server do it efficiently.

Obviously, one option is a cron job. But it cannot process everything
if the number of feeds is too large. Somehow the server needs to do
the job little by little.

For instance, I have some experience with App Engine, which provides a
Task Queue that finishes the job little by little while keeping the
server intact.

Does Ruby on Rails have such functionality, so that my app can
finish parsing without reaching the server limit?

soichi

Thanks! Resque and Sidekiq sound great! I hadn’t even heard of them
until now.

On Wed, May 29, 2013 at 9:14 PM, Soichi I. wrote:

Obviously, one option is a cron job. But it cannot process everything
if the number of feeds is too large. Somehow the server needs to do
the job little by little.

For instance, I have some experience with App Engine, which provides a
Task Queue that finishes the job little by little while keeping the
server intact.

Does Ruby on Rails have such functionality, so that my app can
finish parsing without reaching the server limit?

If you want to queue jobs, why would you use cron at all? Rails 4
does have this; it’s called Queueing. But you don’t need Rails for
that: you could use Resque or Sidekiq, which do what Queueing does but
with an integrated API. And your app’s limit is defined only by your
server’s limit.
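
To make the queueing idea concrete, here is a minimal in-process sketch
using only Ruby's standard library. It is not Resque or Sidekiq (those
keep jobs in Redis and run workers in separate processes); FeedJob,
enqueue_feeds and drain are hypothetical names, and the feed URLs are
placeholders. The only point is that each feed becomes one small job and
a worker handles a bounded number of them per pass.

```ruby
# Stand-in for a real job queue (Resque/Sidekiq keep jobs in Redis instead).
JOBS = Queue.new

# Hypothetical job: one feed URL per job keeps each unit of work small.
FeedJob = Struct.new(:url) do
  def perform
    # Real code would fetch and parse the feed here.
    "parsed #{url}"
  end
end

# Push one job per feed onto the queue.
def enqueue_feeds(urls)
  urls.each { |url| JOBS << FeedJob.new(url) }
end

# Run at most `limit` jobs, then return, so a single pass stays bounded.
def drain(limit)
  results = []
  limit.times do
    break if JOBS.empty?
    results << JOBS.pop.perform
  end
  results
end

enqueue_feeds(%w[http://example.com/a.rss http://example.com/b.rss http://example.com/c.rss])
first_pass  = drain(2) # handles two feeds
second_pass = drain(2) # handles the remaining one
```

With a real library the queue would live outside the web process, so a
large backlog of feeds slows the workers down rather than the app itself.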

If you are talking about a system where polling for jobs happens and
the processing of those jobs happens independently, I’d suggest
decoupling the two parts.

This is how I would go about it (using AWS):

  • 1 EC2 instance to do the polling.
  • An SQS queue into which the polling EC2 instance pushes jobs.
  • An autoscaling EC2 group which scales based on the number of jobs in
    the queue and processes the jobs in the queue.
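
The three pieces above can be sketched in plain Ruby, with loud caveats:
a Queue from the standard library stands in for SQS, threads stand in
for EC2 instances, and poll_feeds, workers_needed and process_all are
hypothetical names. No AWS API is called, and the scaling rule is an
assumption. The sketch only shows the decoupling: the poller just
enqueues, and the number of workers is derived from the queue depth.

```ruby
# Poller: only discovers work and pushes it onto the queue (SQS stand-in).
def poll_feeds(queue, urls)
  urls.each { |url| queue << url }
end

# Assumed scaling rule: one worker per `jobs_per_worker` queued jobs, capped at `max`.
def workers_needed(depth, jobs_per_worker: 2, max: 4)
  return 0 if depth.zero?
  [(depth.to_f / jobs_per_worker).ceil, max].min
end

# Workers: each thread (EC2 stand-in) pulls jobs until the queue is empty.
def process_all(queue)
  results = Queue.new
  count = workers_needed(queue.size)
  threads = Array.new(count) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        results << "processed #{url}" # real code would parse the feed here
      end
    end
  end
  threads.each(&:join)
  # Completion order varies between threads, so sort for a stable result.
  Array.new(results.size) { results.pop }.sort
end

queue = Queue.new
poll_feeds(queue, %w[feed-a feed-b feed-c feed-d feed-e])
done = process_all(queue) # five queued jobs start three workers under the assumed rule
```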

Emil