Ruby dispatcher and work processes

I would like to implement an (open source) application platform
analogous to SAP’s ABAP platform in Ruby. (See below for links to SAP
architecture diagrams.)

In part, this requires having different Ruby processes talk to each
other, specifically having one dispatcher process dishing out work to
standalone work processes.

I am at a loss, though, as to how to go about this. In part, one
stumbling block I see is that if each work process is a Ruby process, I
might load Ruby code (files/programs) into memory in a work process, but
I couldn’t release it again. (Obviously I don’t want to incur the
expense of starting up Ruby for each incoming request.)

Any ideas on what existing frameworks I could look at? I was
wondering about using MagLev, especially to take advantage of storing
data in shared memory between processes in an easy way. (i.e. each work
process would be a MagLev instance).

Any comments or suggestions would be welcome.

Links to SAP ABAP architecture information:

http://help.sap.com/saphelp_NW70EHP1core/helpdata/en/fc/eb2e8a358411d1829f0000e829fbfe/content.htm

http://help.sap.com/SAPhelp_nw70/helpdata/en/84/54953fc405330ee10000000a114084/content.htm

On Mon, Dec 6, 2010 at 4:08 PM, Martin C. [email protected]
wrote:

> I would like to implement an (open source) application platform
> analogous to SAP’s ABAP platform in Ruby. (See below for links to SAP
> architecture diagrams.)

Maybe the announced Fairy framework is for you.

> In part, this requires having different Ruby processes talk to each
> other, specifically having one dispatcher process dishing out work to
> standalone work processes.

You could have a Queue instance and have several worker processes read
from it via DRb. That would be about the simplest scenario I can
think of.
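A minimal sketch of that scenario. The worker is shown as a thread here
so the snippet is self-contained; in practice it would be a separate
process connecting to the dispatcher's URI (the port and job payload are
made up):

```ruby
require 'drb/drb'

# Dispatcher side: expose a shared, thread-safe Queue over DRb.
# Port 0 asks the OS for a free port; DRb.uri reports the actual URI.
queue = Queue.new
DRb.start_service('druby://localhost:0', queue)

# Worker side (normally a separate process; a thread here for brevity):
done = Queue.new
worker = Thread.new do
  remote = DRbObject.new_with_uri(DRb.uri)
  job = remote.pop               # blocks until the dispatcher enqueues work
  done << "processed #{job}"
end

queue << 'report_01'             # dispatcher dishes out a unit of work
worker.join
DRb.stop_service
```

Queue#pop blocks the DRb server thread serving that call, so each idle
worker simply waits until the dispatcher pushes something.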

> I am at a loss, though, as to how to go about this. In part, one
> stumbling block I see is that if each work process is a Ruby process, I
> might load Ruby code (files/programs) into memory in a work process, but
> I couldn’t release it again. (Obviously I don’t want to incur the
> expense of starting up Ruby for each incoming request.)

You could terminate a worker process after it has processed a number
of requests or has run for a particular time, thus balancing process
creation overhead against memory usage.
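A sketch of that recycling policy, again with a thread standing in for a
worker process (a real supervisor would respawn the process once it
exits; `MAX_JOBS` is a made-up tuning knob):

```ruby
MAX_JOBS = 3                     # recycle the worker after this many requests

queue = Queue.new
5.times { |i| queue << i }       # pretend backlog of work

handled = []
worker = Thread.new do
  MAX_JOBS.times do              # worker quits after a fixed quota,
    handled << queue.pop         # so the OS reclaims everything it loaded
  end
end
worker.join
# remaining jobs stay queued for the next, freshly spawned worker
```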

> Any ideas on what existing frameworks I could look at? I was
> wondering about using MagLev, especially to take advantage of storing
> data in shared memory between processes in an easy way. (i.e. each work
> process would be a MagLev instance).

Whether to add the complexity of shared memory is something I would
make dependent on the amount of data that needs to be shared. If
requests and responses are small, I would certainly not use shmem.
Also, if the data is read from and stored elsewhere (e.g. an RDBMS),
I’d probably not bother with shmem.

Kind regards

robert

PS: Get well soon to your dog!

PPS: I see Pink does not need to wait any more.

Robert K. wrote in post #966549:

> On Mon, Dec 6, 2010 at 4:08 PM, Martin C. [email protected]
> wrote:
>
>> I would like to implement an (open source) application platform
>> analogous to SAP’s ABAP platform in Ruby. (See below for links to SAP
>> architecture diagrams.)
>
> Maybe the announced Fairy framework is for you.

Fairy framework?

>> In part, this requires having different Ruby processes talk to each
>> other, specifically having one dispatcher process dishing out work to
>> standalone work processes.
>
> You could have a Queue instance and have several worker processes read
> from it via DRb. That would be about the simplest scenario I can
> think of.

SAP’s method of dispatching works on a “push” rather than a “pull”
basis, as far as I can tell.
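To sketch what I mean by push, the DRb roles can simply be inverted:
each work process exposes a handler object, and the dispatcher holds a
remote handle per worker and invokes it directly (a hypothetical sketch;
the `Worker` class and job payload are made up):

```ruby
require 'drb/drb'

# Worker side: each work process exposes a handler object over DRb.
class Worker
  def handle(job)
    "done: #{job}"               # stand-in for real work
  end
end

# Port 0 asks the OS for a free port; DRb.uri reports the actual URI.
DRb.start_service('druby://localhost:0', Worker.new)

# Dispatcher side: hold one remote handle per worker and push work at it.
worker = DRbObject.new_with_uri(DRb.uri)
result = worker.handle('report_01')
DRb.stop_service
```

The dispatcher would then need its own bookkeeping of which workers are
idle, which the pull/queue model gets for free.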

>> I am at a loss, though, as to how to go about this. In part, one
>> stumbling block I see is that if each work process is a Ruby process, I
>> might load Ruby code (files/programs) into memory in a work process, but
>> I couldn’t release it again. (Obviously I don’t want to incur the
>> expense of starting up Ruby for each incoming request.)
>
> You could terminate a worker process after it has processed a number
> of requests or has run for a particular time, thus balancing process
> creation overhead against memory usage.

Sounds complex, but I guess it is worth looking into. Is there a way of
reloading a class instead? I.e. if all “programs” or “applications”
executed on the platform were implemented as a particular class, perhaps
reloading that class with new content could work?
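Something along these lines is what I have in mind (a hypothetical
sketch; `reload_program` is a made-up helper, and note that live objects
instantiated from the old class keep the old code alive until they are
garbage collected):

```ruby
require 'tmpdir'

# Made-up helper: drop the old class constant and re-read its source file.
def reload_program(const_name, path)
  Object.send(:remove_const, const_name) if Object.const_defined?(const_name)
  load path                      # load (unlike require) always re-reads
  Object.const_get(const_name)
end

# Demonstration with a throwaway "program" file:
path = File.join(Dir.mktmpdir, 'my_app.rb')
File.write(path, 'class MyApp; def self.version; 1; end; end')
v1 = reload_program(:MyApp, path).version

File.write(path, 'class MyApp; def self.version; 2; end; end')
v2 = reload_program(:MyApp, path).version
```

After the second call, `MyApp.version` reflects the new file contents
without restarting the process.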

>> Any ideas on what existing frameworks I could look at? I was
>> wondering about using MagLev, especially to take advantage of storing
>> data in shared memory between processes in an easy way. (i.e. each work
>> process would be a MagLev instance).
>
> Whether to add the complexity of shared memory is something I would
> make dependent on the amount of data that needs to be shared. If
> requests and responses are small, I would certainly not use shmem.
> Also, if the data is read from and stored elsewhere (e.g. an RDBMS),
> I’d probably not bother with shmem.

The advantage is that if one process handles your current request and a
different one handles your next, they need only attach to the shared
memory, without unloading and reloading code or the overhead of
persisting state to a database between requests (or at least that is the
thinking - SAP has obviously perfected this over literally decades).

> Kind regards
>
> robert

Much appreciated, thanks.

> PS: Get well soon to your dog!
>
> PPS: I see Pink does not need to wait any more.

On Dec 6, 2010, at 10:59 AM, Martin C. wrote:

>> Maybe the announced Fairy framework is for you.
>
> Fairy framework?

http://code.google.com/p/fairy-prj/

Gary W.

On Mon, Dec 6, 2010 at 8:08 AM, Martin C. [email protected]
wrote:

> Any ideas on what existing frameworks I could look at? I was
> wondering about using MagLev, especially to take advantage of storing
> data in shared memory between processes in an easy way. (i.e. each work
> process would be a MagLev instance).

This may be a more general question about the ABAP architecture, but why
does it need both a database and shared state between workers?

This may just be an architectural preference on my part, but I prefer my
Ruby processes to be shared-nothing and totally stateless, with all
state stored in the database and only in the database.

If you really need shared state between workers, I’d suggest using JRuby
and having each worker run as a separate thread. JRuby provides
concurrent execution of Ruby code without a global interpreter lock, and
all workers share a heap.

IronRuby also supports this, as does the Rubinius “hydra” branch.
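A thread-per-worker sketch of that arrangement. It runs on any Ruby, but
only on the implementations above do the workers actually execute in
parallel; the `:stop` poison-pill value is a made-up shutdown convention:

```ruby
queue   = Queue.new              # Queue is thread-safe on all Rubies
results = Queue.new

workers = 4.times.map do
  Thread.new do
    while (job = queue.pop) != :stop
      results << job * 2         # stand-in for real work; all threads
    end                          # share the same heap
  end
end

10.times { |i| queue << i }
4.times  { queue << :stop }      # one poison pill per worker
workers.each(&:join)
```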

> Any ideas on what existing frameworks I could look at? I was
> wondering about using MagLev, especially to take advantage of storing
> data in shared memory between processes in an easy way. (i.e. each work
> process would be a MagLev instance).
>
> Any comments or suggestions would be welcome.

I think it’s already written for you:

gem install slave