jobQueue 1.0.1 - Running stuff with a user defined number of threads

dubstep · December 12, 2011, 11:25am

Hi!
JobQueue is a Ruby gem that allows you to run Ruby methods and system
commands in parallel. It comes with an executable, prun.rb, which takes
two arguments: the number of worker threads and a shell script. Each line
of the script is executed in parallel, using at most the given number of
threads.
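
To illustrate the idea of running the lines of a script on a fixed number of threads, here is a minimal sketch. It is not prun.rb's actual code: the helper name `run_commands` and the file name in the usage comment are made up for this example, and it assumes each line is a complete command.

```ruby
# Sketch only: run_commands is a hypothetical helper, not part of jobQueue.
def run_commands(commands, workers)
  pending = Queue.new
  commands.each { |cmd| pending << cmd }
  finished = Queue.new

  # Start exactly `workers` threads; each one pulls commands until none are left.
  threads = Array.new(workers) do
    Thread.new do
      loop do
        begin
          cmd = pending.pop(true) # non-blocking; raises ThreadError once empty
        rescue ThreadError
          break
        end
        finished << [cmd, system(cmd)] # system returns true, false or nil
      end
    end
  end

  threads.each(&:join)
  Array.new(finished.size) { finished.pop }
end

# Possible use with a script file (the file name is made up):
# commands = File.readlines("jobs.sh", chomp: true).reject { |l| l.strip.empty? }
# run_commands(commands, 4)
```

The result pairs come back in completion order, which need not match the order of the lines in the script.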

Thanks to Robert K., who provided some implementation ideas some
years ago.

Hints for improvements and any type of criticism are very welcome! Hope
you’ll find it useful.

doc: http://rubydoc.info/gems/jobQueue/frames
dev: GitHub - Try2Code/jobQueue: Parallelize Ruby things and shell jobs on a defined number of threads
tests:
https://github.com/Try2Code/jobQueue/blob/master/test/test_jobqueue.rb

Ralf_M · December 12, 2011, 1:23pm

On Mon, Dec 12, 2011 at 11:25 AM, Ralf M. wrote:

Hi!
JobQueue is a Ruby gem that allows you to run Ruby methods and system
commands in parallel. It comes with an executable, prun.rb, which takes two
arguments: the number of worker threads and a shell script. Each line of the
script is executed in parallel, using at most the given number of threads.

Thanks to Robert K., who provided some implementation ideas some years
ago.

You’re welcome!

Hints for improvements and any type of criticism are very welcome! Hope
you’ll find it useful.

doc: File: README — Documentation for jobQueue (1.0.11)

I would change the API slightly to modify behavior of #push:

push(obj, method, *arguments)
push(&block)

Example

# send method
jq.push($stdout, :puts, "hello world")

# use block
jq.push do
  puts "hello world"
end

One could even extend behavior by providing a back channel for results:

jq.push do |back_channel|
  back_channel << (1 + complicated_calculation() * 123)
end

For that of course you must define how reply values are dealt with
(there could be a null back channel which just discards results if
configured that way). Alternatively however just the result values of
method and block invocation could be used. Maybe that’s cleaner.
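
As a rough sketch of the second option (using the return values of the method or block as results), the two proposed forms of #push could look like the following. `SketchQueue` and its `run` method are names invented for this example. It does not reflect how the jobQueue gem is implemented, and it omits error handling for jobs that raise.

```ruby
# Illustrative only: SketchQueue is a made-up class, not the gem's JobQueue.
class SketchQueue
  def initialize(workers)
    @workers = workers
    @jobs = Queue.new
    @results = Queue.new
  end

  # Accepts either push(obj, meth, *arguments) or push(&block).
  def push(obj = nil, meth = nil, *arguments, &block)
    @jobs << (block || -> { obj.public_send(meth, *arguments) })
    self
  end

  # Runs all queued jobs on at most @workers threads and returns their
  # return values in completion order (not necessarily push order).
  def run
    threads = Array.new(@workers) do
      Thread.new do
        loop do
          begin
            job = @jobs.pop(true) # non-blocking; raises ThreadError once empty
          rescue ThreadError
            break
          end
          @results << job.call
        end
      end
    end
    threads.each(&:join)
    Array.new(@results.size) { @results.pop }
  end
end

jq = SketchQueue.new(2)
jq.push([3, 1, 2], :sort)     # send method
jq.push { 1 + 2 }             # use block
results = jq.run
```

With this design no explicit back channel object is needed. The trade-off is that a job can only report one value, and only when it finishes.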

Also I would separate support for the call of system: Basically
invoking system is a special case which does not necessarily have
something to do with job queues in general. So a better solution
would be to have a specialized job queue, e.g.

class SystemJobs
  attr_reader :jq

  def initialize(jq)
    @jq = jq
  end

  def push(*args)
    jq.push do |back_channel|
      back_channel << system(*args)
      # we could use a variant of IO.popen here as well which
      # captures output
    end
  end
end

You then could still do the pretty short

sj.jq.push($stdout, :puts, "hello world")
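
The comment in SystemJobs#push mentions capturing the command's output instead of only its exit status. One way to do that with the standard library is Open3.capture2e, which is used here in place of IO.popen. This is a sketch under that assumption, and `run_and_capture` is a hypothetical helper name.

```ruby
require "open3"

# Hypothetical helper: runs a command and returns [combined_output, success].
# Open3.capture2e merges stdout and stderr into one string.
def run_and_capture(*args)
  output, status = Open3.capture2e(*args)
  [output, status.success?]
end

# Inside a job block, the pair could be sent to the back channel
# instead of the plain true/false that system returns.
output, ok = run_and_capture("echo", "hello world")
```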

dev: GitHub - Try2Code/jobQueue: Parallelize Ruby things and shell jobs on a defined number of threads
tests:
jobQueue/test/test_jobqueue.rb at master · Try2Code/jobQueue · GitHub

Kind regards

robert

Ralf_M · December 12, 2011, 10:34pm

On Mon, Dec 12, 2011 at 5:18 PM, Ralf M. wrote:

under a different name.
Or you just add #push_all(enum).

Hm. It’s definitely good for testing. Could you imagine a “real” use case for
this? Maybe parallel image processing or database requests.

Well, any farmer worker scenario where you need a single instance
composing all results into something complete. Also, finding out that
all workers are finished could be viewed as one way of evaluating
results, too.
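
A small sketch of such a farmer-worker setup, using plain threads rather than jobQueue: each worker sums one slice and the farmer combines the partial results. The method name `parallel_sum` is invented for this example. Note that Thread#value waits for the thread to finish, so collecting the results also tells you that all workers are done.

```ruby
# Sketch only: parallel_sum is a hypothetical helper, not part of jobQueue.
def parallel_sum(numbers, workers)
  # Split the input into at most `workers` slices (slice size is at least 1).
  slice_size = [(numbers.size / workers.to_f).ceil, 1].max
  threads = numbers.each_slice(slice_size).map do |slice|
    Thread.new { slice.sum } # worker: returns its partial result
  end
  # Farmer: Thread#value joins each worker and returns its block's value.
  threads.map(&:value).sum
end

total = parallel_sum((1..100).to_a, 4)
```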

end
You then could still do the pretty short

sj.jq.push($stdout, :puts, "hello world")

This would simplify the task handling in the JobQueue class. Running system
commands and calling ruby methods in the same queue should not be the
regular case.

My main argument would be separation of concerns. Your basic JobQueue
is responsible only for executing tasks in concurrent threads.
Executing system commands is a special case which would be of no use
for someone who just needs concurrent execution in the current
process.

Many thanks!

You’re welcome!

Kind regards

robert

Ralf_M · December 13, 2011, 2:03am

Just curious, have you seen girl_friday?

http://mperham.github.com/girl_friday/

Ralf_M · December 14, 2011, 9:12am

On 12/13/2011 02:02 AM, Tony A. wrote:

Just curious, have you seen girl_friday?

http://mperham.github.com/girl_friday/
No, I didn’t. It seems to have a different interface concept, but it looks
interesting. It seems to have a
back_channel and accepts blocks. But I cannot figure out how to limit
the number of threads. That’s one of
the main features of jobQueue, because it allows you to control how much
parallelism you really want.

Ralf_M · December 14, 2011, 9:30am

On 12/12/2011 10:31 PM, Robert K. wrote:

Or you just add #push_all(enum).
I skipped this for the moment, but the current 1.0.2 release has
implemented the straightforward interface you
suggested. I added a subclass of JobQueue for processing system
commands. A back_channel and blocks are still
to come.

regards
ralf

Ralf_M · December 12, 2011, 5:18pm

On 12/12/2011 01:23 PM, Robert K. wrote:

I would change the API slightly to modify behavior of #push:

push(obj, method, *arguments)
This was my first interface idea, too. But I skipped it to be able to
push a whole list of items at once. But
it really lacks readability - that’s for sure. I guess I will let the
user do the iteration over the items to
push in favour of having a beautiful interface. I could keep the old
push interface under a different name.

push(&block)
A very good point! I’ve neglected blocks totally (shame on me).
puts "hello world"
end
Will add these to my unit tests.
method and block invocation could be used. Maybe that’s cleaner.
Hm. It’s definitely good for testing. Could you imagine a “real” use case
for this? Maybe parallel image
processing or database requests.
def initialize(jq)
end

You then could still do the pretty short

sj.jq.push($stdout, :puts, "hello world")
This would simplify the task handling in the JobQueue class. Running
system commands and calling ruby methods
in the same queue should not be the regular case.

Many thanks!
ralf