Forum: Ruby Limit number of concurrent running threads in pool

Posted by Joe Martin (redlightg20)
on 2010-03-04 19:49
Hi
I created a pool of threads (say, 500 threads) to process.  However, due
to the weight of each thread, I want to limit the number of threads that
run concurrently.

So how would I go about putting a limit on the number of threads that
run at any given time?  I would like to take, say, 5 threads from the
pool and run them, and as each one completes, it is removed from the
pool and is replaced with a new thread from the pool.

Could this be done with a "spy" thread, in that it constantly loops to
check how many threads are running at once, and if the number of running
threads falls below the limit of 5, it takes the next thread out of the
pool and runs it?  Not sure how I would go about doing this, pretty new
to multithreading.

Thanks!
Posted by Roger Pack (rogerdpack)
on 2010-03-04 20:05
> I created a pool of threads (say, 500 threads) to process.  However, due
> to the weight of each thread, I want to limit the number of threads that
> run concurrently.
> 
> So how would I go about putting a limit on the number of threads that
> run at any given time?  I would like to take, say, 5 threads from the
> pool and run them, and as each one completes, it is removed from the
> pool and is replaced with a new thread from the pool.

http://github.com/spox/actionpool

might help.
-r
Posted by Joe Martin (redlightg20)
on 2010-03-04 23:13
Roger Pack wrote:
>> I created a pool of threads (say, 500 threads) to process.  However, due
>> to the weight of each thread, I want to limit the number of threads that
>> run concurrently.
>> 
>> So how would I go about putting a limit on the number of threads that
>> run at any given time?  I would like to take, say, 5 threads from the
>> pool and run them, and as each one completes, it is removed from the
>> pool and is replaced with a new thread from the pool.
> 
> http://github.com/spox/actionpool
> 
> might help.
> -r

Thank you very much, Roger.  I found this link 
(http://snippets.dzone.com/posts/show/3276) which it looks like you had 
a part in as well and just got that code working shortly after posting 
this thread.  But looking into ActionPool, it definitely offers expanded 
functionality so I will probably implement that solution instead.

Cheers!
Posted by Robert Klemme (Guest)
on 2010-03-05 13:26
(Received via mailing list)
2010/3/4 Joe Martin <jm202@yahoo.com>:
> check how many threads are running at once, and if the number of running
> threads falls below the limit of 5, it takes the next thread out of the
> pool and runs it?  Not sure how I would go about doing this, pretty new
> to multithreading.

Why do you create a pool much larger than the load you want to accept?
 Usually the pool size is used to limit concurrency.  Actually that is
the main purpose of thread pools.

If you have different tasks for which you want to have different
limits on concurrency you could also create several pools with
different sizes.
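
To illustrate the point about pool size acting as the concurrency cap, here is a rough sketch. The helper name `run_pool` and the `heavy_jobs` / `light_jobs` lists are made up for this example, and `sleep` stands in for real work. It runs each job list on its own fixed number of threads and records the highest number of jobs that were ever active at once, which can never exceed the pool size.

```ruby
require 'thread' # Queue and Mutex; needed on older Rubies, harmless on newer ones

# Hypothetical helper: run every job on at most `size` threads and
# return the highest number of jobs that were active at the same time.
def run_pool(size, jobs)
  queue = Queue.new
  jobs.each { |job| queue << job }
  size.times { queue << nil } # one terminator per worker

  lock   = Mutex.new
  active = 0
  peak   = 0

  workers = Array.new(size) do
    Thread.new do
      # nil ends the loop; anything else is a callable job
      while (job = queue.pop)
        lock.synchronize { active += 1; peak = [peak, active].max }
        job.call
        lock.synchronize { active -= 1 }
      end
    end
  end

  workers.each { |th| th.join }
  peak
end

# Stand-ins for two kinds of task with different limits (assumed values).
heavy_jobs = Array.new(20) { lambda { sleep 0.01 } }
light_jobs = Array.new(20) { lambda { sleep 0.01 } }

heavy_peak = run_pool(3, heavy_jobs) # never more than 3 at once
light_peak = run_pool(8, light_jobs) # never more than 8 at once
```

The cap comes purely from the number of worker threads; no separate supervisor thread is needed to count what is running.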

Kind regards

robert
Posted by Chuck Remes (cremes)
on 2010-03-05 14:00
(Received via mailing list)
On Mar 4, 2010, at 12:49 PM, Joe Martin wrote:

> Could this be done with a "spy" thread, in that it constantly loops to
> check how many threads are running at once, and if the number of running
> threads falls below the limit of 5, it takes the next thread out of the
> pool and runs it?  Not sure how I would go about doing this, pretty new
> to multithreading.

I've had very good success using the Threadz gem.

http://github.com/nanodeath/threadz

It's quite easy to understand and works very well with MRI and JRuby.

cr
Posted by Joe Martin (redlightg20)
on 2010-03-05 14:06
Robert Klemme wrote:
> 2010/3/4 Joe Martin <jm202@yahoo.com>:
>> check how many threads are running at once, and if the number of running
>> threads falls below the limit of 5, it takes the next thread out of the
>> pool and runs it?  Not sure how I would go about doing this, pretty new
>> to multithreading.
> 
> Why do you create a pool much larger than the load you want to accept?
>  Usually the pool size is used to limit concurrency.  Actually that is
> the main purpose of thread pools.
> 
> If you have different tasks for which you want to have different
> limits on concurrency you could also create several pools with
> different sizes.
> 
> Kind regards
> 
> robert

When running this program, I will provide a list of items that need 
processing.  In some cases, this list can be as long as 250 items, in 
other cases well over 50,000.  The processing of each item can take 
anywhere from 15 to 60 seconds per item, so you can see there is a 
benefit to multithreading here.  In processing each item, there are also 
a number of database calls that occur, so I would like to put a cap on 
the number of actively running threads to avoid overwhelming the 
database.  Am I going about this the wrong way?  Is there a more 
efficient, more suitable way of doing this?
Posted by Robert Klemme (Guest)
on 2010-03-05 14:18
(Received via mailing list)
2010/3/5 Joe Martin <jm202@yahoo.com>:
>>
> other cases well over 50,000.  The processing of each item can take
> anywhere from 15 to 60 seconds per item, so you can see there is a
> benefit to multithreading here.  In processing each item, there are also
> a number of database calls that occur, so I would like to put a cap on
> the number of actively running threads to avoid overwhelming the
> database.  Am I going about this the wrong way?  Is there a more
> efficient, more suitable way of doing this?

For this scenario a thread pool with fixed size seems sufficient.

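require 'thread' # Queue lives here on Ruby 1.8/1.9

queue = Queue.new # or a bounded queue (SizedQueue)

# nil is the termination marker; anything else is a work item
def cont(item) !item.nil? end

# fixed number of workers; this is the concurrency limit
threads = (1..10).map do
  Thread.new do
    while (cont(item = queue.deq))
       # .. process
    end
  end
end

queue.enq "Task"

# one terminator per worker so every thread leaves its loop
threads.size.times do
  queue.enq nil # terminate
end

threads.each {|th| th.join}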

As simple as that.
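
For a list of 50,000 items, the "bounded queue" mentioned in the comment might look roughly like the sketch below. The worker count of 5, the queue capacity, and the doubling step used as placeholder processing are all assumptions for illustration. `SizedQueue` makes the producer block while the queue is full, so the whole list is never held in the queue at once.

```ruby
require 'thread' # Queue and SizedQueue; needed on older Rubies, harmless on newer ones

WORKERS = 5            # assumed cap on concurrently running workers
items   = (1..20).to_a # stand-in for the real list of items

queue   = SizedQueue.new(WORKERS * 2) # push blocks while the queue is full
results = Queue.new                   # thread-safe place to collect output

threads = Array.new(WORKERS) do
  Thread.new do
    # nil is the terminator; any other value is a work item
    while (item = queue.pop)
      results << item * 2 # placeholder for the real processing
    end
  end
end

items.each { |item| queue.push(item) } # waits whenever the workers fall behind
WORKERS.times { queue.push(nil) }      # one terminator per worker
threads.each { |th| th.join }

# Drain the results once all workers have finished.
processed = []
processed << results.pop until results.empty?
```

The order of `processed` depends on thread scheduling, so sort it or carry an index along with each item if order matters.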

Kind regards

robert
Posted by Chuck Remes (cremes)
on 2010-03-05 14:49
(Received via mailing list)
On Mar 5, 2010, at 7:06 AM, Joe Martin wrote:

>> 
> other cases well over 50,000.  The processing of each item can take 
> anywhere from 15 to 60 seconds per item, so you can see there is a 
> benefit to multithreading here.  In processing each item, there are also 
> a number of database calls that occur, so I would like to put a cap on 
> the number of actively running threads to avoid overwhelming the 
> database.  Am I going about this the wrong way?  Is there a more 
> efficient, more suitable way of doing this?

The Threadz gem lets you create a thread pool and then wait on its 
completion before you add more to it. This mechanism will help you cap 
the number of threads making database calls.

cr
Posted by Joe Martin (redlightg20)
on 2010-03-05 21:59
Robert Klemme wrote:
> For this scenario a thread pool with fixed size seems sufficient.
> 

Very good.  This works quite nicely as well.

Just wondering, are there any performance benefits of using one method 
over the other?
Posted by Caleb Clausen (Guest)
on 2010-03-05 22:59
(Received via mailing list)
On 3/5/10, Joe Martin <jm202@yahoo.com> wrote:
> Robert Klemme wrote:
>> For this scenario a thread pool with fixed size seems sufficient.
>>
>
> Very good.  This works quite nicely as well.
>
> Just wondering, are there any performance benefits of using one method
> over the other?

Yes. Threads use memory (quite a lot of it, in fact). Mostly this goes
to the thread's stack. Limiting the number of threads saves quite a
bit of memory. I'm not sure there's any improvement in the amount of
CPU time either way, other than perhaps slightly fewer cache misses
resulting from using less memory.
Posted by Robert Klemme (Guest)
on 2010-03-06 15:37
(Received via mailing list)
On 03/05/2010 09:59 PM, Joe Martin wrote:
> Robert Klemme wrote:
>> For this scenario a thread pool with fixed size seems sufficient.
>>
> 
> Very good.  This works quite nicely as well.
> 
> Just wondering, are there any performance benefits of using one method 
> over the other?

Which other method are you referring to?

Kind regards

  robert
Posted by Joe Martin (redlightg20)
on 2010-03-08 16:11
Caleb Clausen wrote:
> Yes. Threads use memory (quite a lot of it, in fact). Mostly this goes
> to the thread's stack. Limiting the number of threads saves quite a
> bit of memory. I'm not sure there's any improvement in the amount of
> CPU time either way, other than perhaps slightly fewer cache misses
> resulting from using less memory.

Thanks!  I figured this was the case.
Posted by Joe Martin (redlightg20)
on 2010-03-08 16:14
Robert Klemme wrote:
> Which other method are you referring to?

Well my original thinking was to create a thread for every single item I 
pass to the program.  Caleb answered this for me in that threads 
themselves are quite heavy, so it further leads me to go with your 
solution which creates a low number of threads and re-uses them until 
the queue is empty.  It seems much simpler and more efficient than my 
original plan.
Posted by Robert Klemme (Guest)
on 2010-03-08 16:41
(Received via mailing list)
2010/3/8 Joe Martin <jm202@yahoo.com>:
> Robert Klemme wrote:
>> Which other method are you referring to?
>
> Well my original thinking was to create a thread for every single item I
> pass to the program.  Caleb answered this for me in that threads
> themselves are quite heavy, so it further leads me to go with your
> solution which creates a low number of threads and re-uses them until
> the queue is empty.  It seems much simpler and more efficient than my
> original plan.

Ah, OK.  Then I was just confused because you had used the term
"thread pool" in your original posting.  That term is commonly used for
exactly the type of solution I posted (fixed or at least limited
number of threads which get their tasks from some form of queue).  You
probably weren't aware of this.

Kind regards

robert