Limit number of concurrent running threads in pool

Hi
I created a pool of threads (say, 500 threads) to process. However, due
to the weight of each thread, I want to limit the number of threads that
run concurrently.

So how would I go about putting a limit on the number of threads that
run at any given time? I would like to take, say, 5 threads from the
pool and run them, and as each one completes, it is removed from the
pool and is replaced with a new thread from the pool.

Could this be done with a “spy” thread, in that it constantly loops to
check how many threads are running at once, and if the number of running
threads falls below the limit of 5, it takes the next thread out of the
pool and runs it? Not sure how I would go about doing this, pretty new
to multithreading.

Thanks!

Roger P. wrote:

I created a pool of threads (say, 500 threads) to process. However, due
to the weight of each thread, I want to limit the number of threads that
run concurrently.

So how would I go about putting a limit on the number of threads that
run at any given time? I would like to take, say, 5 threads from the
pool and run them, and as each one completes, it is removed from the
pool and is replaced with a new thread from the pool.

GitHub - spox/actionpool: Easy thread pooling

might help.
-r

Thank you very much, Roger. I found this link
(http://snippets.dzone.com/posts/show/3276), which it looks like you had
a part in as well, and I got that code working shortly after posting
this thread. But looking into ActionPool, it definitely offers expanded
functionality, so I will probably implement that solution instead.

Cheers!

2010/3/4 Joe M. [email protected]:

check how many threads are running at once, and if the number of running
threads falls below the limit of 5, it takes the next thread out of the
pool and runs it? Not sure how I would go about doing this, pretty new
to multithreading.

Why do you create a pool much larger than the load you want to accept?
Usually the pool size is used to limit concurrency. Actually that is
the main purpose of thread pools.

If you have different tasks for which you want to have different
limits on concurrency you could also create several pools with
different sizes.
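For example, here is a minimal sketch of that idea using nothing but the
standard library (the pool sizes and the task bodies are arbitrary
placeholders, not your actual workload):

require 'thread'

# Build a fixed-size pool of workers that pull tasks (lambdas) off a queue.
def make_pool(size, queue)
  (1..size).map do
    Thread.new do
      while (task = queue.deq)   # a nil in the queue stops the worker
        task.call
      end
    end
  end
end

db_queue    = Queue.new
light_queue = Queue.new

db_pool    = make_pool(5,  db_queue)    # heavy, database-bound tasks: low concurrency
light_pool = make_pool(20, light_queue) # cheap tasks can run much wider

db_queue.enq    lambda { sleep 1 }           # stand-in for a heavy task
light_queue.enq lambda { puts "light task" } # stand-in for a cheap task

{ db_queue => db_pool, light_queue => light_pool }.each do |q, pool|
  pool.size.times { q.enq nil }   # tell every worker in this pool to stop
  pool.each { |th| th.join }
end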

Kind regards

robert

On Mar 4, 2010, at 12:49 PM, Joe M. wrote:

Could this be done with a “spy” thread, in that it constantly loops to
check how many threads are running at once, and if the number of running
threads falls below the limit of 5, it takes the next thread out of the
pool and runs it? Not sure how I would go about doing this, pretty new
to multithreading.

I’ve had very good success using the Threadz gem.

It’s quite easy to understand and works very well with MRI and JRuby.

cr

2010/3/5 Joe M. [email protected]:

other cases well over 50,000. The processing of each item can take
anywhere from 15 to 60 seconds per item, so you can see there is a
benefit to multithreading here. In processing each item, there are also
a number of database calls that occur, so I would like to put a cap on
the number of actively running threads to avoid overwhelming the
database. Am I going about this the wrong way? Is there a more
efficient, more suitable way of doing this?

For this scenario a thread pool with fixed size seems sufficient.

require 'thread'

queue = Queue.new # or SizedQueue.new(n) for a bounded queue

def cont(item); !item.nil?; end

threads = (1..10).map do
  Thread.new do
    while cont(item = queue.deq)
      # ... process item
    end
  end
end

queue.enq "Task" # enqueue the work items here

threads.size.times do
  queue.enq nil # terminate the workers
end

threads.each { |th| th.join }

As simple as that.
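And if the item list is very long (the 50,000 case), a SizedQueue bounds
how much sits in the queue at once; it is the same pattern, just with a
bounded queue (the limit of 100 and the worker count of 10 are arbitrary
choices here):

require 'thread'

queue = SizedQueue.new(100)   # enq blocks once 100 items are waiting

threads = (1..10).map do
  Thread.new do
    while (item = queue.deq)  # a nil terminates the worker
      # ... process item
    end
  end
end

(1..50_000).each { |i| queue.enq "item #{i}" }  # producer never holds more than 100 queued
threads.size.times { queue.enq nil }            # terminate the workers
threads.each { |th| th.join }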

Kind regards

robert

Robert K. wrote:

2010/3/4 Joe M. [email protected]:

check how many threads are running at once, and if the number of running
threads falls below the limit of 5, it takes the next thread out of the
pool and runs it? Not sure how I would go about doing this, pretty new
to multithreading.

Why do you create a pool much larger than the load you want to accept?
Usually the pool size is used to limit concurrency. Actually that is
the main purpose of thread pools.

If you have different tasks for which you want to have different
limits on concurrency you could also create several pools with
different sizes.

Kind regards

robert

When running this program, I will provide a list of items that need
processing. In some cases, this list can be as long as 250 items, in
other cases well over 50,000. The processing of each item can take
anywhere from 15 to 60 seconds per item, so you can see there is a
benefit to multithreading here. In processing each item, there are also
a number of database calls that occur, so I would like to put a cap on
the number of actively running threads to avoid overwhelming the
database. Am I going about this the wrong way? Is there a more
efficient, more suitable way of doing this?

On Mar 5, 2010, at 7:06 AM, Joe M. wrote:

other cases well over 50,000. The processing of each item can take
anywhere from 15 to 60 seconds per item, so you can see there is a
benefit to multithreading here. In processing each item, there are also
a number of database calls that occur, so I would like to put a cap on
the number of actively running threads to avoid overwhelming the
database. Am I going about this the wrong way? Is there a more
efficient, more suitable way of doing this?

The Threadz gem lets you create a thread pool and then wait on its
completion before you add more to it. This mechanism will help you cap
the number of threads making database calls.
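Roughly, it looks like this; I'm going from memory of the Threadz README,
so treat the `new_batch` / `wait_until_done` names and the batch size of 5
as assumptions to check against the gem's docs:

require 'threadz'

pool = Threadz::ThreadPool.new

work_items = (1..20).map { |i| "item #{i}" }   # stand-in for the real item list

work_items.each_slice(5) do |slice|            # never more than 5 database jobs in flight
  batch = pool.new_batch
  slice.each do |item|
    batch << lambda { puts "processing #{item}"; sleep(rand) }  # placeholder for the per-item work and database calls
  end
  batch.wait_until_done                        # block here before queuing the next slice
end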

cr

On 3/5/10, Joe M. [email protected] wrote:

Robert K. wrote:

For this scenario a thread pool with fixed size seems sufficient.

Very good. This works quite nicely as well.

Just wondering, are there any performance benefits of using one method
over the other?

Yes. Threads use memory (quite a lot of it, in fact). Most of it goes
to the thread's stack. Limiting the number of threads saves quite a
bit of memory. I'm not sure there's any improvement in the amount of
CPU time either way, other than perhaps fewer cache misses
resulting from using less memory.

On 03/05/2010 09:59 PM, Joe M. wrote:

Robert K. wrote:

For this scenario a thread pool with fixed size seems sufficient.

Very good. This works quite nicely as well.

Just wondering, are there any performance benefits of using one method
over the other?

Which other method are you referring to?

Kind regards

robert

Robert K. wrote:

For this scenario a thread pool with fixed size seems sufficient.

Very good. This works quite nicely as well.

Just wondering, are there any performance benefits of using one method
over the other?

Robert K. wrote:

Which other method are you referring to?

Well, my original thinking was to create a thread for every single item I
pass to the program. Caleb answered this for me: threads
themselves are quite heavy, so that further leads me to go with your
solution, which creates a small number of threads and re-uses them until
the queue is empty. It seems much simpler and more efficient than my
original plan.

Caleb C. wrote:

Yes. Threads use memory (quite a lot of it, in fact). Most of it goes
to the thread's stack. Limiting the number of threads saves quite a
bit of memory. I'm not sure there's any improvement in the amount of
CPU time either way, other than perhaps fewer cache misses
resulting from using less memory.

Thanks! I figured this was the case.

2010/3/8 Joe M. [email protected]:

Robert K. wrote:

Which other method are you referring to?

Well, my original thinking was to create a thread for every single item I
pass to the program. Caleb answered this for me: threads
themselves are quite heavy, so that further leads me to go with your
solution, which creates a small number of threads and re-uses them until
the queue is empty. It seems much simpler and more efficient than my
original plan.

Ah, OK. Then I was just confused because you had used the term
“thread pool” in your original posting. That term is commonly used for
exactly the type of solution I posted (fixed or at least limited
number of threads which get their tasks from some form of queue). You
probably weren’t aware of this.

Kind regards

robert