Hi,
I think I’m running up against Ruby 1.8.6’s not-so-stellar threading
system. I was hoping someone could confirm this or point out some flaws
in my approach.
Note: I get reasonable performance when running on Ruby 1.9; it’s just
1.8.6 that hangs as if deadlocked when I start using too many threads in
one of my test scripts. (My focus is actually on 1.9 and JRuby anyway.)
To give you an idea: I might get a pool of 10 acceptor threads, each
running something like the following (each thread has its own copy of
this code):
client, client_sockaddr = @socket.accept
# Threads block on #accept.
data = client.recvfrom( 40 )[0].chomp
@mutex.synchronize do
  puts "#{Thread.current} received #{data}... "
end
client.close
All of this is in one script. If I have so much as 2 requester threads
in addition to the 10 acceptors waiting to receive their requests, 1.8.6
just seizes up before processing anything. If I use 2 acceptors and 2
requesters, it works. If I use 10 acceptors and 1 requester, it works.
When it does work, however, it doesn’t appear to schedule threads too
well; it just seems to use one thread all the time, although this seems
to happen only when using sockets as opposed to a more general job
queue.
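For reference, each requester thread does little more than open a
connection and write a short request; roughly like this (a sketch - the
host, port and payload here are placeholders):

require 'socket'

requesters = (1..2).map do |i|
  Thread.new do
    # connect to the listening socket and send one short request
    TCPSocket.open('127.0.0.1', 9000) do |sock|
      sock.puts "request #{i}"
    end
  end
end
requesters.each(&:join)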
I haven’t submitted the full code because it uses
a threadpool library I’m still building/reviewing.
This won’t work. You can have only 1 acceptor thread per server socket.
Typically you dispatch processing after the accept to a thread
(either newly created or taken from a pool).
I have no idea what the interpreter is going to do if you have multiple
threads trying to accept from the same socket. In the best case #accept
is synchronized and only one thread gets to enter it. In worse
scenarios, anything bad may happen.
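To illustrate the "newly created" variant (a minimal sketch; the
address, port and handling code below are only placeholders):

require 'socket'

server = TCPServer.new('127.0.0.1', 9000)
loop do
  client = server.accept            # a single thread accepts...
  Thread.new(client) do |cl|        # ...and hands the connection off
    begin
      data = cl.gets.to_s.chomp
      puts "#{Thread.current} received #{data}..."
    ensure
      cl.close
    end
  end
end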
I wanted to create a barrage of requests so next I
create a pool of requester threads which each run
something like this:
I haven’t submitted the full code because it uses
a threadpool library I’m still building/reviewing.
I would rather do something like this (sketched):

require 'thread'

queue  = Queue.new
@mutex = Mutex.new

workers = (1..10).map do
  Thread.new queue do |q|
    until (cl = q.deq).equal? q
      # process data from / for client cl
      begin
        data = cl.gets.chomp
        @mutex.synchronize do
          puts "#{Thread.current} received #{data}..."
        end
      ensure
        cl.close
      end
    end
  end
end
This won’t work. You can have only 1 acceptor thread per server socket.
Typically you dispatch processing after the accept to a thread
(either newly created or taken from a pool).
I have no idea what the interpreter is going to do if you have multiple
threads trying to accept from the same socket. In the best case #accept
is synchronized and only one thread gets to enter it. In worse
scenarios, anything bad may happen.
Ok, I wasn’t sure if it was appropriate having >1 thread per socket
instance. It appears to work ok on ruby 1.9 up to about 100 socket
connections - not that that means anything when it comes to testing
stuff with threads. Maybe if I do 100,000+ I might elicit some type of
error.
I was intending to process the result of accept in another pool, but I
was toying with the idea of having 2-3 threads waiting on #accept,
assuming no synchronisation issues. I didn’t know if it really mattered
or not. It might make a difference if you have a large number of
connections coming in, depending on what else the acceptor is doing; I
wasn’t sure.
I guess I’ll have to scupper that idea or exhaustively test it to prove
it works and has benefit - both of which are questionable at this point.
I wanted to create a barrage of requests so next I
create a pool of requester threads which each run
something like this:
Yeah, I was going to; I was just going off some examples in the
documentation, trying to cut my teeth on them and writing some tests.
But I was heading that way.
See above.
I haven’t submitted the full code because it uses
a threadpool library I’m still building/reviewing.
I would rather do something like this (sketched):

require 'thread'

queue  = Queue.new
@mutex = Mutex.new

workers = (1..10).map do
  Thread.new queue do |q|
    until (cl = q.deq).equal? q
      # process data from / for client cl
      begin
        data = cl.gets.chomp
        @mutex.synchronize do
          puts "#{Thread.current} received #{data}..."
        end
      ensure
        cl.close
      end
    end
  end
end
server = TCPServer.new …
while client = server.accept
  queue.enq client
end

elsewhere

TCPSocket.open do |sock|
  sock.puts "request"
end
Thanks for the example.
I am scratching my head a little with this line:
until (cl = q.deq).equal? q
Ok, I wasn’t sure if it was appropriate having >1 thread per socket
[...]
addition; I wasn’t sure.
I guess I’ll have to scupper that idea or exhaustively test it to prove
it works and has benefit - both of which are questionable at this point.
Frankly, I wouldn’t invest that effort: every example in all programming
languages I have seen has just a single acceptor thread. Accepting
socket connections is not an expensive operation, so as long as you
refrain from further processing, a single thread is completely
sufficient for handling accepts.
I am scratching my head a little with this line:
until (cl = q.deq).equal? q
I’m familiar with Queue and its behaviour.
That’s the worker thread termination code, which basically works by
checking whether the item fetched from the Queue is the Queue instance
itself. Actually I omitted the other half of the code (the place which
pushes the queue instance into itself) because I didn’t want to make the
code more complex, and also the termination condition was unknown (it
may be a signal, a number of handled connections, etc.).
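For completeness, the omitted half might look roughly like this (a
sketch reusing the queue and workers names from the code above):

# wake every worker with the sentinel (the queue itself), then wait
workers.size.times { queue.enq queue }
workers.each(&:join)

Each worker consumes exactly one sentinel, so one enq per worker shuts
the whole pool down.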
If you want to make termination more readable you can also do something
like this
QueueTermination = Object.new
…
until QueueTermination.equal?(cl = q.deq)
…
end
or
until QueueTermination == (cl = q.deq)
…
end
or
until QueueTermination === (cl = q.deq)
…
end
The basic idea is to stuff something into the queue which is
unambiguously identifiable as non-work content.
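With an explicit sentinel like that, the shutdown side is the same idea
(sketch):

workers.size.times { queue.enq QueueTermination }
workers.each(&:join)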
Frankly, I wouldn’t invest that effort: every example in all programming
languages I have seen has just a single acceptor thread.
…or else serializes them so that only one thread accept()s at a time.
For a proper example look at Apache with preforked workers, and the
AcceptMutex directive. http://httpd.apache.org/docs/2.0/mod/mpm_common.html
You could try the same approach, and use a ruby Mutex to protect your
socket#accept - but that could turn out to be more expensive than having
a single accept thread which dispatches to your worker pool, if you’re
going to have a separate worker pool anyway.
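In Ruby, that AcceptMutex-style serialization might look roughly like
this (an untested sketch; the server, queue and pool size are just
assumptions):

require 'socket'
require 'thread'

server       = TCPServer.new('127.0.0.1', 9000)
queue        = Queue.new
accept_mutex = Mutex.new

acceptors = (1..3).map do
  Thread.new do
    loop do
      # only one acceptor at a time may call #accept
      client = accept_mutex.synchronize { server.accept }
      queue.enq client
    end
  end
end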
Frankly, I wouldn’t invest that effort: every example in all programming
languages I have seen has just a single acceptor thread.
…or else serializes them so that only one thread accept()s at a time.
For a proper example look at Apache with preforked workers, and the
AcceptMutex directive. http://httpd.apache.org/docs/2.0/mod/mpm_common.html
Cool. Didn’t even think to look at what the big boys do.
Thanks for the pointer.
You could try the same approach, and use a ruby Mutex to protect your
socket#accept - but that could turn out to be more expensive than having
a single accept thread which dispatches to your worker pool, if you’re
going to have a separate worker pool anyway.
Yeah, I have a worker pool. I was sort of extrapolating from that and
having an acceptor pool based around the socket in addition to the
worker pool.
I don’t have a lot of experience with heavy traffic; but the (naive)
motivation for this whole thing was to have one acceptor thread
receiving while the other was pushing on the queue and then swapping
over and over[1] – at least to allow people to experiment with that
sort of thing if they wanted to. But synchronisation issues with the
extra thread might make things worse. I’m used to trying out duff ideas
so heck maybe I might take a look at it at some point - if only to get a
better feel for what’s going on at that level.
Cheers,
Daniel B.
[1] actually, I naively wanted all the threads to block on the socket
just like they would on a queue. oh well.
Ok, I wasn’t sure if it was appropriate having >1 thread per socket
[...]
addition; I wasn’t sure.
I guess I’ll have to scupper that idea or exhaustively test it to prove
it works and has benefit - both of which are questionable at this point.
Frankly, I wouldn’t invest that effort: every example in all programming
languages I have seen has just a single acceptor thread. Accepting
socket connections is not an expensive operation, so as long as you
refrain from further processing, a single thread is completely
sufficient for handling accepts.
I am scratching my head a little with this line:
until (cl = q.deq).equal? q
I’m familiar with Queue and its behaviour.
That’s the worker thread termination code, which basically works by
checking whether the item fetched from the Queue is the Queue instance
itself. Actually I omitted the other half of the code (the place which
pushes the queue instance into itself) because I didn’t want to make the
code more complex, and also the termination condition was unknown (it
may be a signal, a number of handled connections, etc.).
Ok, that’s cool. I was pushing termination jobs on the thing I was
playing with although what you’re doing there might be cleaner!
motivation for this whole thing was to have one acceptor thread
receiving while the other was pushing on the queue and then swapping
over and over[1]
You need to synchronize anyway (at least on the queue) so adding another
synchronization point (at accept) won’t gain you much I guess. As Brian
said, the effect can be the opposite - and nobody seems to do it anyway.
As said, accepting connections is a pretty cheap operation.
[1] actually, I naively wanted all the threads to block on the socket
just like they would on a queue. oh well.
You should also note that the network layer has its own queue at the
socket (you can control its size as well). So even if a single thread
were temporarily not sufficient, connection requests are not necessarily
rejected. Basically you have two levels of queuing: the kernel’s queue
at the socket and your own work queue.
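That kernel-side queue is just the listen backlog; as a sketch (address,
port and backlog size are made up):

require 'socket'

server = TCPServer.new('127.0.0.1', 9000)
server.listen(128)  # pending connections wait here until some thread calls #accept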
I don’t have a lot of experience with heavy traffic; but the (naive)
motivation for this whole thing was to have one acceptor thread
receiving while the other was pushing on the queue and then swapping
over and over[1] – at least to allow people to experiment with that
sort of thing if they wanted to. But synchronisation issues with the
extra thread might make things worse. I’m used to trying out duff ideas
so heck maybe I might take a look at it at some point - if only to get a
better feel for what’s going on at that level.
You might look at an event framework like EventMachine or my own Rev
(http://rev.rubyforge.org/) as a less error-prone and higher-performance
alternative to threads.
The disadvantage of this approach is the need to invert control (event
frameworks are asynchronous); however, it will resolve the
synchronization issues.
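A minimal EventMachine-style sketch, just to show the inverted control
flow (the handler module and port are made up):

require 'eventmachine'

module RequestHandler
  # called by the reactor whenever data arrives; no threads, no mutexes
  def receive_data(data)
    puts "received #{data.chomp}..."
    close_connection_after_writing
  end
end

EM.run do
  EM.start_server('127.0.0.1', 9000, RequestHandler)
end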