Two Ruby threading questions

codeslinger · February 23, 2006, 6:40pm

Hi all,

I have a number of issues that I’m dealing with regarding Ruby threads
and I was wondering if someone might point me in the right direction
with some or all of them:

1. Is there a commensurate to Java’s setDaemon() functionality for
   Ruby? I’ve looked around on the Web, but I can’t find reference to
   anything and there’s nothing in the docs about it. For those not
   familiar with Java, setDaemon() is equivalent to setting a thread to a
   “detached” state in pthread lingo.
2. Here’s some demonstration code that will help understand this next
   issue:

puma:~> cat a.rb
require 'socket'

class X
  def initialize
    srv = TCPServer.new nil, 8008
    Thread.abort_on_exception = true
    Thread.start srv, &ServerMain
    while true
      # doing main thread stuff in here
      sleep 10
    end
  end

  def handle_sock_data data
    # do something with data here
  end

  ServerMain = lambda do |srv|
    begin
      while true
        s = srv.accept
        Thread.start s, &ConnectionMain
      end
    rescue => e
      puts "#{e.message}:\n\t#{e.backtrace.join("\n\t")}"
    ensure
      srv.close
    end
  end

  ConnectionMain = lambda do |sock|
    begin
      data = sock.gets
      handle_sock_data data
    rescue => e
      puts "#{e.message}:\n\t#{e.backtrace.join("\n\t")}"
      raise e
    ensure
      sock.close rescue nil
    end
  end

end # class X

x = X.new
puma:~>

Here’s what I get when I run this, connect to it and send it a string:

puma:~> ruby a.rb
undefined method `handle_sock_data' for X:Class:
        a.rb:34
        a.rb:22
        a.rb:7:in `initialize'
        a.rb:45
a.rb:34: undefined method `handle_sock_data' for X:Class (NoMethodError)
        from a.rb:22
        from a.rb:7:in `initialize'
        from a.rb:45
puma:~>

Since ServerMain and ConnectionMain are lambdas, I would have expected
them to be able to locate the handle_sock_data() method. Why can’t
they, and is there some way for me to get them to be able to see said
method?

Thanks in advance for any help you can provide.

codeslinger · February 23, 2006, 7:16pm

Never mind about question 2 above. I figured it out. Defining the
lambdas in the previous manner put them in the class scope, not object
instance scope. Here’s how to make that work:

puma:~> cat a.rb
require 'socket'

class X
  def thread_init
    @connection_main = lambda do |sock|
      begin
        data = sock.gets
        handle_sock_data data
      rescue => e
        puts "#{e.message}:\n\t#{e.backtrace.join("\n\t")}"
        raise e
      ensure
        sock.close rescue nil
      end
    end
    @server_main = lambda do |srv|
      begin
        while true
          s = srv.accept
          Thread.start s, &@connection_main
        end
      rescue => e
        puts "#{e.message}:\n\t#{e.backtrace.join("\n\t")}"
      ensure
        srv.close
      end
    end
  end

  def initialize
    srv = TCPServer.new nil, 8008
    Thread.abort_on_exception = true
    thread_init
    Thread.start srv, &@server_main
    while true
      # doing main thread stuff in here
      sleep 5
      puts Thread.list.join(',')
    end
  end

  def handle_sock_data data
    # do something with data here
    puts "called handle_sock_data: '#{data.strip}'"
  end

end # class X

x = X.new
puma:~>

This outputs:

puma:~> ruby a.rb
called handle_sock_data: 'hello'
#<Thread:0xb7d0fe3c>,#<Thread:0xb7d0fd10>,#<Thread:0xb7d1e748>
called handle_sock_data: 'goodbye'
#<Thread:0xb7d0fe3c>,#<Thread:0xb7d1e748>
#<Thread:0xb7d0fe3c>,#<Thread:0xb7d1e748>
#<Thread:0xb7d0fe3c>,#<Thread:0xb7d1e748>
a.rb:37:in `sleep': Interrupt
        from a.rb:37:in `initialize'
        from a.rb:49
puma:~>

I Ctrl-C’d it there at the end.

It would seem that I don’t need to worry about the Thread
detached/setDaemon functionality, either, since the client
@connection_main threads seem to get cleaned up without having to
explicitly join them. Arigatou gozaimasu, Matsumoto-san
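
The scoping difference described in this post can be reproduced in
isolation. This is only a minimal sketch: Greeter, greet and
call_instance_level are hypothetical names made up for illustration, not
part of the code above. A lambda built in the class body captures the
class as self, while one built inside an instance method captures the
instance.

```ruby
class Greeter
  # Built while the class body runs, so `self` inside it is the class Greeter.
  CLASS_LEVEL = lambda { |name| greet(name) }

  def initialize
    # Built inside an instance method, so `self` inside it is this object.
    @instance_level = lambda { |name| greet(name) }
  end

  def greet(name)
    "hello, #{name}"
  end

  def call_instance_level(name)
    @instance_level.call(name)
  end
end

begin
  # greet is an instance method, but the lambda looks it up on the class.
  Greeter::CLASS_LEVEL.call("world")
rescue NoMethodError => e
  puts "class-level lambda failed: #{e.class}"
end

# The instance-level lambda finds greet on the object it was created in.
puts Greeter.new.call_instance_level("world")
```

The first call fails the same way the original a.rb did; the second one
works because the lambda was created with an instance as self.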

codeslinger · February 23, 2006, 7:16pm

codeslinger wrote:

“detached” state in pthread lingo.

No. You will have to terminate them manually on exit.

Thread.start srv, &ServerMain
while true
    # doing main thread stuff in here

Don’t do that. Simply fork off a thread that does the connection
handling and return. Initializers are for initialization but should not
contain the processing.

  while true
ConnectionMain = lambda do |sock|
    a.rb:22
them to be able to locate the handle_sock_data() method. Why can’t
they, and is there some way for me to get them to be able to see said
method?

Your lambdas are class-level constants, but the method is an instance
method. You would rather want one lambda instance per instance of X. But
blocks will do the job as well. Did you check the example for
TCPServer? It’s below this:
http://www.ruby-doc.org/docs/ProgrammingRuby/html/lib_network.html#SOCKSSocket.close

require 'socket'

class X
  attr_reader :thread

  def initialize
    # one acceptor thread; each accepted connection gets its own thread
    @thread = Thread.new( TCPServer.new( nil, 8008 ) ) do |server|
      while ( sess = server.accept )
        Thread.new(sess) do |session|
          # use the block parameter, not the outer loop variable
          handle_data session.gets
        end
      end
    end
  end

  def handle_data(data)
    # …
  end
end

x = X.new

# do other stuff

puts "server started"

# wait for termination

x.thread.join

Take care to synchronize access to shared resources in connection
handling.

Kind regards

robert
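
One way to follow the advice about synchronizing shared resources is to
guard the shared state with a Mutex. The sketch below is an assumption
about what such state might look like: ConnectionStats, record_line and
lines_seen are hypothetical names, and plain threads stand in for the
per-connection handler threads.

```ruby
# Hypothetical shared state that several connection threads update.
class ConnectionStats
  def initialize
    @lock = Mutex.new
    @lines_seen = 0
  end

  # Only one thread at a time can run the block passed to synchronize.
  def record_line
    @lock.synchronize { @lines_seen += 1 }
  end

  def lines_seen
    @lock.synchronize { @lines_seen }
  end
end

stats = ConnectionStats.new

# Stand-ins for connection handler threads, all touching the same object.
handlers = Array.new(10) { Thread.new { 100.times { stats.record_line } } }
handlers.each(&:join)

puts stats.lines_seen
```

Keeping the lock inside the object means callers cannot forget to take
it, which is usually easier to get right than locking at every call site.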

codeslinger · February 23, 2006, 8:14pm

Robert K. wrote:

codeslinger wrote:

Is there a commensurate to Java’s setDaemon() functionality for
Ruby? I’ve looked around on the Web, but I can’t find reference to
anything and there’s nothing in the docs about it. For those not
familiar with Java, setDaemon() is equivalent to setting a thread to a
“detached” state in pthread lingo.

No. You will have to terminate them manually on exit.

If this is true (see my last post), then this is a design flaw in
Ruby’s threading library. This particular application will be
long-running and cannot afford to have dead threads lying around
waiting for exit (which hopefully will never come). I understand that I
can join the threads in the main loop periodically to avoid this fate,
but that is still hackish. Perhaps a Thread#join_on_exit= or
Thread#autojoin= or Thread#detach= method would fit the bill for this?

the processing.
The above code was just an example. That’s not what the real code does
(it’s a port of some Java code), but since it is overwhelmingly large, I
wanted to show just the crux of the problem without all the other code
clouding the issue. My example wasn’t the best Ruby, as you’ve pointed
out.

    end
  end
end

end

Yeah, I am doing something similar in the real code (see below), but I
have to hunt around for a port to listen on, so the exact code above
wouldn’t work for me. Plus, as I’ve said, the code inside the lambdas
is kind of large, and would cloud readability if I put it in the middle
of the initialize function. The real code reads:

def initialize datadir, namenodeaddr
  [...]
  thread_main_init
  ss = nil
  tmpport = 50010
  machinename = DataNode.get_local_ip namenodeaddr[0]
  until ss
    begin
      ss = TCPServer.new machinename, tmpport
      LOG.info "Opened server at #{tmpport}"
    rescue IOError => e
      LOG.info "Could not open server at #{tmpport}, trying new port"
      tmpport += 1
    end
  end
  @localname = "#{machinename}:#{tmpport}"
  @dataxceive_srv = Thread.start ss, &@data_xceive_server
  [...]
end

The @data_xceive_server is one of the lambdas that I am creating in
the thread_main_init() method (the others are for client connection
handling).
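
The port-hunting loop described above can be sketched on its own. One
thing worth noting: when a port is already taken, TCPServer.new raises
Errno::EADDRINUSE, which is a SystemCallError and not an IOError, so
the sketch rescues that class instead. open_server_from and its
max_tries parameter are hypothetical names for illustration, and the
sketch binds to 127.0.0.1 rather than a looked-up machine address.

```ruby
require 'socket'

# Try successive ports starting at first_port until one can be bound.
# Returns the listening server together with the port it ended up on.
def open_server_from(host, first_port, max_tries = 100)
  port = first_port
  max_tries.times do
    begin
      return [TCPServer.new(host, port), port]
    rescue Errno::EADDRINUSE
      # Port is taken by another listener, move on to the next one.
      port += 1
    end
  end
  raise "no free port found starting at #{first_port}"
end

first, first_port = open_server_from('127.0.0.1', 50010)
# Starting the second search at a port known to be busy forces a retry.
second, second_port = open_server_from('127.0.0.1', first_port)
puts "first server on #{first_port}, second on #{second_port}"

first.close
second.close
```

Capping the number of attempts avoids looping forever if the failure is
something a different port cannot fix.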

Take care to synchronize access to shared resources in connection
handling.

I got that. Thanks for all the help, Robert.

codeslinger · February 23, 2006, 10:58pm

[email protected] wrote:

On the other hand, threads are not blithely terminated when some
“master” thread exits. To complete your program, you would want to
signal each thread in an appropriate fashion so that it can finish
up what it’s doing and exit cleanly.

Aren’t other threads terminated?

$ ruby -e 'Thread.new {sleep}; puts "done!"'
done!

But, as you say, cleanly stopping threads (perhaps using ensure blocks)
is a good idea:

$ ruby -e 'Thread.new {begin; sleep; ensure puts "thread done"; end}; puts "done!"'
done!
thread done

codeslinger · February 23, 2006, 10:37pm

Quoting codeslinger:

If this is true (see my last post), then this is a design flaw in
Ruby’s threading library. This particular application will be
long-running and cannot afford to have dead threads lying around
waiting for exit (which hopefully will never come).

If the dead threads aren’t referenced anywhere, the garbage
collector should clean them up. Completed threads don’t need to be
joined before destruction.

On the other hand, threads are not blithely terminated when some
“master” thread exits. To complete your program, you would want to
signal each thread in an appropriate fashion so that it can finish
up what it’s doing and exit cleanly.

-mental
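
The point about completed threads not needing a join can be checked
with a small experiment. This is a sketch, not a statement about the
garbage collector itself: it only shows that threads which have finished
no longer appear in Thread.list, even though nothing ever joined them.
finished_threads_still_listed is a hypothetical helper name.

```ruby
# Starts a few short-lived threads, never joins them, and reports how
# many of them Thread.list still contains once they have finished.
def finished_threads_still_listed(count = 3)
  workers = Array.new(count) { |i| Thread.new { i * 2 } }

  # Wait for completion by polling instead of calling join.
  sleep 0.01 while workers.any?(&:alive?)

  workers.count { |t| Thread.list.include?(t) }
end

puts finished_threads_still_listed
```

This matches the Thread.list output earlier in the thread, where the
connection threads disappeared from the list once they had finished.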

codeslinger · February 24, 2006, 10:35am

codeslinger wrote:

If this is true (see my last post), then this is a design flaw in
Ruby’s threading library. This particular application will be
long-running and cannot afford to have dead threads lying around
waiting for exit (which hopefully will never come). I understand that
I can join the threads in the main loop periodically to avoid this
fate, but that is still hackish. Perhaps a Thread#join_on_exit= or
Thread#autojoin= or Thread#detach= method would fit the bill for this?

Joining won’t help because essentially this prevents the process from
terminating. I was wrong though (see Joel’s posting). Pickaxe II states
in chapter 11 (page 137) that all threads are killed when the main
thread exits. That’s why threaded programs usually end with a loop that
joins over all threads. Somehow this joining became such second nature
that I completely overlooked it here. So to put things straight, in Java
lingo all Ruby threads but the main thread are daemon threads.

I got that. Thanks for all the help, Robert.

You’re welcome - although I misled you on the nature of threads. I’m
sorry for that.

The usual idiom I follow is to create all threads that must do their
work, signal them for termination depending on the nature of the
processing done (e.g. send a terminator message through a queue for a
producer consumer scenario) and have main thread join all threads before
exiting.
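
The idiom described above might look roughly like this for a
producer/consumer setup. It is a sketch under assumptions: STOP and
process_all are hypothetical names, doubling each job stands in for the
real work, and it assumes a Ruby where Queue is available without an
explicit require.

```ruby
# Unique terminator object; workers exit when they pop it.
STOP = Object.new

# Runs the jobs on worker_count threads and returns the sorted results.
def process_all(jobs, worker_count = 3)
  queue = Queue.new
  results = Queue.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        job = queue.pop               # blocks until something is queued
        break if job.equal?(STOP)     # terminator message: finish cleanly
        results << job * 2
      end
    end
  end

  jobs.each { |job| queue << job }
  worker_count.times { queue << STOP }  # one terminator per worker
  workers.each(&:join)                  # main thread waits before exiting

  Array.new(results.size) { results.pop }.sort
end

p process_all([1, 2, 3, 4])
```

Because the terminators are queued after the jobs, every job is picked up
before the workers shut down, and the final join keeps the main thread
from exiting while work is still in flight.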

What kind of application are you building?

Kind regards

robert