Intermittent RingNotFound error

I am running a Rails application which launches a series of relatively
long-running scripts, using delayed_job and Rinda to farm out the
work. At the moment I am using a single delayed_job process (rake
jobs:work, as I am stuck on Windows), a ring server process (started
with RingyDingy), and a single service process (pretty ordinary).

These are all running on the same machine (though we have plans to
actually have a ‘farm’ of services on more than one machine).

When we start a run of more than a few scripts (the longest run is about
48 scripts, with a total run time of up to 7 hours), we occasionally see a
script fail with a RingNotFound error because it can't find the ring
server, even though the ring server process is running fine. The next
script almost always finds the server and runs OK.

Anyone have any ideas?

Code follows

worker excerpt: (where the error occurs)
###################################
def run_distributed
  output = 'urk'
  @aFullScript.distrib = true

  DRb.start_service( nil, @aFullScript )

  DRb.start_service

  ring_finger = Rinda::RingFinger.new('127.0.0.1')
  sleep 2
  ring_server = ring_finger.lookup_ring_any
  sleep 2
  log_message(1, "ring server:\n#{ring_server.inspect}", __LINE__)

  service = ring_server.take([:name, :ScriptServer2, nil, nil])
  log_message(1, "service:\n#{service.inspect}", __LINE__)

  server = service[2]
  server.fullScript = @aFullScript
  log_message(1, "server:\n#{server.inspect}", __LINE__)

  begin
    output = server.run
    ring_server.write([:name, service[1], service[2], service[3]])
  rescue
    log_message(3, "In ScriptWorker2.perform #{$!}\n#{@aFullScript.to_yaml}", __LINE__)
  ensure
    return output
  end
end

ring_server.rb:
Dir.chdir('…/vendor/gems/RingyDingy-1.2.1/lib')
require 'rubygems'
require 'ringy_dingy/ring_server'

puts "ring server - waftt #{$$}"
rs = RingyDingy::RingServer.new(:Verbose => true)
rs.run

script_server2.rb:
require 'rubygems'
require 'rinda/ring'
require 'drb'

class ScriptServer2
  include DRbUndumped

  attr_accessor :server_output
  attr_accessor :fullScript

  def initialize
    @fullScript = ''
    @server_output = ''
  end

  def run
    @server_output = ''
    puts "****** Running #{@fullScript.myName} #{Time.now.strftime("%Y%m%d %H%M%S")} (#{@fullScript})"
    @server_output << @fullScript.doit

    puts @server_output

    puts "****** Completed #{@fullScript.myName} #{Time.now.strftime("%Y%m%d %H%M%S")} (#{@fullScript})"
    @server_output
  end

end

DRb.start_service #( nil, ScriptServer2.new )
myPid = Process.pid.to_s
puts "ScriptServer2 #{myPid}"

finger = Rinda::RingFinger.new('127.0.0.1')
ring_server = finger.lookup_ring_any
ring_server.write([:name,
                   :ScriptServer2,
                   ScriptServer2.new,
                   "ScriptServer2 #{myPid}"],
                  Rinda::SimpleRenewer.new)

liveTS = Time.now.strftime("%Y%m%d%H%M%S")
puts "going live… #{liveTS}"
DRb.thread.join

The command file that starts everything up:
start "rails server - waftt" /i /min ruby script/server -p 9191 start
sleep 6
start "ring server - waftt" /i /min ruby script/ring_server.rb
sleep 10
start "script server 1 - waftt" /i /min ruby lib/script_server2.rb
start "rake jobs:work 1 - waftt" /i /min rake jobs:work

Thanks!
pat

Update:
I needed to pass a longer timeout to
Rinda::RingFinger#lookup_ring_any.
After using 30 seconds instead of the default 5 seconds, the error
went away.
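
For anyone who hits the same thing, here is a rough sketch of the change
in the worker. The helper name find_ring_server and the retry count are
made up for illustration and are not part of Rinda; the only Rinda call
involved is lookup_ring_any with an explicit timeout in seconds. As far as
I can tell the lookup failure surfaces as a plain RuntimeError whose
message is 'RingNotFound', so that is what the sketch assumes.

```ruby
# Hypothetical helper (not part of Rinda): look up the ring server with a
# longer timeout, retrying a few times before giving up.
def find_ring_server(finger, timeout = 30, attempts = 3)
  tries = 0
  begin
    tries += 1
    # The timeout argument replaces the 5-second default.
    finger.lookup_ring_any(timeout)
  rescue RuntimeError => e
    # Assumption: a failed lookup raises RuntimeError with the message
    # 'RingNotFound'. Anything else, or the last attempt, is re-raised.
    raise unless e.message == 'RingNotFound' && tries < attempts
    retry
  end
end

# Usage in the worker (needs a running ring server):
#   require 'rinda/ring'
#   DRb.start_service
#   ring_finger = Rinda::RingFinger.new('127.0.0.1')
#   ring_server = find_ring_server(ring_finger)
```

The longer timeout alone was enough in my case; the retry is only a
safety net.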
pat