Requests that make calls from XMLRPC::Client tie up Mongrel

Hi Everyone,

I’m running a Rails site on Mongrel and I can’t figure out why one
particular type of request ties Mongrel up so easily. The requests that
tie up Mongrel call an XML-RPC server like so:

In the controller

def site_start
  if params[:id]
    @site = Site.find(params[:id])
    @site.site_start
    render :json => { :success => true }
  end
end

In the model

def site_start
  RpcTask.manage(self, 'start')
end

In RpcTask

def manage(site, task)
  run('manage_task', {
    :site => site.name,
    :site_id => site.id,
    :task => task
  })
end

which calls

def run(task, task_params = {})
  begin
    server = XMLRPC::Client.new2('http://localhost:9192/')
    result = server.call_async(task, task_params)
    return result
  rescue XMLRPC::FaultException => err
    logger = ActiveRecord::Base.logger
    logger.error(err.faultCode)
    logger.error(err.faultString)
  end
  false
end
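One note on that run method: as far as I know, call_async only opens a fresh connection per call rather than reusing a keep-alive connection; it still blocks the calling thread until the server responds. A hypothetical sketch (not from the original code) of guarding any blocking RPC with a hard timeout, so a slow server can’t hold the Mongrel worker for the whole request; the 2-second budget and the run_with_timeout name are assumptions:

```ruby
require "timeout"

# Hypothetical sketch: wrap a blocking RPC in a hard timeout so a slow
# server can't tie up the Mongrel worker indefinitely. The 2-second
# budget is an assumption; tune it to your task runtimes.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  false  # treat an overdue RPC as a failure, as run() already does
end

run_with_timeout(2) { :rpc_result }    # => :rpc_result
run_with_timeout(0.05) { sleep 1 }     # => false
```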

If I call the model method directly from the console, the RPC side
responds very quickly:

start = Time.now; Site.first.site_start; (Time.now - start).to_s
=> "0.493253"

Removing the body of RpcTask.run results in comparable performance.
Also, by switching from a single mongrel to a four-mongrel cluster, I
was able to get these actions to perform acceptably, but I imagine
that I’m doing something very wrong here to require that much power. Any
ideas?

Thanks,
Thomas A.

Do the RpcTask task methods end up using ruby threads to do their
work? That call_async method definitely sounds suspiciously like it
might.

I’ve found that ruby threads under mongrel (although I don’t think it’s
necessarily a mongrel-specific issue) sometimes block when you wouldn’t
expect them to, or sit in a wait state for far longer than it seems like
they should.

When I have actual control over my ruby threads, I’ve found that
explicitly setting the priority of ‘background’ threads to be
lower than 0 generally frees things up. If RpcTask is creating threads
and you don’t want to hack its code to set thread priorities… is
there a synchronous method you can use instead of call_async to make
your rpc?
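For what it’s worth, the priority trick looks something like this (a generic sketch of the idea, not RpcTask’s actual internals; the worker name and the :done result are made up):

```ruby
# Generic sketch of the priority idea: drop a background thread below
# the default priority of 0 so the request-handling thread isn't
# starved while the background work runs.
worker = Thread.new do
  Thread.current.priority = -1   # run behind the main (priority 0) thread
  # ... background RPC work would happen here ...
  :done
end
worker.value  # value joins the thread and returns its result, :done
```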

Jonathan

Hi Jonathan,

I thought that maybe using ‘call_async’ rather than simply ‘call’
might improve the situation, but the behavior is the same with either
call.

Thomas A.

Cool, glad you solved it. Threading in ruby has all sorts of gotchas.
I can’t explain why your new way works (it seems odd to me), but it
does seem to confirm that something about threading was causing your
problems. Out of curiosity, how was RpcTask.run defined before? Did it
already involve threads?

Might be interesting to try your code under Passenger instead of Mongrel
and see if the same problem occurs. I am trying to switch all my stuff
from mongrel to passenger; mongrel’s continued development seems…
not something I am confident in.

Jonathan

Thanks for suggesting that the problem may be related to threading. At
least on this Debian box, changing RpcTask.run to the following seems
to do the trick:

def run(task, task_params = {})
  Thread.new {
    server = XMLRPC::Client.new2('http://localhost:9192/')
    server.call(task, task_params)
  }.value
end
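One caveat with this version: Thread#value joins the thread, so from the request’s point of view it is still a synchronous call; it only moves the blocking I/O into a child thread. A quick self-contained demonstration, with sleep standing in for the RPC round trip:

```ruby
# Thread.new { ... }.value joins the thread, so the caller still waits
# for the full duration; the blocking work has merely moved into a
# child thread. Here sleep stands in for the RPC round trip.
t0 = Time.now
result = Thread.new { sleep 0.2; :rpc_result }.value
elapsed = Time.now - t0   # roughly 0.2s: the caller waited the whole time
```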

Thomas

On Wed, Nov 18, 2009 at 2:13 PM, Thomas A.

Then again, this test (against a single mongrel started with the
production env) demonstrates that the site can handle a great number
of these logging requests:

% siege
http://localhost:9191/sites/log?site_id=1&msg=Memcached%3A+Starting&levelname=INFO
** SIEGE 2.68
** Preparing 10 concurrent users for battle.
The server is now under siege…
Lifting the server siege… done.
Transactions: 3362 hits
Availability: 100.00 %
Elapsed time: 19.64 secs
Data transferred: 0.15 MB
Response time: 0.06 secs
Transaction rate: 171.18 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 9.99
Successful transactions: 3362
Failed transactions: 0
Longest transaction: 0.18
Shortest transaction: 0.03

Site.first.log_entries.length
=> 3372

The hanging requests I’ve observed have all been Ajax requests in the
browser… but I officially have no idea what is causing this issue.

Thomas

On Wed, Nov 18, 2009 at 4:16 PM, Thomas A.

I didn’t want to shoot off another email so soon, but it turns out
that code did not fix the problem. It only seemed to because I was
running the mongrel on another port, and that revealed the real
problem: the task being invoked on the other side logs certain
information both to a file and via HTTP (coming back to these same
mongrel(s)). The start_site task, for instance, immediately sends six
requests to the single mongrel, bogging it down (one request for each
log message, each one indicating a different service that had just
been started).
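To make the arithmetic concrete, here’s a toy model of that feedback loop (an illustration only, not the real request flow; the job names are stand-ins):

```ruby
# Toy model of the feedback loop: one worker draining a queue, where
# each "site_start" synchronously fires six "log" requests back into
# the same queue. Job names here are illustrative stand-ins.
jobs = ["site_start"]
handled = []
until jobs.empty?
  job = jobs.shift
  handled << job
  6.times { jobs << "log" } if job == "site_start"
end
handled.length  # => 7: one action costs the lone mongrel seven requests
```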

So I have three options:

  • Re-work the other side so that it batches the log messages
  • Run enough mongrels to handle this sort of traffic (four seems to do
    it)
  • Tweak mongrel somehow…maybe that’s where you guys would have the
    best advice :^)
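For the first option, a minimal sketch of what batching could look like on the task side (the class and method names are hypothetical; the real sender would make one HTTP POST per flush instead of one request per log message):

```ruby
# Hypothetical sketch of option 1: buffer log messages and hand the
# whole batch to a single sender (e.g. one HTTP POST) instead of
# issuing one request per message. All names here are made up.
class BatchedLogger
  attr_reader :buffer

  def initialize
    @buffer = []
  end

  def log(site_id, msg, levelname)
    @buffer << { :site_id => site_id, :msg => msg, :levelname => levelname }
  end

  # Yields the accumulated batch once, then clears it; returns the
  # number of messages sent in that single round trip.
  def flush
    return 0 if @buffer.empty?
    yield @buffer
    sent = @buffer.size
    @buffer.clear
    sent
  end
end
```

With something like this, the six log lines that start_site produces would reach the mongrel as one request instead of six.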

Thomas