WEBrick on Windows

Hi all,

I’m trying to benchmark a few HTTP server types on Windows (XP Home,
specifically - don’t ask why), and I’ve hit a snag with this code:


require 'xmlrpc/server'
require 'xmlrpc/client'
require 'benchmark'

class ServerObject < XMLRPC::Server
  def initialize
    super(8080)
    # Silence WEBrick's access log (it logs to a throwaway string)
    @server.config[:AccessLog] = [['', '']]
    self.add_handler('benchmark.simple') do
      test()
    end
  end

  def test
    'test'
  end
end

test_obj = ServerObject.new
serving_thread = Thread.new { test_obj.serve }

client = XMLRPC::Client.new('127.0.0.1', '/', 8080)

n = 2000
Benchmark.bmbm(20) do |b|
  b.report('Direct RPC') { for i in 1..n; client.call('benchmark.simple'); end }
end


The problem is that with n that high, I get an

c:/ruby/lib/ruby/1.8/net/http.rb:562:in `initialize': Bad file descriptor - connect(2) (Errno::EBADF)
        from c:/ruby/lib/ruby/1.8/net/http.rb:562:in `connect'
        from c:/ruby/lib/ruby/1.8/xmlrpc/client.rb:535:in `do_rpc'

error during the second round. Looking at netstat -a afterwards, I see
almost every local port in the range 1026-5000 in the TIME_WAIT state.
That’s a suspiciously round number, and I suspect there’s a
‘client_port_max=5000’ setting somewhere. That’s not what bothers me,
though. Why are these ports waiting, and how can I close them or reduce
their timeout value? I’d rather not insert 30-second waits all over the
place, if that’s even enough of a delay…

Any tips? Moving to a different OS is not, unfortunately, an option,
although shifting up to XP Pro might be, in a pinch.

2006/6/2, Alex Y. [email protected]:

I guess you were bitten by a limitation in the network handling of
your version of Windows. Only server operating systems are allowed a
high number of incoming connections. I don’t know whether there is a
tweak that will fix this but if not you will have to change OS or the
algorithm (doing sleeps in between for example).

Btw, why did you choose this subject?

Kind regards

robert

Robert K. wrote:

> > error during the second round. Looking at netstat -a afterwards, I see
> I guess you were bitten by a limitation in the network handling of
> your version of Windows. Only server operating systems are allowed a
> high number of incoming connections.
That's fine. I don't need concurrent connections. The problem is that
the connections are hanging around for far longer than I need them. Is
there (likely to be) any way I can explicitly get rid of them, or is the
delay just one of those annoying little 'features' that MS are using to
try to convince me to spend more money?
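The closest thing I’ve found to explicitly getting rid of them is
SO_LINGER with a zero timeout, which makes close() abort the connection
with an RST so the socket never enters TIME_WAIT at all. Net::HTTP
doesn’t expose its socket, though, so this is only the raw-socket idea,
not a drop-in fix for the XML-RPC client (untested sketch):

require 'socket'

# SO_LINGER with l_onoff=1, l_linger=0 makes close() send RST instead
# of FIN, so the socket skips the TIME_WAIT state entirely.
sock = TCPSocket.new('127.0.0.1', 8080)
linger = [1, 0].pack('ii')  # struct linger; Windows may expect two shorts ('SS')
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_LINGER, linger)
sock.close  # aborts the connection; no TIME_WAIT entry left behind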

> I don’t know whether there is a tweak that will fix this but if not
> you will have to change OS or the algorithm (doing sleeps in between
> for example).
Bah. Not what I was after, but I guess if there’s no other way round
it…

> Btw, why did you choose this subject?
If you mean the email subject line, I figured that other people out
there must be trying to benchmark WEBrick against other HTTP server
types (mongrel, et al), but not necessarily as a back-end to XMLRPC.
The problem doesn’t seem like it would be limited to XMLRPC to me.

Francis C. wrote:

> Are you sure your connections are being closed after each call to
> client.call(…)?
A quick hunt through the Net::HTTP code seems to indicate so. If I’m
reading it right, there’s a call to Net::HTTP::Post#end_request which
closes the socket client-side.
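Even when the client does close its end cleanly, though, the closed
socket still has to sit through TIME_WAIT, which looks like the real
problem here. For instance (a sketch, not the xmlrpc code path itself),
the block form of Net::HTTP.start tears the connection down as the block
exits, yet the port isn’t immediately reusable:

require 'net/http'

# The block form always calls finish on exit, closing the client-side
# socket. A cleanly closed TCP socket still passes through TIME_WAIT -
# 'closed' is not the same as 'port free for reuse'.
Net::HTTP.start('127.0.0.1', 8080) do |http|
  http.post('/', 'ping')  # throwaway request; the server will likely reject it
end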

> You may be hitting a per-process limit on open descriptors. Are you
> running netstat while your process is running or after it ends with
> the EBADF error? If the latter, then try catching EBADF and put in a
> long sleep, then look at netstat on a different shell. Localhost
> connections don’t usually need to spend much time in the TIME_WAIT
> state.
They are still alive, but vanish as soon as the process dies, which
would seem to indicate that either the call to close the socket is
wrong, or it’s not being respected.
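This is roughly how I checked (a sketch - it traps the EBADF and pauses
so netstat can be run from another shell while the process is still
alive):

require 'xmlrpc/client'

# Crude TIME_WAIT counter via netstat.
def time_wait_count
  `netstat -an`.scan(/TIME_WAIT/).size
end

client = XMLRPC::Client.new('127.0.0.1', '/', 8080)
begin
  2000.times { client.call('benchmark.simple') }
rescue Errno::EBADF
  puts "EBADF hit; TIME_WAIT sockets: #{time_wait_count}"
  sleep 300  # run netstat -a in another shell now; they vanish once we exit
end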


Alex

So while your process is still running, the connections are in
TIME_WAIT, and then they vanish as soon as your process dies? That means
they are being closed as you expect. A closed TCP connection spends a
little time in the TIME_WAIT state, but as I said above, on a localhost
connection it’s usually very short or zero. Try your program on an
actual network connection to another machine. When your process ends,
the TIME_WAITing connections should NOT vanish, but will stick around
for a short period of time (usually not as long on Windows as on Unix).

If that happens, it’s a sign that you should probably rethink your
design.


Francis C. wrote:

> will stick around for a short period of time (usually not as long on
> Windows as on Unix).
I’m not entirely sure I understand. If the sockets being in the
TIME_WAIT state is a sign that they’ve been closed normally, why am I
running out of resources? Surely there isn’t a hard limit on the number
of sockets a process can ever open?

> If that happens, it’s a sign that you should probably rethink your design.
See, that’s the problem. I’ve only put a very thin layer around stuff
that’s in the standard library, using code that’s directly inspired by
examples in the docs. If that’s wrong, I’m not sure where I should
begin to analyse it or attempt to improve it.


Alex

Most kernels have hard and soft limits on the number of connections that
can be in any state, including TIME_WAIT, but I don’t know what those
limits are in XP Home. Have you tried making 1000 calls, then waiting
for a minute, then making 1000 more, and so on? I’m proposing that as an
experiment, not as a solution. I would expect that you’ll see the
TIME_WAITing connections drain out during the sleep interval. I’m still
a little surprised that you’re seeing this with localhost connections
though. By definition, the kernel knows the location of all the network
segments in a localhost connection, so it shouldn’t need to spend much
time in TIME_WAIT.
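In code, the experiment would be something like this (a sketch; the
batch size and pause length are only starting points to tune):

require 'xmlrpc/client'

client = XMLRPC::Client.new('127.0.0.1', '/', 8080)

# Fire the calls in batches, pausing between batches so the TIME_WAIT
# sockets can drain before the 1026-5000 ephemeral range fills up.
2000.times do |i|
  client.call('benchmark.simple')
  sleep 60 if (i + 1) % 1000 == 0
end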

What’s the value of TCPTimedWaitDelay in your registry?
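If the value is absent, XP falls back to the default of 240 seconds, and
MaxUserPort (default 5000) in the same key would explain the 1026-5000
range you saw. Something like this would read both (a sketch, shelling
out to the reg.exe tool that ships with XP):

# Query the two relevant TCP/IP parameters; absent values mean the
# defaults apply (TcpTimedWaitDelay = 240 seconds, MaxUserPort = 5000).
key = 'HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters'
%w[TcpTimedWaitDelay MaxUserPort].each do |value|
  puts `reg query "#{key}" /v #{value}`
end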