My problem seems to be different from Ed’s.
At the moment I’m using nginx, with a configuration file based on the
one you cited below.
Let me describe what the server is doing:
The app is basically a collaborative office type environment for a
handful of people. At the moment I have, at maximum, 5 people logged on
simultaneously. Usually I have 2. It has instant messaging, so every
5 seconds it calls an action that completes in ~.0025 seconds and every
10 seconds calls a different action which completes in ~.005 seconds.
When there is only one person logged in, the problem crops up less
often, and it never happens when the server has been idle for a while,
only in the middle of sustained use.
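
For a sense of scale, here is a rough Ruby sketch of the polling load
those numbers imply. The intervals and timings are the ones quoted
above; the constant and method names are made up for illustration, and
it assumes every logged-in user polls both actions on schedule.

```ruby
# Back-of-the-envelope polling load. Figures are taken from the post;
# POLLS and both method names are hypothetical helpers.
POLLS = [
  { interval_s: 5.0,  duration_s: 0.0025 }, # action polled every 5 seconds
  { interval_s: 10.0, duration_s: 0.005 }   # action polled every 10 seconds
].freeze

# Requests per second generated by `users` logged-in clients.
def polling_requests_per_second(users)
  users * POLLS.sum { |p| 1.0 / p[:interval_s] }
end

# Fraction of a single mongrel's time spent serving those polls.
def polling_busy_fraction(users)
  users * POLLS.sum { |p| p[:duration_s] / p[:interval_s] }
end

[1, 2, 5].each do |users|
  printf("%d users: %.2f req/s, %.3f%% of one mongrel busy\n",
         users,
         polling_requests_per_second(users),
         polling_busy_fraction(users) * 100)
end
```

Even with 5 users that works out to about 1.5 requests per second and
roughly 0.5% of one mongrel's time, so the polling alone should not be
enough to saturate anything.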
If one is using the application, everything seems very fast for a while,
and then, suddenly, a request will take 15 seconds or more to complete.
The strange thing about this is that I can hit all of the servers
separately just fine while this request is stalled by pointing my
browser straight at them rather than going through the load-balancer on
port 80. Moreover, if I tail production.log (I am running in production
mode), I can see that the stalled request takes no more time than usual
to complete, once mongrel sees it.
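
The comparison between going through the load balancer and hitting the
mongrels directly can be made repeatable with a small probe script. This
is only a sketch: the URLs are placeholders (the mongrel ports are not
given above), the helper names are invented, and the 1 second stall
threshold and 20 samples are arbitrary choices.

```ruby
require "net/http"
require "uri"

# Seconds taken by one GET to `url`, measured with a monotonic clock.
def timed_get(url)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Net::HTTP.get_response(URI(url))
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Returns the samples (in seconds) that exceed `threshold_s`.
def stalls(samples, threshold_s)
  samples.select { |s| s > threshold_s }
end

# Usage (placeholder URLs: the proxy first, then each mongrel):
#   ruby probe.rb http://localhost/ http://localhost:8000/
# With no arguments the script does nothing.
ARGV.each do |url|
  samples = Array.new(20) { timed_get(url) }
  printf("%-30s max %.3fs, %d stalls over 1s\n",
         url, samples.max, stalls(samples, 1.0).size)
end
```

If the stalls only ever show up on the port 80 URL and never on the
direct ones, that points at the proxy layer or how it hands requests to
the mongrels, rather than at the Rails actions themselves.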
At first I thought that I had just written crappy code, and I spent a
bunch of time locating slow actions and speeding them up, making my
session smaller (I am using Stefan K.'s sql_session_store as my
container), and speeding up some of my DB queries, and this improved my
normal performance quite a bit, but it didn’t do anything to lower the
frequency of these stalls.
If I go down to 1 mongrel, performance is abysmal when people are logged
on and chat request/user list polling (described above) is happening,
but is perfectly reasonable otherwise. If I go up to 10 mongrels,
performance through the load balancer is worthless, with at least half of
its requests stalling, but performance for any of the individual
mongrels is great, if I point to them directly.
I have plenty (280MB) of free RAM, and my mongrels all stabilize at
using about 30MB. According to top, my cpu is ~98% idle all of the
time.
That’s all I’ve got for now, thanks for listening,
Will
Ezra Z. wrote:
Are the two of you that are seeing this problem running in
production mode? And as you say that this happens with pound, it
might be a mongrel or rails issue, as pound proxies everything, making
mongrel serve static files too, which it shouldn’t be doing in a
production environment. With nginx are you using a good config file
[1] that does the correct rewrites to make sure nginx serves all
static and rails page cached files?
Also what kind of server environment are you running on? Does the
site sit idle for a while before this happens? Maybe it’s being
swapped out to disk and then needs to be swapped back in? If you can
provide more details I’m sure that we can help you figure out what it
is.
-Ezra
[1] Ruby on Rails Blog / What is Ruby on Rails for?