Mod_rails fedora core

fdsa · April 17, 2008, 5:33pm

i've found a couple of other posts with similar errors, but they turned out to
be permissions problems, which doesn't seem to be the case here. any ideas?

first it does something like this :

[30140:Hooks.cpp:370] Processing HTTP request: /
[28940:ApplicationPoolClientServer.h:426] Client 0x2aaaaad68b30:
received message: [‘get’, ‘/opt/app’, ‘true’, ‘nobody’]
[28946:Hooks.cpp:370] Processing HTTP request: /
[28940:ApplicationPoolClientServer.h:426] Client 0x2aaaaad68b30:
received message: [‘get’, ‘/opt/app’, ‘true’, ‘nobody’]

then it starts doing this and the app is completely unresponsive :

[28940:Application.h:274] Application 0x2aaab5d47510: destroyed.
[28940:StandardApplicationPool.h:249] Cleaning idle app /opt/app (PID
22116)

any help would be great. i followed the screencast ( as if that can be
screwed up ).

FC7 64-bit
rails 2.0.2

thanks in advance

fdsa · April 20, 2008, 12:00pm

Hi.

These are just debugging messages, and are not errors. It seems that
Apache is doing just fine. Have you already checked your Rails logs
though?

fdsa · April 28, 2008, 6:59pm

thanks for the reply.

i thought that might be the case.

basically, i'm load testing with apachebench and at some point ab and the
mod_rails server just stop communicating. no log data anywhere ( and no
tcpdump activity on either the test machine or the rails server ).

i am able to open a browser on a third machine and hit the app.

strangely, i think i can also run another ab test on the original ab
testing server ( a second instance ) while the original ab test is
hanging and the second test does the same thing. runs for a bit. hangs.

so it kind of does and kind of doesn't seem like an apache issue?

fdsa · April 28, 2008, 7:28pm

also, something i don't get is :

./ab -c 100 -n 500 http://192.168.1.53
./ab -c 500 -n 500 http://192.168.1.53
./ab -c 10 -n 10000 http://192.168.1.53

all complete, however :

./ab -c 100 -n 1000 http://192.168.1.53

hangs.

an apache restart on the server will allow the tests to complete. so it's
either apache or mod_rails?
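
For anyone trying to reproduce this kind of hang, a minimal sketch of a wrapper that runs one ab combination under a time limit, so a stalled run is reported instead of blocking the terminal. It assumes GNU coreutils `timeout` is installed. The names `run_case`, `AB`, `LOGDIR`, `TARGET` and `LIMIT` are made up for illustration, and the 120 second cutoff is an arbitrary guess. The trailing slash on the URL is there because ab expects a path component.

```shell
#!/bin/sh
# Hypothetical settings; adjust for your own setup.
AB=${AB:-ab}                    # benchmark binary (ApacheBench)
LOGDIR=${LOGDIR:-.}             # where per-run output is written
TARGET="http://192.168.1.53/"   # server from the post, with a path added
LIMIT=120                       # seconds before a run is treated as hung

# run_case CONCURRENCY REQUESTS
# Runs one ab test and prints a one-line verdict.
# timeout exits with 124 when the time limit was hit.
run_case() {
    if timeout "$LIMIT" "$AB" -c "$1" -n "$2" "$TARGET" \
        >"$LOGDIR/ab_c${1}_n${2}.log" 2>&1; then
        echo "c=$1 n=$2: completed"
    else
        echo "c=$1 n=$2: failed or hung (exit $?)"
    fi
}
```

Calling `run_case 100 500` and then `run_case 100 1000` would cover one combination that completes and the one that hangs above. An exit status of 124 in the verdict line means the time limit was reached.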

i've worked the client settings in apache, but still no luck.

there are absolutely no log messages, and all tcpdump activity stops too.

so any help would be great.

thanks

fdsa · April 29, 2008, 5:24pm

ok, so i was able to get all the testing to finish.

i added :

net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_max_orphans=1000

to sysctl. apachebench now completes at all levels of concurrency and
connections, but i still find this suspect because the same test against the
same machine using 12 mongrels instead of mod_rails works fine, as does
nginx / mongrel ( which doesn't seem to have a direct relationship to the OS
settings here ).
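
To see what these two settings are currently set to before and after a change, here is a small sketch that reads them straight from /proc/sys, which is where sysctl keys live on Linux (dots in the key become slashes in the path). The function names `sysctl_path` and `show_settings` are invented for this example. Reading needs no root.

```shell
#!/bin/sh
# Map a sysctl key to its file under /proc/sys (dots become slashes).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print "key = value" for each key given, or say that it cannot be read.
show_settings() {
    for key in "$@"; do
        file=$(sysctl_path "$key")
        if [ -r "$file" ]; then
            printf '%s = %s\n' "$key" "$(cat "$file")"
        else
            printf '%s: not readable on this system\n' "$key"
        fi
    done
}

# Usage:
#   show_settings net.ipv4.tcp_keepalive_time net.ipv4.tcp_max_orphans
```

To make the values survive a reboot, the two lines from the post go into /etc/sysctl.conf and are loaded with `sysctl -p` as root. For reference, the stock Linux value of tcp_keepalive_time is 7200 seconds, so 300 is a large reduction.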

weird.

also, there is no logging of any type anywhere on the system
indicating any issues. everything just stops. and an apache reload
allows the test to finish.

oh well

mod_rails is very slick stuff so far

fdsa · May 18, 2008, 2:28pm

Fdsa F. wrote:

ok, so was able to get all the testing to finish.

i added :

net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_max_orphans=1000

I’m having similar problems - see the thread “Mod_rails kills my Apache” on Ruby-Forum

but I haven’t solved it yet.

One thing you should check though is: Is Apache running mpm-worker or
mpm-prefork?

According to Hongli L., mod_rails only supports mpm-prefork at the
moment.
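
A quick way to check which MPM is in use, as a rough sketch: `httpd -V` prints a "Server MPM:" line, and the helper below just pulls the name out of that output. The function name `mpm_from_version_output` is made up, and the binary name `httpd` is an assumption that fits Fedora. Other distributions may call it `apache2` or `apache2ctl`.

```shell
#!/bin/sh
# Read `httpd -V` output on stdin and print the MPM name in lowercase.
# Prints nothing if no "Server MPM:" line is present.
mpm_from_version_output() {
    sed -n 's/^Server MPM:[[:space:]]*//p' | tr 'A-Z' 'a-z'
}

# Usage:
#   httpd -V | mpm_from_version_output
```

If this prints `worker` rather than `prefork`, that would be the first thing to change given the statement above.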

If you have found out anything else, I would very much like to know
about it.

Carsten

fdsa · June 6, 2008, 10:14pm

i read your posts with hongli and those problems are similar but different. since
i simply couldn't get the reliability i needed with the proc tweaks etc.,
i just upgraded the box(es) to a quad core with more ram and that seems
to have solved at least the

./ab -c 100 -n 1000 http://192.168.1.53

problem…for now.

interestingly, i came across this post last week or so :

http://poocs.net/2006/3/27/the-adventures-of-scaling-stage-3

and towards the end there is this quote :

"Using tcpdump to monitor the traffic on the listener ports showed…
nothing. Not a single byte crossing the line. Using strace to check what
the “stuck” listener is busy doing showed it sitting there in
“Waiting…” state. Also doing nothing.

Now the stunning part: If you restart lighttpd or the dispatcher things
start working again. In the end, this didn’t indicate either side as
being responsible for the hang and we started looking elsewhere."

which is exactly the problem i'm having, but on a completely different
architecture and two years later. bizarre.

they also started meddling around in proc except their changes had
little to no effect from the sounds of things.

i sorta knew about that mpm-prefork thing, but since the default apache
install seemed to work, i haven't yet concerned myself with it.

is there a release date for 1.1? i'm on 1.0.5, but haven't been able to
reproduce the strange hanging, which i assume is because of the quad core

actually, three quad core machines load balanced.

i'll have more information available next month when we begin serious
load testing. we're going live in august, moving from java to rails, and our
site gets 125,000+ page views per day. the money is on passenger for now, but a
last second move to mongrel may be inevitable

we'll see
