Rails app sloooooowing down

thewordnerd · September 13, 2010, 5:47pm

OK, I’m a bit out of my league with this one and am hoping there’s a
quick fix.

I’m running Redmine on JRuby on two separate hosts. One is rock solid
and I pretty much forget about it. The other gives me no end of grief.

In this situation, “grief” is defined as working reasonably OK for a few
hours, then slowing. “Slowing” means taking 30 seconds or more to
service requests (I’ve seen up to 250 seconds in the logs), which of
course times out at the proxy. Looking at the logs, most of that time is
spent in the view-rendering code, with a fraction spent in the database.

I don’t really know a whole lot about Java deployment, other than that
I’ve deployed several Lift apps on a few different server configurations
and it just seems to work. That isn’t a JRuby slam, just a statement
that I seem to know enough to be dangerous and not enough to actually
diagnose things when the various moving parts of Java deployment don’t
line up neatly.

Thoughts on what to check? The logs look fine, other than the huge
delays, and there’s no indication as to why things are slowing. My
server’s swap isn’t being hit at all, and we have 100 megs of RAM
unused. Commands seem responsive, suggesting that there isn’t a lot of
I/O load. All other services are running smoothly, but this one JRuby
instance is slowly taking more and more time to respond.

I’m currently trying a few long-cycle tests, running with different
versions of jruby-rack to see if that’s the issue. I’m currently on
1.0.3, and if the issue persists then I’ll downgrade to 1.0.1 as per
another suggestion for an unrelated issue. But while I’m waiting to see
if this has any effect, it’d be great to know if there is some obvious
gotcha that I’m missing, or if there’s something else that I might
check.

I also don’t like that Redmine is taking over a second to service
requests even at its idle state, but am hoping that the solution to one
problem is inherent in the other. I also tried running it in threadsafe
mode but received errors re: undefined variables, so I’m guessing that
Redmine isn’t threadsafe.

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

thewordnerd · September 13, 2010, 10:43pm

On Mon, Sep 13, 2010 at 10:45 AM, Nolan D. wrote:

Thoughts on what to check? The logs look fine, other than the huge
delays, and there’s no indication as to why things are slowing. My
server’s swap isn’t being hit at all, and we have 100 megs of RAM
unused. Commands seem responsive, suggesting that there isn’t a lot of
I/O load. All other services are running smoothly, but this one JRuby
instance is slowly taking more and more time to respond.

It’s hard to say exactly what could be causing it. Can I suggest
looking at VisualVM when the VM starts getting unresponsive, then
looking at the healthy app and comparing the two pictures? In
particular, look at the CPU, heap, and thread counts. (Start VisualVM
with the “jvisualvm” command. You should be able to connect to remote
JVMs with it as well.)

You might also generate a thread dump and see if there are a bunch of
hung/spinning threads eating up CPU but not doing anything. You’d know
to look at this if the thread count in VisualVM shows an upward
trend.
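
As a rough illustration of the thread-dump check, here is a minimal Ruby sketch that tallies threads by state. It assumes the dump was saved to a file (for example with the JDK's "jstack <pid>") and that it uses the usual HotSpot layout, in which each thread carries a "java.lang.Thread.State: ..." line. The helper name thread_state_counts is made up for this example.

```ruby
# Hypothetical helper: count JVM threads by state in a saved thread dump.
# Assumes HotSpot-style dumps, where each thread has a line such as
#   java.lang.Thread.State: RUNNABLE
def thread_state_counts(dump_text)
  counts = Hash.new(0) # states that never appear read as 0
  dump_text.each_line do |line|
    match = line.match(/java\.lang\.Thread\.State:\s+([A-Z_]+)/)
    counts[match[1]] += 1 if match
  end
  counts
end
```

Calling thread_state_counts(File.read("dump.txt")) on dumps taken a few minutes apart, and on both hosts, would show whether the number of RUNNABLE or BLOCKED threads keeps growing on the slow instance.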

/Nick

thewordnerd · September 13, 2010, 10:52pm

Also make sure that you are not swapping, even though it says you have
free memory left. Swapping can easily give the performance you
describe.

Some OSes like to keep free memory around even when swapping, and 100
MB does not sound like much headroom to me (especially if you consider
that a big attachment to a Redmine issue could eat that away).
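
One way to check for actual swap activity, rather than just free memory, is to watch the kernel's swap counters. The sketch below assumes a Linux host, where /proc/vmstat exposes cumulative pswpin and pswpout page counters. The helper names swap_counters and swap_delta are made up for this example.

```ruby
# Hypothetical helper: pull the cumulative swap-in/swap-out page counters
# out of the text of /proc/vmstat (Linux only).
def swap_counters(vmstat_text)
  counters = {}
  vmstat_text.each_line do |line|
    name, value = line.split
    counters[name] = value.to_i if %w[pswpin pswpout].include?(name)
  end
  counters
end

# Pages swapped in and out between two samples of the counters.
def swap_delta(before, after)
  {
    "pswpin"  => after.fetch("pswpin", 0)  - before.fetch("pswpin", 0),
    "pswpout" => after.fetch("pswpout", 0) - before.fetch("pswpout", 0)
  }
end
```

Sampling swap_counters(File.read("/proc/vmstat")) twice, some seconds apart while the app is slow, and passing both samples to swap_delta should give zeros if the box really is not swapping.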

-Tom

On Mon, Sep 13, 2010 at 3:43 PM, Nick S. wrote:

–
blog: http://blog.enebo.com twitter: tom_enebo
mail: [email protected]

thewordnerd · September 14, 2010, 12:19am

Hi,

to me, this sounds quite obviously like an out-of-heap-space problem. In
this case, GC runs most of the time and takes maybe 5 seconds for each
run, with the Java process running at 100% CPU. You can diagnose this
easily by starting the process with “jruby -J-Xloggc:./gc.log” and
taking a look at the log.
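
To get a quick picture from such a log, a small Ruby sketch like the one below can total the pauses. It assumes the plain HotSpot log format of that era, with lines such as "12.001: [Full GC 60000K->59000K(62848K), 4.9876540 secs]". Other JVM versions or extra GC flags produce different layouts, so treat the pattern as an assumption. The helper name gc_summary is made up for this example.

```ruby
# Hypothetical helper: summarise a GC log written with -Xloggc.
# Assumes lines shaped like
#   0.525: [GC 16384K->2345K(62848K), 0.0123450 secs]
#   12.001: [Full GC 60000K->59000K(62848K), 4.9876540 secs]
# Lines that do not match this shape are ignored.
def gc_summary(log_text)
  summary = { minor: 0, full: 0, pause_secs: 0.0 }
  log_text.each_line do |line|
    match = line.match(/\[(Full GC|GC)\b.*\s(\d+\.\d+) secs\]\s*$/)
    next unless match
    summary[match[1] == "Full GC" ? :full : :minor] += 1
    summary[:pause_secs] += match[2].to_f # last duration on the line
  end
  summary
end
```

If gc_summary(File.read("gc.log")) shows many full collections, and a total pause time that is a large share of the wall-clock time covered by the log, that would support the out-of-heap-space theory.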

Good luck,
Heiko

On 13.09.2010 22:51, Thomas E Enebo wrote:

–
Heiko S.

Dr. Alfred-Neff-Str. 15
75015 Bretten

Telefon: +49 7252/97 54 07
Mobil: +49 179/836 60 10
Email: [email protected]

thewordnerd · September 14, 2010, 12:58am

In this situation, “grief” is defined as working reasonably OK for a few
hours, then slowing. “Slowing” means taking 30 seconds or more to
service requests–I’ve seen up to 250 in the logs–which of course times
out at the proxy. Looking at the logs, most of that is spent in the
view-rendering code, with a fraction spent in the database.

I assume you’re running it in production mode?
-r