Forum: Ruby on Rails Production deployment speed "wakeup" issue

Matt Jankowski (Guest)
on 2006-04-12 21:06
(Received via mailing list)
The deployment scenario...

Apache2 on shared host, proxying to lighttpd, which has 3 external
fcgis running on localhost.  The fcgis are managed by spinner/spawner.

We're noticing a definite speed issue on "first requests" to this site.

For example:
* Hit the site a few times, paying no attention to load time
* Wait x period of time (haven't quite narrowed this down yet, but
probably 5-10 mins)
* Hit site again once - this request will take anywhere from 5 - 30
or so seconds
* Reload site a few times - these requests will be very quick - less
than one second

These load times are reflected not just in the "feel" we get from
using the site, but are confirmed by the production.log

What's odd is that the time seems to be inconsistent with where it
happens.  The DB portion of the time is always very very small, even
on the "first request" long requests.  The overall completed time,
for example, might be 10 seconds.  Sometimes the 'Rendering'
component would be 6-7 seconds of that overall time, but sometimes it
will be very small (under 1 second) and the other 8-9 seconds that
aren't explained by either Rendering or DB time are lost
to.....something?

So, basically...

* Has anyone seen this issue before and know what the problem is?
* Are there settings in any of apache2, lighttpd or rails itself that
I'm unaware of which might cure this?

The app uses the Globalize plugin, but is otherwise pretty standard.
We've tried most combinations of switching between rails 1.0 and
rails 1.1.2, tweaking ActionController::Base.allow_concurrency (we
were also getting the "dropped mysql conn" errors in dev mode...),
tweaking ActionView::Base.cache_template_loading (thought that might
be slowing views down?), and so on all to no avail.

Thoughts?

-Matt
Michael Greenly (mgreenly)
on 2006-04-12 22:12
Matt Jankowski wrote:

> Thoughts?
>
> -Matt

I ran across this in Apache's proxy docs:

> If you're using the ProxyBlock directive, hostnames' IP addresses are
> looked up and cached during startup for later match test. This may take
> a few seconds (or more) depending on the speed with which the hostname
> lookups occur.
Justin Forder (Guest)
on 2006-04-14 17:02
(Received via mailing list)
Matt Jankowski wrote:
> probably 5-10 mins)
> * Hit site again once - this request will take anywhere from 5 - 30 or
> so seconds
> * Reload site a few times - these requests will be very quick - less
> than one second

This has come up a number of times on the list. It may be that your
sleeping fcgi processes are swapped out, and take time to be brought
back to life. Various people have recommended using cron and wget (or
curl) to request a dynamic page every few minutes to keep response times
short.

> either Rendering or DB time are lost to.....something?
The slow rendering is more puzzling than the "missing time" - Rails
couldn't measure the time taken to swap a process back in.

regards

   Justin
Al Evans (al-evans)
on 2006-04-15 14:28
Matt Jankowski wrote:
>
> Thoughts?

No thoughts, but here's a hack:-) I was seeing this problem (shared
host, running Apache/fcgi), and the occasional long connect times made
me think my fcgi dispatcher was getting swapped out. So I added this to
one of my controllers:

  def ping
    render :text => "<html><head></head><body>Ping!</body></html>"
  end

And I run this script on my desktop machine:

---
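require 'open-uri'

# Fetch the URL and return the body, or an empty string if the request fails.
def pingit(url)
  URI.open(url) { |f| f.read }
rescue StandardError => e
  puts "#{e.class}: #{e.message} in #{url}"
  ''
end

# Hit the site every ten minutes so the fcgi dispatcher is not swapped out.
loop do
  puts Time.now
  s = pingit(ARGV[0])
  # puts s
  sleep(600)
end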

----

This way I can say "./Pingit.rb http://domain/controller/ping" and it
will hit the site every ten minutes, showing me any errors or failures
to connect. It seems to work fairly well -- site responds pretty
consistently in a second or two -- but this is a totally heuristic
approach.

--Al Evans
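For anyone who wants numbers rather than a feel for the delay, here is a minimal sketch of a variant of the script above that also records how long each request takes, so a slow "first request" after an idle period shows up in the output. The `timed_fetch` name and the injectable `fetcher` argument are illustrative assumptions, not part of the original script.

```ruby
require 'open-uri'

# Fetch +url+ and return [seconds_elapsed, body_or_nil, error_or_nil].
# +fetcher+ is a hypothetical hook: it defaults to a real HTTP GET but can
# be swapped for a stub to exercise the timing logic without a network.
def timed_fetch(url, fetcher = ->(u) { URI.open(u, &:read) })
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  body = fetcher.call(url)
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, body, nil]
rescue StandardError => e
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, nil, e]
end

# Only loop when run as a script with a URL argument.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  loop do
    elapsed, _body, error = timed_fetch(ARGV[0])
    status = error ? "FAILED (#{error.class}: #{error.message})" : 'ok'
    puts format('%s  %.2fs  %s', Time.now, elapsed, status)
    sleep(600) # ten minutes, matching the original script
  end
end
```

Shortening the sleep interval until the slow requests disappear would give a rough idea of how long the processes can sit idle before the delay returns.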
Carmen --- (carmen)
on 2006-04-16 05:39
> * Hit the site a few times, paying no attention to load time
> * Wait x period of time (haven't quite narrowed this down yet, but
> probably 5-10 mins)
> * Hit site again once - this request will take anywhere from 5 - 30
> or so seconds

I can top that. With lighttpd, don't hit the site for a few hours, and
the next request gets a 500 response. Press F5 and then it's fine. There's
nothing interesting in the logs, other than that the fastcgi process
decided to disappear.

I'm going to try mongrel when I get around to deploying.
unknown (Guest)
on 2006-06-07 14:26
(Received via mailing list)
Just following up on my own post from a while back, with a report on how
the issue below resolved itself.

Biggest issues we found:

* HUGE problem - the linux kernel which the machine was running was a
release from the 2.4 series which had big VM/swap issues.  This machine -
which had been running a J2EE app with decent speeds and under very little
load for the past year - was recently repurposed to do hosting for a few
rails applications.  I have no idea why the rails apps brought out the
demons that the J2EE app had not, but they did.  Moral of the story - make
sure your kernel release is up to date.

* Remember to index your DB!  Maybe it's because I'm thinking in terms of
models and not in terms of DB tables/rows, but I consistently forget to
add indexes to my tables while using migrations to create the DB.
Needless to say, going back in and indexing frequently used associations
provided a HUGE speedup for the application.

* Lighttpd / Apache issue - we found a strange condition with apache
proxying back to lighty where, on certain requests (usually asset files -
js, css, images, etc) that were over ~20k in size, we'd get a lockup
between apache/lighty.  With a high timeout on the proxy, this leads to a
scenario where the browser has the entire HTML page, but it's waiting on
some assets to render, and sits there until the proxy has timed out.
We've since switched to Apache2.2.2/mod_proxy_balancer/mongrel, and aren't
particularly interested in tracking down what the actual issue here is.


So, in conclusion, the mongrel/apache/proxy_balancer setup (along with
rewrite rules in apache to serve static requests), is absolutely great,
and easy to manage with capistrano and mongrel cluster.  With the kernel
fix, the removal of lighty, and the db indexing, the application is much
much quicker and the "first page wait" issue is essentially gone.
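
As an illustration of the indexing point above, here is a minimal migration sketch in the Rails 1.x style that was current when this thread was written. The table and column names (`comments.post_id`, `posts.author_id`) are hypothetical, and the stand-in class at the top exists only so the file loads outside a Rails app. Later Rails versions use a versioned migration superclass instead.

```ruby
# Stand-in used only when this file is loaded outside a Rails app, so the
# sketch can be run in isolation. Inside Rails the real class is used.
unless defined?(ActiveRecord::Migration)
  module ActiveRecord
    class Migration
      def self.add_index(table, column)
        puts "add_index #{table.inspect}, #{column.inspect}"
      end

      def self.remove_index(table, column)
        puts "remove_index #{table.inspect}, #{column.inspect}"
      end
    end
  end
end

# Hypothetical example: index the foreign-key columns behind frequently
# used associations (e.g. post.comments looks up comments by post_id).
class AddAssociationIndexes < ActiveRecord::Migration
  def self.up
    add_index :comments, :post_id
    add_index :posts, :author_id
  end

  def self.down
    remove_index :posts, :author_id
    remove_index :comments, :post_id
  end
end
```

Foreign-key columns used by `belongs_to` / `has_many` lookups are usually the first candidates, since migrations of that era did not add those indexes automatically.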