Forum: Rails deployment Decent benchmark results?

roller8 (Guest)
on 2007-03-09 20:13
(Received via mailing list)
Hi again folks!  Everything is going really well since I last posted
and I'm very close to live deployment!

Anyway, I was benchmarking various Nginx + Mongrel cluster configs and
came up with what seems like my best performance for a regular Hello
Rails page.  I was hoping I could get some estimates so I can save the
time of having to try out other solutions (e.g., Apache+Mongrel, etc.).

Hardware:  Dual dual-core Xeon, 16 GB RAM, SCSI SAS mirror.

Best test results:  6 Nginx workers and 5 Mongrels were enough to reach
this.  I tried more and fewer of both in different combos.  I got
approx. 215 req/sec, give or take, over about 5 httperf runs with these
params:

httperf --server 127.0.0.1 --port 80 --uri /say/hello --rate 250 --num-conn 10000 --num-call 1 --timeout 5
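
For reference, the load these parameters offer can be worked out by
hand. A minimal sketch in Ruby, assuming httperf opens connections at
the fixed --rate and issues --num-call requests on each one (the method
name is made up for illustration):

```ruby
# Offered load implied by the httperf flags above (illustrative helper).
def offered_load(rate:, num_conn:, num_call:)
  {
    requests_per_sec: rate * num_call,      # request rate the client offers
    total_requests:   num_conn * num_call,  # requests over the whole run
    duration_sec:     num_conn.to_f / rate  # time needed to open every connection
  }
end

offered = offered_load(rate: 250, num_conn: 10_000, num_call: 1)
puts offered.inspect
```

Under that assumption the run offers 250 requests per second for about
40 seconds, so the reported reply rate cannot rise much above 250
req/sec however fast the server is.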

I am using Ezra's latest nginx.conf with the necessary modifications
to root directory and mongrel cluster block and it's all working fine.

I really just want to know if this sounds like a correct average.  I
realize I can do a lot more with this hardware so I plan to do some
virtualization as recommended earlier with a hardware load balancer up
front.

Thanks everyone!

Raul
tmornini@engineyard.com (Guest)
on 2007-03-09 21:05
(Received via mailing list)
On Mar 9, 2:12 pm, "roller8" <roll...@gmail.com> wrote:

> Anyway, I was benchmarking various Nginx + Mongrel cluster configs and
> came up with what seems like my best performance for a regular Hello
> Rails page.

Could you explain what a "regular" Hello Rails page is? :-)

> Hardware:  Dual dual-core Xeon, 16gb ram, SCSI SAS mirror.

That's a lot of hardware.

> Best test results:  6 Nginx and 5 Mongrels were enough to meet this.
> I tried more and less of both in different combos.  I got approx. 215
> req/sec +/- over about 5 httperf's tests with these params:

Those numbers sound a little low to me, actually. I'd expect at least
40 pages/second of session creation + rhtml render (no other DB
activity) per core, which would be around 320/second.
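
The estimate above is simple multiplication. A minimal sketch, assuming
the 40 pages/second per core figure; note that 320/second corresponds
to 8 cores, while two dual-core CPUs have 4 physical cores (the method
name is hypothetical):

```ruby
# Rule-of-thumb estimate: pages per second per core times the core count.
def expected_pages_per_sec(cores, per_core = 40)
  cores * per_core
end

puts expected_pages_per_sec(8)  # 8 cores, as the 320/second figure implies
puts expected_pages_per_sec(4)  # 4 physical cores (two dual-core CPUs)
```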

Is the DB on the same box? Was the test running on the same box?

I'd recommend you run 8 nginx (1 per core), and 32 mongrels (4 per
core).

Also, 10,000 concurrent requests seems a bit high, but shouldn't
really affect aggregate performance.

--
-- Tom Mornini
Rob Sanheim (rsanheim)
on 2007-03-10 00:23
(Received via mailing list)
Hi Tom,


On 3/9/07, tmornini@engineyard.com <tmornini@gmail.com> wrote:
>
>
> Is the DB on the same box? Was the test running on the same box?
>
> I'd recommend you run 8 nginx (1 per core), and 32 mongrels (4 per
> core).
>

Is ~4 mongrels per core a normal baseline you start with for hardware
like that?  A site I'm working on will have a similar setup -- 3 web
servers with dual dual-core Xeons, though with less RAM (4 to 8 gigs).
This would be behind Apache, though; would that change the
recommendation of 4 per core?

I plan on doing plenty of tests with httperf, of course, just
looking for a good starting point.

- Rob
Alexey Verkhovsky (Guest)
on 2007-03-10 03:00
(Received via mailing list)
On 3/9/07, Rob Sanheim <rsanheim@gmail.com> wrote:
>
> > > Best test results:  6 Nginx and 5 Mongrels were enough to meet this.
> > > I tried more and less of both in different combos.  I got approx. 215
> > > req/sec +/- over about 5 httperf's tests with these params:
> >
> > Those numbers sound a little low to me, actually.



Indeed. I can get over 700 dynamic "Hello, World" actions per second
with 3 Mongrels on a Dell D620 laptop. My definition of Hello, world is
this:

  class TestsController < ApplicationController
    session :off
    def say_hi
      render :text => 'Hi!'
    end
  end


Note the "session :off" bit - this is very important.

Alex
tmornini@engineyard.com (Guest)
on 2007-03-10 04:08
(Received via mailing list)
On Mar 9, 6:22 pm, "Rob Sanheim" <rsanh...@gmail.com> wrote:

> Is ~4 mongrels per core a normal baseline you start with for hardware
> like that?  A site I'm working on will have a similar setup -- 3 web
> servers with dual dual-core xeons, though with less ram (4 to 8 gigs).
>  This would be behind Apache, though, would that change the
> recommendation of 4 per core?

There's a general understanding in Unix performance tuning that a load
of 4.0, which means that at any moment in time there are 4 processes
running or waiting to be scheduled (i.e. in the run queue), is
considered saturation.

This is a *very* indirect measurement of exactly what is going on in a
system and assumes that those processes are CPU bound, and not
bound on other things such as network I/O, disk I/O, etc.

So, very simplistically speaking, a good place to start in just about
any tuning project is 4 running processes per core. Again, this is an
*ultra* simplistic way of looking at things.
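
As a starting point only, that rule of thumb can be written down as a
minimal sketch. It assumes Ruby's Etc.nprocessors, which reports
logical processors and may therefore differ from the physical core
count; the method name is made up for illustration:

```ruby
require 'etc'

# Starting point only: 4 worker processes per core, per the rule of thumb above.
def starting_worker_count(cores = Etc.nprocessors, per_core = 4)
  cores * per_core
end

puts "cores: #{Etc.nprocessors}, suggested starting workers: #{starting_worker_count}"
```

The result is where tuning begins, not where it ends; the benchmark
decides the final number.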

In a typical Rails deployment scenario, the front-end web servers are
highly unlikely to break a sweat compared to the back-end application
servers. Since we're talking about such a rough measurement and only
a place to begin tuning at, I wouldn't even factor the front-end
processes into this equation.

If MySQL is running on the same box,  however, that would figure into
the equation.

Unix system performance tuning is a very complex subject, and it
changes all the time. That said, some of the best books written on the
subject were written a long time ago, when resources such as CPU
cycles, RAM, and disk storage were all scarce and expensive.

Here are two of the best books I've ever read on the subject, and
would highly recommend everyone interested in deployment read:

http://www.oreilly.com/catalog/spt/

I'd also recommend a book by Adrian Cockcroft, which I believe was
called Solaris Performance Tuning. I'm *shocked* to find almost
zero references to that book in Google. It was a really great book.

If it's out of print, it makes me want to run to my bookshelf and
make sure it's still around, because it's a really great book. :-)

--
-- Tom Mornini, CTO
-- Engine Yard
tmornini@engineyard.com (Guest)
on 2007-03-10 04:11
(Received via mailing list)
On Mar 9, 8:59 pm, "Alexey Verkhovsky" <alexey.verkhov...@gmail.com>
wrote:

>     session :off
>     def say_hi
>       render :text => 'Hi!'
>     end
>   end
>
> Note the "session :off" bit - this is very important.

What's the point of a test that has so little basis in real world
usage?

The idea is to measure realistic performance, not see how large a
number you can generate!

Without a session creation, you might just as well have served a
static HTML page, which would have returned higher numbers yet. :-)

--
-- Tom Mornini, CTO
-- Engine Yard
Alexey Verkhovsky (Guest)
on 2007-03-10 04:25
(Received via mailing list)
On 3/9/07, tmornini@engineyard.com <tmornini@gmail.com> wrote:

> What's the point of a test that has so little basis in real world
> usage?


Are we talking about "hello world", or real world? :)

For real world (Mephisto with page caching off), the same laptop does
~40 req/sec.

> Without a session creation, you might just as well have served a
> static HTML page, which would have returned higher numbers yet. :-)


True. And that was exactly what I wanted to establish with the hello
world test - that it's not much slower than serving static files.

Alex
Michael Kovacs (Guest)
on 2007-03-10 06:44
(Received via mailing list)
inline...

-Michael
http://javathehutt.blogspot.com

On Mar 9, 2007, at 7:07 PM, tmornini@engineyard.com wrote:

> There's a general understanding in Unix performance tuning that a load
> [...]
>
> If MySQL is running on the same box,  however, that would figure into
> the equation.
>

Assuming the scenario with everything running on one box, can you
elaborate on how MySQL would figure into the equation? At the
moment it seems like maybe RAM would be the only factor, because ruby
seems to be the CPU bottleneck by a long shot. I can't really even
get MySQL to blink. From my testing thus far (granted, not as extensive
as I'd like or as you and Ezra have no doubt performed) I don't see
how, in a single box environment, tuning anything other than the web
server process count and ruby/rails is going to make any performance
difference. Just curious if I'm way out of line with that thinking.
I'm basing my observation on my log file time division, where 80-90%+
is spent in ruby compared to 10% or less for MySQL.


> Unix system performance tuning is a very complex subject, and it
> changes all the time. That said, some of the best books written on the
> subject were written a long time ago, when resources such as CPU
> cycles, RAM, and disk storage were all scarce and expensive.
>
> Here are two of the best books I've ever read on the subject, and
> would highly recommend everyone interested in deployment read:
>
> http://www.oreilly.com/catalog/spt/
>
Thanks for the book recommendations... Time to go see if it's in
Safari so I can read it online :-)

Best,
-Michael
tmornini@engineyard.com (Guest)
on 2007-03-10 07:05
(Received via mailing list)
On Mar 10, 12:42 am, Michael Kovacs <kov...@gmail.com> wrote:

> > If MySQL is running on the same box, however, that would figure into
> > the equation.
>
> Assuming the scenario with everything running on one box can you
> elaborate about how MySQL would figure into the equation? At the
> moment it seems like maybe RAM would be the only factor because ruby
> seems to be the CPU bottleneck by a long shot. I can't really even
> get MySQL to blink.

Are you using ActiveRecordStore for sessions, or using the default
disk-based sessions?

If you're writing to disk, it could well be that your disks are
limiting your throughput, particularly if you are creating many
thousands of files in the same directory, which can have nasty
processor-punishing performance implications.

If you run top while the benchmark is running, what does the header
above the process list look like?

> From my testing thus far (granted not as extensive
> as I'd like or as you and Ezra have no doubt performed) I don't see
> how in a single box environment tuning anything other than the web
> server process count and ruby/rails is going to make any perf diff.
> Just curious if I'm way out of line with that thinking. I'm basing my
> observation on my log file time division where 80-90%+ is spent in
> ruby compared to 10% or less for MySQL.

No question you should focus the most on where you spend the most
time. But, *why* are you spending the time where you are?

You cannot tune anything generically. If you want really good results,
you have to get the tests as close to reality as possible. This is why
really good performance tuning guys must have access to production
systems. :-)

--
-- Tom Mornini, CTO
-- Engine Yard
tmornini@engineyard.com (Guest)
on 2007-03-10 07:17
(Received via mailing list)
On Mar 9, 10:25 pm, "Alexey Verkhovsky" <alexey.verkhov...@gmail.com>
wrote:

> On 3/9/07, tmorn...@engineyard.com <tmorn...@gmail.com> wrote:
>
> > What's the point of a test that has so little basis in real world
> > usage?
>
> Are we talking about "hello world", or real world? :)

:-)

Good point, and a fair statement.

"hello world" isn't real world, that's true, but it isn't quite a
fairy tale either, which
is a little closer to what I'd call a sessionless "hello world."

> > Without a session creation, you might just as well have served a
> > static HTML page, which would have returned higher numbers yet. :-)
>
>
> True. And that was exactly what I wanted to establish with hello world test
> - that it's not much slower than serving static files.

Well, that's a good test and interesting in and of itself, but it's
not really fair to post the results of that test as an example of how
his results were a bit lower than expected.

And, not much slower than serving static files? You should be getting
closer to 4 kreq/sec for static files, perhaps even faster on a local
system.

Here's a static test of a single Engine Yard slice, from another
slice, over gigabit ethernet.

ey00-s00070 ~ # ab2 -n 5000 -c 4 http://www.engineyard.com/404.html

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking 10.0.128.71 (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Finished 5000 requests


Server Software:        nginx/0.4.13
Server Hostname:        10.0.128.71
Server Port:            80

Document Path:          /404.html
Document Length:        619 bytes

Concurrency Level:      4
Time taken for tests:   1.179527 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      4150000 bytes
HTML transferred:       3095000 bytes
Requests per second:    4238.99 [#/sec] (mean)
Time per request:       0.944 [ms] (mean)
Time per request:       0.236 [ms] (mean, across all concurrent requests)
Transfer rate:          3435.28 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       3
Processing:     0    0   1.0      0      32
Waiting:        0    0   1.0      0      31
Total:          0    0   1.1      0      32

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      1
  95%      1
  98%      1
  99%      1
 100%     32 (longest request)
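
The summary lines in the report follow directly from its raw totals. A
minimal sketch that recomputes them, assuming the usual ApacheBench
definitions (requests divided by elapsed time, and per-request time
scaled by the concurrency level); the method name is made up for
illustration:

```ruby
# Recompute ApacheBench's summary figures from the totals in the report above.
def ab_summary(requests:, seconds:, concurrency:)
  {
    requests_per_sec:   requests / seconds,
    ms_per_request:     concurrency * seconds * 1000 / requests,  # "(mean)"
    ms_per_request_all: seconds * 1000 / requests                 # "(mean, across all concurrent requests)"
  }
end

s = ab_summary(requests: 5000, seconds: 1.179527, concurrency: 4)
printf("%.2f req/sec, %.3f ms (mean), %.3f ms (across all concurrent requests)\n",
       s[:requests_per_sec], s[:ms_per_request], s[:ms_per_request_all])
```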

--
-- Tom Mornini
Alexey Verkhovsky (Guest)
on 2007-03-10 07:44
(Received via mailing list)
On 3/9/07, tmornini@engineyard.com <tmornini@gmail.com> wrote:
>
> You  should be getting closer to
> 4 kreq/sec for static files, perhaps even faster on a local system.


That's right, 3.9 kreq/sec it is. I had an error in the httpd.conf.
Thanks for the heads-up.

Alex
Roller8 (Guest)
on 2007-03-12 05:08
(Received via mailing list)
Hmm, I'm a little confused about where I sit in this thread as it's
taken on a life of its own!  :)  But I'm glad I get to read through
the whole conversation.  Lots of insight.

OK, well, my hello world is just a piece of the Instant Gratification
chapter at the start of Agile Web Dev.  It's a single controller named
"Say" with one action titled "hello" and one rhtml template that has a
little HTML and one call to Time.now.  Then I use httperf with the
params I typed in the original post.

My MySQL is sitting over on a different machine (Dell 2950, dual
dual-core Xeon with 16 GB RAM, CentOS 4.4 on a SCSI SAS mirror, with
the MySQL DBs living on a local RAID 5 SCSI SAS set).  So no local DB
action.  These will all be connected to gigabit switches.

So, if I'm benchmarking low, are there any clues as to where I could
begin looking?  Also, I realize that Nginx is supposed to serve up the
static content, which is really fast.  So does that mean that rhtml
pages with ruby code, html and images will get partially served by
mongrels and Nginx, or is it that once we're in an rhtml template it's
100% mongrels?

Hmm, a bit at a loss here.  I'll benchmark my static html again just
to be sure, but I'm pretty certain I scored low there too according to
your numbers (~4000 req/sec).

Raul


tmornini@engineyard.com (Guest)
on 2007-03-12 06:46
(Received via mailing list)
On Mar 12, 12:08 am, Roller8 <roll...@gmail.com> wrote:

> So, if I'm benchmarking low are there any clues as to where I could begin
> looking?

Just the ones already mentioned above. Have you adjusted the number of
mongrels per machine, and are you using ActiveRecordStore for
sessions?

Don't forget that increasing the number of mongrels will require a
change to nginx.conf...
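
The nginx.conf in question is not shown in this thread, so the
following is only a sketch of the kind of change meant: it prints an
nginx upstream block with one server line per mongrel. The upstream
name, the host and the starting port 8000 are assumptions, not values
taken from the thread:

```ruby
# Build an nginx upstream block with one "server" line per mongrel.
# The upstream name, host and first port are assumptions for illustration.
def upstream_block(count, name: 'mongrel', host: '127.0.0.1', first_port: 8000)
  servers = (0...count).map { |i| "  server #{host}:#{first_port + i};" }
  (["upstream #{name} {"] + servers + ['}']).join("\n")
end

puts upstream_block(16)
```

The printed block would replace the existing upstream section, and the
ports have to match the ones the mongrel cluster actually listens on.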

> Also, I realize that Nginx is supposed to serve up the static
> content which is really fast.  So does that mean that rhtml pages with ruby
> code, html and images will get partially served by mongrels and Nginx or is
> it that once we're in an rhtml template that it's 100% mongrels?

In your configuration it looks like nginx is involved in every
request, and handles static content all by itself, with no mongrel
involvement.

For dynamic content, such as your benchmark, nginx takes the request,
but proxies back to mongrel for each request.

> Hmm, a bit at a loss here.  I'll benchmark my static html again just to be
> sure but I'm pretty certain i scored low there too according to your numbers
> (~4000 req/secs).

4k req/sec is for static content only. I'm sure you'd see similar if
not higher numbers for your configuration.

--
-- Tom Mornini, CTO
-- Engine Yard
Alexey Verkhovsky (Guest)
on 2007-03-12 17:47
(Received via mailing list)
What is your session storage config? If you didn't do anything
explicitly, sessions are persisted as files in ./tmp/sessions, and
that's slow.
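
A quick way to see whether file-based sessions are piling up is to
count the files in that directory. A minimal sketch, assuming it is run
from the application root and that the default tmp/sessions location is
in use (the method name is hypothetical):

```ruby
# Count regular files in a directory, e.g. the file-based session store.
def session_file_count(dir = File.join('tmp', 'sessions'))
  return 0 unless Dir.exist?(dir)
  Dir.children(dir).count { |name| File.file?(File.join(dir, name)) }
end

puts "session files: #{session_file_count}"
```

A count in the thousands after a benchmark run suggests every request
created a new session file.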

Alex Verkhovsky
roller8 (Guest)
on 2007-03-12 18:38
(Received via mailing list)
OK, well I haven't reached that point yet but I've noted this.  Thanks
for the information.  Also, I think I may have had too many nginx
processes running for my machine.  I left 6 from Ezra's config but
I've moved it to 4, one for each processor for now.  Early benchmarks
show an improvement of about 260 req/sec but I think there's still
some tweaking that needs to be done before I even start tweaking the
app with caching and stuff.  I'm sure I'm missing something somewhere.

Tom, I'm also moving to 4 mongrels per CPU to see where that takes
me.  Oh, and lastly, I'm using the hugemem kernel (Linux
2.6.9-42.0.10.ELhugemem).  I wonder if I should try the regular smp
kernel for this?  I guess I will try it out anyway.

Raul


Alexey Verkhovsky (Guest)
on 2007-03-12 19:22
(Received via mailing list)
On 3/12/07, roller8 <roller8@gmail.com> wrote:
>
> some tweaking that needs to be done before I even start tweaking the
> app with caching and stuff.  I'm sure I'm missing something somewhere.


Well, if you haven't done anything with sessions, I would bet my 2 cents
that session persistence through the file system is your current
bottleneck. At least, it was for many people at a similar stage.

The typical price of it, if I remember correctly, is around 20 to 30
msec per request, and it just keeps on growing.
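
To put that cost in perspective, here is a rough sketch of the ceiling
it implies. It assumes each mongrel serves one Rails request at a time
and that the 20 to 30 msec is time the process spends blocked; real
requests cost more than this overhead alone, so actual numbers would be
lower (the method name is made up for illustration):

```ruby
# Upper bound on requests/sec when each process serves one request at a time.
def throughput_ceiling(processes, ms_per_request)
  processes * 1000.0 / ms_per_request
end

[20, 30].each do |ms|
  puts "#{ms} ms/request, 5 mongrels: at most #{throughput_ceiling(5, ms).round} req/sec"
end
```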

Alex
tmornini@engineyard.com (Guest)
on 2007-03-12 21:46
(Received via mailing list)
On Mar 12, 1:37 pm, "roller8" <roll...@gmail.com> wrote:

If reducing mongrels improved your score on that box, then it's almost
certainly using disk based sessions. Fewer mongrels means less disk
thrashing...

> Tom, I'm also moving to 4 mongrels per CPU to see where that takes
> me.

That's 4 mongrels per *core*.

> Oh, and lastly, I'm using the hugemem kernel (Linux
> 2.6.9-42.0.10.ELhugemem).  I wonder if I should try the regular smp
> kernel for this?  I guess I will try it out anyway.

You're going to get far better results by checking your session config
and reporting back to us...

If you switch to DB backed sessions with ActiveRecordStore, which
is very easy, you're likely going to see a major performance increase.
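
For readers following along, a sketch of what that switch looked like
in Rails of that era (1.2.x), assuming the stock config/environment.rb
layout. It is a fragment rather than a runnable script, and should be
checked against the Rails version in use:

```ruby
# config/environment.rb (Rails 1.2-era layout, shown as a fragment)
Rails::Initializer.run do |config|
  # Store sessions in the database instead of files under tmp/sessions.
  config.action_controller.session_store = :active_record_store
end

# The sessions table is created beforehand with:
#   rake db:sessions:create
#   rake db:migrate
```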

--
-- Tom Mornini, CTO
raul@roller8.com (Guest)
on 2007-03-12 22:42
(Received via mailing list)
Yes, I think that was a definite problem.  I had tons of files in my
tmp sessions too that I've since cleared.  I'm now switched to
ActiveRecordStore even though I don't yet have any DB access going on.
I'll be reporting back within minutes with my new results.

Raul
roller8 (Guest)
on 2007-03-12 23:22
(Received via mailing list)
OK I'm now running 4 nginx and 16 mongrels (4 per core) and getting
420 to 450 req/sec.  So I think that may sound a little more proper
for the hardware I'm running then?
Benjamin Ritcey (Guest)
on 2007-03-16 00:36
(Received via mailing list)
tmornini@engineyard.com wrote:
>
> I'd also recommend a book by Adrian Cockcroft, which I believe was
> called Solaris Performance Tuning. I'm *shocked* to find almost
> zero references to that book in Google. It was a really great book.
>
> If it's out of print, it makes me want to run to my bookshelf and
> make sure it's still around, because it's a really great book. :-)
>

I'm guessing you're thinking of "Sun Performance and Tuning"?  It is
indeed a good book, but a bit dated.

http://www.amazon.com/Sun-Performance-Tuning-Java-...
tmornini@engineyard.com (Guest)
on 2007-03-17 22:00
(Received via mailing list)
Yes! That's it exactly.

I'm surprised the Google didn't help me out on that one. :-)

Sorry for the mis-reference, everyone!