Bad performance of nginx with Tomcat vs. Apache with Tomcat

Some time ago, I posted a message about nginx performance when paired
with Tomcat.
We recently did extensive in-house testing of various workloads against
nginx with Tomcat vs. Apache with Tomcat.

Apache wins hands down.

Here’s the basic setup

  1. Nginx (2 proc/8192 connections) -> http/1.0 -> Tomcat (HTTP
    connector)
  2. Apache (512 prefork) -> AJP -> Tomcat (AJP)

Both with KeepAlive off (we don't use KeepAlive because of our L4 switch).
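
Concretely, the two proxy legs look roughly like this (the Apache side
could be mod_jk or mod_proxy_ajp; mod_proxy_ajp with Tomcat's default AJP
port is shown here):

    # 1. nginx -> Tomcat HTTP connector (plain HTTP/1.0 proxying)
    location ~ \.jsp$ {
        proxy_pass http://127.0.0.1:8080;
    }

    # 2. Apache prefork -> Tomcat AJP connector
    ProxyPass / ajp://127.0.0.1:8009/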

The physical server is a 2-core Intel Xeon, which is the typical web
server configuration here.
We have three Grinder 3.2 load generators.
We tested serving simple 4K and 20K HTML files from Tomcat, a 20K simple
HTML page where 10% of requests get an intentional 200ms sleep in the
Tomcat code (to emulate a slow DB query), and so on.

In every single case, Apache wins by at least 10-15%, in both throughput
and response time.
Nginx uses somewhat fewer CPU cycles (10-20% less), but it is not able to
drive Tomcat to 100% CPU.

Here’s my take on this performance problem.

  1. Lack of AJP support (a compact binary protocol for talking to Tomcat)
    First of all, this is a serious bottleneck.

    • AJP has much less communication overhead than HTTP
  2. Lack of HTTP KeepAlive support for the proxy module

    • The lack of AJP could be compensated for by HTTP keepalive to the
      backend. As it stands, there are at least twice as many TIME_WAIT
      sockets, and the mean connection establishment time is two to three
      times slower than with Apache.
  3. Lack of connection pooling

    • Ey-balancer makes things a bit better (response times become more
      stable), but average TPS and response time stay the same.
  4. There seems to be a huge bug in the connection management code

    With a mix of two transaction types (20K HTML serving, and 8K HTML
    with an intentional 200ms delay in the Tomcat logic):

    With Apache, 20K HTML serving took 36 ms on average while the 8K HTML
    took 258 ms.
    With nginx, 20K HTML serving took 600 ms on average while the 8K HTML
    took 817 ms.

    I really cannot explain these differences. Neither TCP connection
    overhead nor the lack of AJP accounts for them.

My question is: should I abandon nginx at this point?
I know nginx is a great proxy and static file server, but I cannot prove
my point with Tomcat no matter how many times I test.

Thank you

Chang

I think it would be helpful if you showed us your nginx config :)

Regards,
Istvan

You can use the native connector.
Try again with Http11AprProtocol.
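
For example, in server.xml (the port and thread count simply mirror the
setup described in this thread; the APR native library, libtcnative, must
be installed for this protocol to load):

    <!-- server.xml: switch the HTTP connector to the APR/native implementation -->
    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11AprProtocol"
               maxThreads="512"
               connectionTimeout="20000" />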

2009/9/3 Chang S. [email protected]

  1. Apache (512 prefork) → AJP → Tomcat (AJP)
    Every single case, Apache wins by at least 10-15%.
  • Ey-balancer makes things a bit easier, response times are stable, but
    With Nginx, 20K HTML serving took 600 ms on average while 8K HTML took

Chang

imcaptor

Istvan, it didn't really matter what config I used.

I tried all combinations of

worker_processes (2-512)
worker_connections (512-16000)
accept_mutex (on/off)
tcp_nopush (on/off)
tcp_nodelay (on/off)
proxy_buffer* (various sizes)
and every other proxy-related parameter you can imagine.

The following showed the best performance across the board:

worker_processes 2;      # since we have a 2-core machine
worker_connections 16000;
accept_mutex off;
max_connections 256;      # ey-balancer (Tomcat had 512 threads)
everything else at defaults
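
For reference, max_connections is the ey-balancer directive and goes in
the upstream block, roughly like this (the upstream name is illustrative;
proxy_pass then points at it):

    upstream tomcat {
        server 127.0.0.1:8080;
        max_connections 256;   # ey-balancer: queue requests beyond 256 concurrent backend connections
    }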

Thank you.

I see.
Well, I was able to reach about 50K req/s on a single node with nginx by
tuning Linux, the TCP stack, and nginx, and I learned one thing:

measure instead of guessing

(and as a side effect: debug instead of guessing.)

So, if I were you, I would start dstat (dstat -cgilpymn) on that host to
watch the different parts of your system, enable debug logging, and even
strace nginx.
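
Something along these lines, for example (picking the worker PID with
pgrep is just one way to do it):

    dstat -cgilpymn 1                                        # cpu, paging, interrupts, load, procs, system, memory, network
    strace -tt -p "$(pgrep -f 'nginx: worker' | head -1)"    # trace one nginx worker's syscalls with timestamps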

This is the way I think.

Regards,
Istvan

imcaptor.

I will try the native APR connector soon.
Thank you

Hello!

On Thu, Sep 03, 2009 at 01:50:44PM +0900, Chang S. wrote:

connector)
2. Apache (512 prefork) -> AJP -> Tomcat (AJP)

Both KeepAlive off (we don’t use KeepAlive due to L4 switch)

As long as you don't care about concurrency > 512 and client
keepalive, Apache may be the better (or at least easier) choice here.

Tomcat to 100% CPU.

  • Lack of AJP may be compensated with HTTP keepalive support since
    there are
    at least twice the number of TIME_WAIT sockets (connection
    establishment mean time
    is at least twice - three times slower than that of Apache)

I believe Tomcat has noticeable overhead in its connection
establishment code (though I never tested it myself), so the lack of
backend keepalive support may be an issue in your case.

System overhead shouldn't be noticeable (it's about 1 ms on usual
backend networks).

With Apache, 20K HTML serving took 36 ms on average while 8K HTML
took 258 ms
With Nginx, 20K HTML serving took 600 ms on average while 8K HTML
took 817 ms

I really cannot explain these difference. Not even TCP connection
overhead or lack of AJP.

I believe these time differences have something to do with how
Tomcat handles connection establishment / close.

Currently nginx will consider a request complete only after the
backend closes the connection, and if Tomcat waits a bit for some
reason before closing the connection, this may lead to the above numbers.

You may try to capture a single request between nginx and Tomcat
with tcpdump to verify this.
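
Something like this should be enough to see the connection setup and
close timing (assuming Tomcat listens on 127.0.0.1:8080 on the same host,
as in your config):

    # capture a single nginx <-> Tomcat exchange on the loopback interface
    tcpdump -i lo -nn -s 0 -A port 8080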

Maxim D.

Hi, Istvan

It is not about 50K single-node nginx throughput.
I have standard TCP/IP tuning settings.

We can reach 70K req/s in some workloads.
It all depends on the workload.

I have not given all the details, which is why you are saying that, but
we are measuring and capturing every possible system resource under /proc.

dstat does not tell you everything.
We are currently using collectl and collectd, which capture everything
under /proc.

This is a sample of the nginx access log (the upstream response time and
the nginx service time are both there):

[03/Sep/2009:12:19:02 +0900] 10.25.131.46 200 8254 gzip:-% conns:
199229 up_response_t:0.392 svc_t:0.831 “GET /index_think.jsp HTTP/1.1”
[03/Sep/2009:12:19:02 +0900] 10.25.131.48 200 20524 gzip:-% conns:
199622 up_response_t:0.150 svc_t:0.668 “GET /static/20k.jsp HTTP/1.1”
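
For reference, up_response_t and svc_t map to nginx's
$upstream_response_time and $request_time; the log_format is roughly the
following (a reconstruction, so the exact string may differ, and the
conns: counter is omitted since it is not a standard variable):

    log_format timing '[$time_local] $remote_addr $status $body_bytes_sent '
                      'gzip:$gzip_ratio% '
                      'up_response_t:$upstream_response_time svc_t:$request_time '
                      '"$request"';
    access_log logs/access.log timing;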

I don't think we are in a position to dig deeper into nginx since we
need to move on.
Thanks

On Sep 4, 2009, at 7:44 AM, Maxim D. wrote:

Apache wins hands down.
keepalive - Apache may be better (or at least easier) choise here.
Right.
In the real world, where Tomcat uses most of the CPU cycles and web
servers are tightly capacity-planned for horizontal scalability, we
should not allow more concurrency than that on one dual-core web server.

We are simply doing a stress test to find the maximum capacity of nginx
vs. Apache (with CPU usage comparisons as well).

Tomcat to 100% CPU.

  • Lack of AJP may be compensated with HTTP keepalive support since
    backend networks).
Maxim, 1 ms matters for us. The average response time for our short HTML
files is around 1 ms, or even less.

Currently nginx will consider request complete only after
backend’s connection close, and if Tomcat waits a bit for some
reason before closing connection - this may lead to above numbers.

You may try to capture single request between nginx and Tomcat
with tcpdump to prove this.

This is an excerpt from the access log file:

  • Lightly loaded nginx

up_response_t:0.002 svc_t:0.002 “GET /static/20k.jsp HTTP/1.1”
up_response_t:0.204 svc_t:0.204 “GET /index_think.jsp HTTP/1.1”

The 20k file took 2 ms; the intentional 200ms-delay file took 204 ms.

  • Heavily loaded nginx

up_response_t:0.427 svc_t:0.889 “GET /index_think.jsp HTTP/1.1”
up_response_t:0.155 svc_t:0.430 “GET /static/20k.jsp HTTP/1.1”

The 20k text file took 430 ms, and the 200ms-delay file took 889 ms.

So for the 20k HTML file, the upstream (Tomcat) processing took 155 ms,
and the remaining ~275 ms of the 430 ms total was spent queued inside nginx.

I don't think tcpdump would reveal anything other than the packet
exchanges; delays of this magnitude happen inside nginx.

Maybe a bug in ey-balancer?
Don’t know.

I have been an nginx advocate in my company, but since we are moving to
a Tomcat setup, I am not able to convince people to use nginx at this point.

Thanks, Maxim

Hello!

On Fri, Sep 04, 2009 at 08:48:57AM +0900, Chang S. wrote:

against
Both KeepAlive off (we don’t use KeepAlive due to L4 switch)
vs apache (CPU usage comparisons as well)
Am I right in the assumption that nginx and Tomcat run on the
same host? This isn't recommended since they will compete for
resources in a not really predictable way, but at the very least you
have to increase the nginx workers' priority in this case.

Tomcat to 100% CPU.

  • Lack of AJP may be compensated with HTTP keepalive support since
    backend networks).

Maxim.
1ms matters for us. Average response time for short HTML files ranges
1ms. even less.

Typical dynamic HTML generation will take much more, while static
content should be served directly by nginx. So on a usual setup the
benefit from keepalive connections to backends isn't really measurable.

Note that I'm not trying to convince anybody that keepalive to
backends isn't needed, just explaining why it's not a high-priority
task in development.

up_response_t:0.002 svc_t:0.002 “GET /static/20k.jsp HTTP/1.1”
up_response_t:0.204 svc_t:0.204 “GET /index_think.jsp HTTP/1.1”

20k file takes 2ms, the intentional 200ms delay file took 204ms.

These numbers look pretty good. Basically nginx introduced no
delay, so my assumption about connection close problems was wrong.

I don’t think tcpdump reveals anything of other than packet exchanges;
delays of these magnitudes
happens in nginx

Maybe a bug in ey-balancer?
Don’t know.

The EY balancer will queue requests that don't fit into the configured
limit, and the queue time will be counted in $request_time (not
$upstream_response_time). If you used it in your test, that explains
the above numbers under load.

I have no idea how correct the EY balancer is, but in any case it will
not be able to fully load a number of backend workers equal to the
configured max_connections, since the queue sits in nginx and there is
always some delay between requests from the backend's point of view.

Unless you have a huge number of backends each capable of processing
only a small number of concurrent connections, you may want to try the
standard round-robin balancer instead. Make sure you have a reasonable
listen queue size on your backend.
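
As a sketch of what I mean (the values are only examples; acceptCount is
Tomcat's listen backlog):

    # nginx side: plain round-robin upstream, no max_connections queue
    upstream tomcat {
        server 127.0.0.1:8080;
    }

    <!-- Tomcat side (server.xml): a listen queue large enough to absorb bursts -->
    <Connector port="8080" protocol="HTTP/1.1"
               maxThreads="512" acceptCount="1024" />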

Maxim D.

Well, we can argue about the tools you are using, but the point is that
you have an obvious performance drop somewhere :)

Better to move in that direction so you can find it and fix it.

However, I have no problem if you keep using Apache; I just don't see
the point of sharing your experience here if you don't want to share
the config.

regards,
Istvan

imcaptor, man, you are a lifesaver.

That changes everything. I built Tomcat with the APR native connector,
and nginx came back alive and well!!

Nginx now beats Apache by at least 10-20% in throughput, and response
times are better by 10-20%.

I will have to check other workloads as well, but nginx holds up pretty
well.

This is great, thank you very much.

For those who use an nginx - Tomcat setup, Http11Apr is A MUST!!

On Sep 4, 2009, at 6:52 PM, Maxim D. wrote:

In a real world, where Tomcat uses most of CPU cycles and web servers
are tightly capacity planned for horizontal scalability, we should
not allow more than that in one dual-core web server.

We are simply doing stress test to find out max capacity of nginx
vs apache (CPU usage comparisons as well)

Am I right in the assumption that nginx and Tomcat runs on the
same host? This isn’t recommended since they will compete for
resources in some not really predictable way, but at least you
have to increase nginx workers priority in this case.

All of our WAS (web application server) machines are configured with
Apache and Tomcat on the same box, so nginx and Tomcat have to be
configured on the same server as well.

Nginx uses a bit less CPU cycles (10-20%), but it is not able drive

System overhead shouldn’t be noticable (it’s about 1 ms on usual
Note that I do not try to convince anybody that keepalive to
backends isn’t really needed, just explaining why it’s not a high
priority task in development.

Right, in a typical WAS the app server takes most of the user CPU
cycles, and yes, the backend keepalive benefits are not measurable.

As I already mentioned in the other mail, the Http11Apr protocol
changes everything.
Now I have better-than-Apache performance.
Nginx rocks!!! You guys are the best.

intentional 200ms delay in Tomcat logic
Tomcat handles connection establishment / close.

  • Lightly loaded nginx

up_response_t:0.002 svc_t:0.002 “GET /static/20k.jsp HTTP/1.1”
up_response_t:0.204 svc_t:0.204 “GET /index_think.jsp HTTP/1.1”

20k file takes 2ms, the intentional 200ms delay file took 204ms.

These numbers looks pretty good. Basically nginx introduced no
delay, so my assumption about connection close problems was wrong.

Yes, the above numbers are good.

$upstream_response_time). If you use it in your test - it
explains the above numbers under load.

reasonable listen queue size on your backend.

Maxim D.

Yes, the queuing time counts against $request_time.
Maxim, this workload represents one in which nginx should shine: when
some requests in the app server or DB server take longer than usual, a
web server thread/process should not hold up the other short requests
the way Apache's does.
That was the original intention of the workload.
My expectation was that nginx would behave like the lightly loaded case
all the way through.

As you mentioned, though, the queue sits in nginx while Tomcat is what
actually holds up the requests.

These are all very synthetic microbenchmarks that do not represent a
real-world scenario; they are just meant to show nginx vs. Apache
performance characteristics.

Since the Http11Apr protocol lets nginx show its full strength, I am a
happy feet again ;)

Thanks for everything, Maxim. (and Igor, of course)

Alright, I was just trying to tell you that sometimes you have to step
back a bit and see the whole picture, that is all. So in the end the
Http11Apr connector was the issue, if I understand your last mail
correctly.

nginx ftw :)

regards,
Istvan

I think I have already said everything about my config; nginx has a very
simple config.
Here is my config, aside from all the parameter variations I tried
(ey-balancer is off):

worker_processes  2;

events {
    worker_connections  8192;
    accept_mutex        off;
}

http {
    include            mime.types;
    default_type       application/octet-stream;
    access_log         off;
    keepalive_timeout  0;

    server {
        listen       80;
        server_name  localhost;

        location / {
            root   html;
            index  index.html index.htm;
        }

        location ~ \.jsp$ {
            root        /usr/local/tomcat5.5;
            proxy_pass  http://127.0.0.1:8080;
        }
    }
}

I didn't want to include the config file since I don't think it reveals
anything wrong, and there were so many variations of it, as I already said.

I just wanted to share my nginx-with-Tomcat performance experience to
see if anyone else has the same problem or already has a solution.

Thank you for your reply.

On Sep 04, Chang S. wrote:

I will have check other workloads as well, but nginx holds up pretty well.

This is great, Thank you very much.

For those who uses Nginx - Tomcat setup, Http11Apr is A MUST!!

Also try the NIO connector and make sure it uses the epoll implementation.
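
For example (again with illustrative values; on Linux with a recent JDK
the NIO selector is backed by epoll):

    <!-- server.xml: NIO HTTP connector -->
    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="512"
               connectionTimeout="20000" />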