Questions about proxy logging and a sanity check

mike · April 20, 2008, 5:01am

I am attempting to replace LVS with nginx for load balancing.

Basically, one front-end load balancer (running nginx) to 3 backend
servers (soon to be running nginx + fastcgi/php)

Using these variables, it looks like it shows -all- the defined
upstreams it had to try.

$upstream_addr
$upstream_response_time
$upstream_status

I do realize it orders them differently. Is the first one listed
always the one that it used? i.e.

log_format main '$upstream_addr - $upstream_response_time - $upstream_status';

I looked to see if there was a variable that had the value of the
server that replied…

On a successful request, it shows a simple one line.
10.13.5.14:80 - 0.001 - 200

On a non-successful request (i.e. a 404) I see:
10.13.5.12:80, 10.13.5.14:80 - 0.000, 0.001 - 404, 404

Or, does it try each backend one at a time, and it just happens so
fast it looks like it’s done in parallel? and is it safe to say the
first one listed is the best option?

I plan to use this frontend nginx server for:

ssl (if i need it) - client<->frontend nginx (SSL). frontend nginx
<-> backend (non-SSL)
gzipping/compression - client<->frontend nginx (gzip). frontend
nginx <-> backend (uncompressed)
expires headers (perhaps) …

I am doing maybe 5 or 6 million requests per day right now. All of
them would be proxied through this frontend nginx server. nginx will
be running on a quad-core xeon 3220 w/ 2 gigs of ram available for
this. Would it be a problem?

mike · April 20, 2008, 5:16am

We do about 3m / day per server (Quad Core Apple Xserve), we use
around 4% cpu.

Cheers

Dave

mike · April 20, 2008, 11:04am

On Sat, Apr 19, 2008 at 07:51:39PM -0700, mike wrote:

$upstream_status
10.13.5.14:80 - 0.001 - 200

On a non-successful request (i.e. a 404) I see:
10.13.5.12:80, 10.13.5.14:80 - 0.000, 0.001 - 404, 404

Or, does it try each backend one at a time, and it just happens so
fast it looks like it’s done in parallel? and is it safe to say the
first one listed is the best option?

nginx tries them sequentially. As to 404, it seems that you set

 proxy_next_upstream  ...  http_404 ...;

The $upstream_… variables show all tries.

I plan to use this frontend nginx server for:

ssl (if i need it) - client<->frontend nginx (SSL). frontend nginx
<-> backend (non-SSL)

gzipping/compression - client<->frontend nginx (gzip). frontend
nginx <-> backend (uncompressed)

expires headers (perhaps) …

nginx can do all this.
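
A minimal sketch of that frontend role, for illustration only. The upstream name `backend`, the third backend address, the hostname, the certificate paths, and the `expires` value are assumptions, not taken from this thread. `listen 443 ssl` is the syntax of current nginx releases; versions from 2008 used `ssl on;` instead.

```nginx
# Goes inside the http { } block of the frontend nginx.
upstream backend {
    server 10.13.5.12:80;
    server 10.13.5.13:80;   # assumed address of the third backend
    server 10.13.5.14:80;
}

server {
    listen 443 ssl;                                  # SSL ends at the frontend
    server_name example.com;                         # placeholder hostname
    ssl_certificate     /etc/nginx/ssl/example.crt;  # hypothetical path
    ssl_certificate_key /etc/nginx/ssl/example.key;  # hypothetical path

    gzip on;                         # compress responses sent to the client

    location / {
        proxy_pass http://backend;   # plain HTTP hop to the backends
        expires 1h;                  # example lifetime, adds Expires/Cache-Control
    }
}
```

Whether the backend hop is really uncompressed depends on the backends' own configuration, so `gzip off;` there still matters.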

I am doing maybe 5 or 6 million requests per day right now. All of
them would be proxied through this frontend nginx server. nginx will
be running on a quad-core xeon 3220 w/ 2 gigs of ram available for
this. Would it be a problem?

SSL and gzip are CPU hogs. I can not say about SSL.

The cluster that runs www.rambler.ru/images.rambler.ru/etc serves about
500-900 million requests per day. Most of them are static files,
but about 22 million are proxied and some of them are gzipped.

The cluster has several computers, but 2 computers with Athlon 64 X2 Dual
Core 4200+ / 4G are enough.

When a computer handles about 7000 requests/s, the network interrupt
handler becomes one of the CPU hogs, but this is not your case.

mike · April 20, 2008, 11:40am

On 4/20/08, Igor S. wrote:

nginx tries them sequentially. As to 404, it seems that you set
proxy_next_upstream  ...  http_404 ...;

Correct:

proxy_next_upstream error timeout http_500 http_503 http_404
invalid_header;

(I figure this will allow for any machines that are out of date,
throwing bad gateways, or anything else invalid to be passed on)

log_format proxy '[$time_local] $remote_addr - "$request" - $status -
"$upstream_addr" - "$upstream_response_time" - "$upstream_status"';

If I use this format, will the first server in $upstream_addr be the
one that actually replied? Is there any logic to it?

Thanks!

mike · April 20, 2008, 11:55am

Thanks!

I should be making sure that my backend/upstream servers have gzip off
and SSL off, since the nginx running as a load balancer will handle
that right?

I have defined gzip off; etc. on the upstream servers. On the nginx
proxy/load balancer server:

gzip on;
gzip_static on;
gzip_proxied any;
gzip_min_length 1100;
gzip_comp_level 2;
gzip_types text/plain text/html text/css application/x-javascript
text/xml application/xml application/xml+rss text/javascript;
gzip_disable "MSIE [1-6].";
gzip_vary on;

This should ensure all content outbound that is text is gzipped right?
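
A trimmed sketch of the same block, with two hedged observations. According to the gzip module documentation, text/html responses are always compressed once gzip is on, so listing it in gzip_types is redundant (nginx warns about a duplicate MIME type). gzip_static serves pre-compressed .gz files from local disk and comes from a separate module, so it probably has no effect on purely proxied responses; it is left out here on that assumption.

```nginx
gzip on;
gzip_proxied any;
gzip_min_length 1100;      # judged by the Content-Length response header
gzip_comp_level 2;
# text/html is compressed implicitly and is not listed
gzip_types text/plain text/css application/x-javascript
           text/xml application/xml application/xml+rss text/javascript;
gzip_disable "MSIE [1-6].";
gzip_vary on;
```

So not every text response is gzipped: only the listed types plus text/html, only for clients that send Accept-Encoding with gzip, and not for user agents matched by gzip_disable.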

mike · April 20, 2008, 11:46am

On Sun, Apr 20, 2008 at 02:32:15AM -0700, mike wrote:

(I figure this will allow for any machines that are out of date,
throwing bad gateways, or anything else invalid to be passed on)

log_format proxy '[$time_local] $remote_addr - "$request" - $status -
"$upstream_addr" - "$upstream_response_time" - "$upstream_status"';

If I use this format, will the first server in $upstream_addr be the
one that actually replied? Is there any logic to it?

No, the actual reply is from the last server.
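
To make that ordering visible in the log itself, one option is to label the fields. This is a sketch only: the format name `proxy_tries`, the labels, and the log path are assumptions. Values appear in the order the backends were tried, so the last comma-separated entry in each field belongs to the server whose reply went to the client.

```nginx
# Adjacent quoted strings are concatenated into one format.
log_format proxy_tries '[$time_local] $remote_addr "$request" $status '
                       'tried="$upstream_addr" '
                       'times="$upstream_response_time" '
                       'statuses="$upstream_status"';

access_log /var/log/nginx/proxy.log proxy_tries;   # hypothetical path
```

For the 404 case quoted earlier in the thread, the fields would read tried="10.13.5.12:80, 10.13.5.14:80" and statuses="404, 404", meaning 10.13.5.14:80 was tried last and its reply was returned.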