I’m working on a configuration for an nginx proxy that splits requests between two upstream servers. The main reason I’m using a proxy is for SSL termination and for redundancy between the two upstream servers. Each upstream server is just a simple nginx server with identical media files stored on each.
The largest media file requested is around 2.5 megabytes. There are files larger than that, but they are requested in byte ranges from our CDN.
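For reference, a minimal sketch of the setup being described; the hostnames, ports, and certificate paths are placeholders, not details from the original post:

```nginx
# Two identical storage servers behind one SSL-terminating proxy.
# All names and paths below are illustrative.
upstream media_backend {
    server 10.0.0.1:8080;   # storage server A
    server 10.0.0.2:8080;   # storage server B (identical files)
}

server {
    listen 443 ssl;
    server_name media.example.com;

    ssl_certificate     /etc/nginx/ssl/media.crt;
    ssl_certificate_key /etc/nginx/ssl/media.key;

    location / {
        proxy_pass http://media_backend;
    }
}
```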
I’m wondering how I should configure proxy buffering here. I noticed that the default is set to on: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffering. But I’m not sure the default values for the buffer sizes and whatnot are ideal. I’ve had proxy_buffering turned on for quite a while now, and have noticed that the proxy_temp directory has over 1 GB of data in it. To my understanding, this folder is used when the in-memory buffers cannot hold all of the data from the upstream, so the data is written to disk.
If I set proxy_buffering to off, does that mean that the proxy streams the data directly to the client without buffering anything? Essentially, that would mean that an nginx worker would be “busy” on both the upstream and proxy server for the entire duration of the request, correct?
If I keep it on, does it make sense to change the buffer sizes so that the entire response from the upstream can fit into memory? I assume that would speed up the responses, since nothing would be written to (slow) disk. From my novice perspective, it seems counterintuitive to essentially read a file from the upstream’s disk, write it to the proxy’s disk, and then read it from the proxy’s disk again.
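As a rough illustration of that idea, buffers could be sized so the largest ~2.5 MB response fits entirely in memory. The values below are an illustrative sketch, not tested recommendations, and the memory cost is per proxied connection:

```nginx
# Sized so a ~2.5 MB response never spills to proxy_temp.
proxy_buffering          on;
proxy_buffer_size        16k;      # first part of the response (headers)
proxy_buffers            160 16k;  # 160 x 16k = 2560k, >= largest media file
proxy_max_temp_file_size 0;        # belt and braces: never buffer to disk
```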
What is a common use case for using proxy_buffering? Since it’s a default option, I assume it’s commonly used and for good reason. I’m just having a hard time applying the thought process to my specific setup.
One thing I thought of is that proxy_buffering is ideal if you have slow clients, where downloading the media files could take a long time. In this case, the goal would be to free up upstream workers. However, since my upstream is NOT an application server, just nginx, is that really necessary?
The only thing I can think of there is that it could be bad to keep all those “slow” connections open while reading the response from disk. If there are 100 clients connected for different media files, and they are all downloading very slowly, maybe reading all of that at once would have a negative performance impact on the storage servers. But with proxy_buffering turned on, I assume that the entire response is read from disk RIGHT AWAY and then stored in the buffer on the proxy. But if the proxy is just writing that response back to disk, it doesn’t really matter much, does it?
— Original message —
From: “Maxim D.” [email protected]
Date: 3 September 2013, 17:58:00
As long as your backend servers aren’t limited in the number of connections they can handle, the best approach would be to keep proxy_buffering switched on, but switch off disk buffering using proxy_max_temp_file_size.
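In configuration terms, the suggestion above amounts to the following (a sketch, not part of the original reply):

```nginx
proxy_buffering          on;  # keep in-memory buffering
proxy_max_temp_file_size 0;   # a value of 0 disables buffering to disk entirely
```

With this, nginx buffers what it can in RAM and, when the buffers are full, simply stops reading from the upstream rather than spilling to proxy_temp.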
Could you explain why this approach is not suitable for the case when backend servers are limited in the number of connections they can handle?
On Tue, Sep 03, 2013 at 10:39:49AM -0400, bkosborne wrote:
On Wed, Sep 04, 2013 at 07:12:22AM +0300, wishmaster wrote:
[…]
when backend servers are limited on number of connections.
If backends are connection-bound, in many cases it’s more
effective to buffer responses to disk instead of keeping backend
connections busy.
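A hedged sketch of that connection-bound case, with illustrative (mostly default) values: letting large responses spill to disk frees the backend connection as soon as the upstream has finished sending, even if the client drains it slowly.

```nginx
# Backend connections are scarce, so spilling to disk is acceptable.
proxy_buffering            on;
proxy_buffers              8 8k;                          # default memory buffers
proxy_max_temp_file_size   1024m;                         # default cap per request
proxy_temp_file_write_size 16k;                           # default write chunk
proxy_temp_path            /var/cache/nginx/proxy_temp;   # placeholder path
```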
On Wed, Sep 04, 2013 at 10:45:12AM -0400, bkosborne wrote:
Hmm okay, so that would essentially buffer as much as it can in RAM (which really wouldn’t be much, based on the default buffer sizes). Once that in-memory buffer becomes full, then what happens? Does it start sending the data that’s in the buffer to the client, as well as anything that isn’t?
When all buffers are full, nginx will stop reading data from the upstream server until some buffers are sent to the client.
On Wed, Sep 04, 2013 at 11:40:27AM -0400, bkosborne wrote:
Why not just turn off buffering completely?
There are at least three reasons:
1. Turning off buffering will result in more CPU usage (and worse network utilization in some cases).
2. It doesn’t work with limit_rate (not even talking about proxy_cache, which implies disk buffering).
3. Even small memory buffering saves some backend connections, and you can tune the number/size of buffers used based on the available memory.
The general recommendation is to avoid switching off proxy_buffering unless your application really needs it, e.g. it does some form of low-bandwidth HTTP streaming and needs nginx to send data to a client immediately.
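For that streaming exception, a minimal sketch (the location path and upstream name are placeholders):

```nginx
# Only for responses that must reach the client immediately,
# e.g. low-bandwidth HTTP streaming.
location /stream/ {
    proxy_pass      http://media_backend;  # hypothetical upstream
    proxy_buffering off;
}
```

Alternatively, an upstream can disable buffering for a single response by sending an X-Accel-Buffering: no header, which avoids turning buffering off for the whole location.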