How do proxy_module response buffering options work?

I’m trying to understand how the different proxy_module response
buffering options work. The current documentation is a little bit vague.
Here’s how I currently think buffering works:

Nginx allocates a number of separate buffers for buffering the response.
They are probably used as a ring for efficiency reasons.

The first buffer is used to store the response header, and must be large
enough to store the header. I’m guessing this is because the proxy
module’s response header parser can only operate on a contiguous block
of memory. The size of this first buffer can be customized through
proxy_buffer_size, but this setting has no effect on the other buffers.
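
For concreteness, this is roughly the kind of configuration I am talking
about (the sizes and the upstream name are purely illustrative):

  location / {
      proxy_pass http://backend;

      # buffer for the response header (the first buffer)
      proxy_buffer_size 8k;

      # number and size of the buffers for the response body
      proxy_buffers 8 8k;
  }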

If proxy_buffering is turned on then Nginx will fill all proxy buffers
with data, either until the upstream sends EOF or until the buffers are
full. When one of those conditions occurs Nginx will flush the data to
the client. If the client does not read data quickly enough, Nginx will
write the unflushed buffer data to disk and flush it when the
client can read more data. This is repeated until the upstream sends
EOF.
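
In configuration terms I believe that corresponds to something like this
(the path is purely illustrative):

  proxy_buffering on;    # the default
  # where the overflowing data is spooled when the client is slow
  proxy_temp_path /var/cache/nginx/proxy_temp;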

If proxy_buffering is turned off then Nginx will also fill all proxy
buffers. However, it tries to flush the data to the client immediately.
If the upstream sends data faster than the client can read it then the
proxy buffers will eventually be full and the upstream will be throttled
until the client can read more data.

Is this correct? There’s also proxy_busy_buffers_size but I have no idea
what that is.

Hello!

On Sun, Apr 24, 2011 at 04:49:23PM +0200, Hongli L. wrote:

The size of this first buffer can be customized through
proxy_buffer_size, but this setting has no effect on the other buffers.

If proxy_buffering is turned on then Nginx will fill all proxy buffers
with data, either until the upstream sends EOF or until the buffers are
full. When one of those conditions occurs Nginx will flush the data to
the client. If the client does not read data quickly enough, Nginx will
write the unflushed buffer data to disk and flush it when the
client can read more data. This is repeated until the upstream sends
EOF.

Yes.

Additionally, there is proxy_max_temp_file_size, which controls how
much data may be written to disk. Once the temporary file grows bigger
than this limit, nginx pauses reading data from upstream until the data
from the temporary file has been sent to the client.
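
A hypothetical example, with a purely illustrative value:

  # allow at most 100m of the response to be spooled to disk per request
  proxy_max_temp_file_size 100m;

  # a value of 0 disables buffering to temporary files entirely
  # proxy_max_temp_file_size 0;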

If proxy_buffering is turned off then Nginx will also fill all proxy
buffers. However, it tries to flush the data to the client immediately.
If the upstream sends data faster than the client can read it then the
proxy buffers will eventually be full and the upstream will be throttled
until the client can read more data.

Yes. But only the main buffer (proxy_buffer_size) will be used to
read the response.
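
That is, with a configuration like the following (size purely
illustrative), only the single 8k buffer is used to read the response:

  proxy_buffering off;
  proxy_buffer_size 8k;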

Is this correct? There’s also proxy_busy_buffers_size but I have no idea
what that is.

Busy buffers are buffers which have already been passed downstream but
are not yet completely sent (and hence can’t be reused). The
proxy_busy_buffers_size directive limits the maximum total size of
such buffers and thus allows the remaining buffers to be used to read
the upstream response (and spool it to disk if needed).
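
For example (sizes purely illustrative):

  proxy_buffer_size 8k;
  proxy_buffers 8 8k;
  # at most 16k of buffer space may be busy (already passed downstream
  # but not yet completely sent) at any time; the rest stays available
  # for reading from upstream and spooling to disk
  proxy_busy_buffers_size 16k;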

Maxim D.

Maxim D. wrote in post #994788:

Busy buffers are buffers which have already been passed downstream but
are not yet completely sent (and hence can’t be reused). The
proxy_busy_buffers_size directive limits the maximum total size of
such buffers and thus allows the remaining buffers to be used to read
the upstream response (and spool it to disk if needed).

How can this be? Isn’t there only a single buffer at any time that’s
only partially passed downstream?

The proxy buffers are per request, not global, right?

Hello!

On Mon, Apr 25, 2011 at 12:07:30AM +0200, Hongli L. wrote:

Maxim D. wrote in post #994788:

Busy buffers are buffers which have already been passed downstream but
are not yet completely sent (and hence can’t be reused). The
proxy_busy_buffers_size directive limits the maximum total size of
such buffers and thus allows the remaining buffers to be used to read
the upstream response (and spool it to disk if needed).

How can this be? Isn’t there only a single buffer at any time that’s
only partially passed downstream?

No, nginx passes a chain of buffers downstream, not a single buffer.

The proxy buffers are per request, not global, right?

Right.

Maxim D.

Maxim D. wrote in post #994792:

No, nginx passes a chain of buffers downstream, not a single buffer.

I understand this, but I don’t understand how there can be more than 1
buffer at a time that’s only partially passed downstream. I had in mind
that it works like this:

Suppose that Nginx is configured with 4 buffers, each 100 bytes in size.
The upstream response is 400 bytes.
Bytes 0-99 from the response are put in buffer 1, bytes 100-199 are put
in buffer 2, bytes 200-299 are put in buffer 3, bytes 300-399 are put in
buffer 4.
Nginx flushes buffer 1 downstream, so buffer 1 is now a “busy buffer”
while all the others are not. Only when buffer 1 is entirely flushed
will Nginx continue to flush buffer 2 (buffer 1 is then marked
“ready” or similar so that it may be reused).

How can there be multiple busy buffers? Is my description correct?

Hello!

On Mon, Apr 25, 2011 at 10:27:05AM +0200, Hongli L. wrote:

in buffer 2, bytes 200-299 are put in buffer 3, bytes 300-399 are put in
buffer 4.
Nginx flushes buffer 1 downstream, so buffer 1 is now a “busy buffer”
while all the others are not. Only when buffer 1 is entirely flushed
will Nginx continue to flush buffer 2 (buffer 1 is then marked
“ready” or similar so that it may be reused).

How can there be multiple busy buffers? Is my description correct?

No. nginx passes multiple buffers downstream (or, more precisely,
passes them to the output filter chain), and these buffers can’t be
reused until completely sent (see below).

E.g. with 4 buffers as in your example something like this
happens:

  1. nginx gets bytes 0-99 from upstream into buf 1, and calls the output
    filter on it.

From this point buf 1 is busy - it can’t be touched until completely
sent, as some filter may have already modified it (e.g. converted the
charset in the charset filter) or done some other related work on it
(e.g. added a chunk size as per chunked transfer encoding). We can’t
spool it to disk and so on.

  2. nginx gets bytes 100-199 from upstream into buf 2, and calls the
    output filter on it. The same as above now applies to buf 2 as well.

Buffers 1 and 2 are now linked somewhere along the output filter chain
(most likely in the write filter, waiting for the client to allow
sending of additional data).

(Both (1) and (2) may also happen at the same time with a single
output filter call if the data for both buffers happened to be read at
once.)

Somewhere near here proxy_busy_buffers_size starts to play its
role: once data has been read into buffers 3 and 4, they are not passed
to the output filter (assuming proxy_busy_buffers_size is 200) and
nginx may spool their data to disk and reuse these buffers to read more
data from upstream.

(Obviously the case of a 400-byte response isn’t interesting: if we
have enough buffers to read the whole response, all buffers will
just be passed to the output filters. All of the above is actually
needed when the response is bigger than the available buffers.)
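
In configuration terms, the hypothetical 4 x 100-byte setup above would
look roughly like this (such tiny sizes are of course only for
illustration):

  proxy_buffer_size 100;
  proxy_buffers 4 100;
  # allow at most two of the four buffers to be busy at once
  proxy_busy_buffers_size 200;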

Maxim D.

Hello!

On Thu, Nov 17, 2011 at 09:52:57PM -0500, liuzhida wrote:

the rest of the data which can’t fit into the buffers will be written
into the temp file?

Yes, as long as it’s not possible to write the data to the client fast
enough.

Does this happen for every single response? Is proxy_max_temp_file_size
global or per request?

Per request. It limits the maximum size of the temporary file used to
buffer a single response.

For example, the upstream response is 2000 bytes and Nginx is configured
with 4 buffers, each 100 bytes in size. Does nginx deal with this
single response by writing 400 bytes of the response into the buffers
and the remaining 1600 bytes into the temp file?

Roughly, yes. (There are some nuances though: you can’t write
data to a file without using in-memory buffers, and that’s why some
memory buffers will be reserved for reading the response from upstream
and writing it to disk. And again, this all assumes it isn’t
possible to write anything to the client, while 2000 bytes usually
just fit into the socket buffer and will be passed to the kernel
immediately even if the client isn’t reading at all.)

And after the buffers are completely sent, will Nginx read the rest of
the response data from the temp file back into the buffers?

It will follow the usual procedure for sending file-backed data (the
same one used for static files), i.e. it will either use
sendfile() or read the data into output_buffers and send them.
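
For completeness, the relevant directives would be something like this
(values are only an illustration):

  sendfile on;             # let the kernel send the temporary file directly
  # or, with sendfile off, read the file into these buffers and send them
  output_buffers 2 32k;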

Maxim D.

Additionally, there is proxy_max_temp_file_size, which controls how
much data may be written to disk. Once the temporary file grows bigger
than this limit, nginx pauses reading data from upstream until the data
from the temporary file has been sent to the client.

Do you mean that if a response is larger than all the proxy buffers,
then after part of the response has been written into the buffers, the
rest of the data which can’t fit into the buffers will be written into
the temp file? Does this happen for every single response? Is
proxy_max_temp_file_size global or per request?

For example, the upstream response is 2000 bytes and Nginx is configured
with 4 buffers, each 100 bytes in size. Does nginx deal with this
single response by writing 400 bytes of the response into the buffers
and the remaining 1600 bytes into the temp file? And after the buffers
are completely sent, will Nginx read the rest of the response data from
the temp file back into the buffers?

Thx

liuzhida
