Optimize nginx for uploading/downloading huge files

Hello everyone,

Firstly, I’m sorry for posting the same question in several places (the
How-to forum as well).
This is an urgent issue for me.

Could you give me a tip on which directives and values I should set
for optimization?

I installed nginx on a virtual machine as a reverse proxy server.
It checks authentication and passes the client requests through to the
backend servers.
Most transactions are file uploads/downloads, ranging from 1 MB to 1 GB…
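Roughly, the proxy part of the setup looks like this (the upstream name
and address are placeholders, and the authentication check is omitted
since it is application-specific):

```nginx
# Sketch of the reverse-proxy setup described above.
# Server names and addresses are illustrative only.
http {
    upstream backend {
        server 10.0.0.10:8080;       # placeholder backend address
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
        }
    }
}
```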

The CPU usage hits 100% when 1 MB file download requests reach around
25 TPS…
and around 20 TPS for upload requests…

I have set client_body_buffer_size, proxy_buffer_size, proxy_buffers,
and so on…
But the results are still not good… :frowning:

client_body_buffer_size is set to 2M, which helps increase the upload
TPS, but not enough.
proxy_buffer_size 1M and proxy_buffers 4 1M help increase the download
TPS a little, but not much…
All other directives are left at their defaults.
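For reference, the changed part of the configuration looks like this
(the proxy_pass target is a placeholder):

```nginx
# Only these buffer directives were changed from their defaults.
location / {
    client_body_buffer_size 2m;    # helps upload TPS, but not enough
    proxy_buffer_size       1m;
    proxy_buffers           4 1m;  # helps download TPS a little
    proxy_pass http://backend;     # placeholder upstream
}
```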

Below is the result of profiling with google_perftools_module.
As you can see, most of the time is spent in readv() and writev().

Please give me any advice or your experience.
Thank you for your time.

Total: 22244 samples
11928 53.6% 53.6% 11930 53.6% readv
8212 36.9% 90.5% 8212 36.9% writev
907 4.1% 94.6% 907 4.1% recv
359 1.6% 96.2% 359 1.6% __connect_nocancel
248 1.1% 97.3% 248 1.1% __close_nocancel
35 0.2% 97.5% 46 0.2% _IO_str_pbackfail
34 0.2% 97.7% 34 0.2% __write_nocancel
31 0.1% 97.8% 31 0.1% socket
28 0.1% 97.9% 28 0.1% memset
and
Total: 64475 samples
34117 52.9% 52.9% 34117 52.9% writev
14173 22.0% 74.9% 14181 22.0% readv
11981 18.6% 93.5% 11981 18.6% recv
1046 1.6% 95.1% 1049 1.6% __connect_nocancel
878 1.4% 96.5% 878 1.4% __close_nocancel
343 0.5% 97.0% 343 0.5% epoll_wait
160 0.2% 97.2% 160 0.2% __write_nocancel
89 0.1% 97.4% 89 0.1% __xstat64
81 0.1% 97.5% 115 0.2% _IO_str_pbackfail

Here is the information of my VM:
ronjpark@PSEProxy:~/release/REL.0.13$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 2
model name : QEMU Virtual CPU version 0.14.1
stepping : 3
cpu MHz : 2393.998
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36
clflush mmx fxsr sse sse2 syscall nx lm up rep_good nopl pni cx16 popcnt
hypervisor lahf_lm
bogomips : 4787.99
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

ronjpark@PSEProxy:~/release/REL.0.13$ cat /proc/meminfo
MemTotal: 4057052 kB
MemFree: 2651832 kB
Buffers: 80984 kB
Cached: 1100692 kB
SwapCached: 3856 kB
Active: 325744 kB
Inactive: 910176 kB
Active(anon): 48884 kB
Inactive(anon): 5684 kB
Active(file): 276860 kB
Inactive(file): 904492 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2095100 kB
SwapFree: 2082688 kB
Dirty: 4 kB
Writeback: 0 kB
AnonPages: 51356 kB
Mapped: 15740 kB
Shmem: 324 kB
Slab: 66252 kB
SReclaimable: 56796 kB
SUnreclaim: 9456 kB
KernelStack: 1160 kB
PageTables: 4124 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4123624 kB
Committed_AS: 176112 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 28816 kB
VmallocChunk: 34359707236 kB
HardwareCorrupted: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 42996 kB
DirectMap2M: 4151296 kB
ronjpark@PSEProxy:~/release/REL.0.13$ uname -a
Linux PSEProxy 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux

Posted at Nginx Forum:

On Tue, Jan 3, 2012 at 5:35 PM, ronjpark [email protected] wrote:


The bottleneck is neither CPU nor memory. VM disk performance will be
poor. Any throughput you get on the virtual disk will degrade as each
concurrent upload client triggers reads and writes. This is one case
where you need a SAN, a ZFS filesystem, or an SSD; configure a
large block size if files are >1GB. Even server-class hard drives will
thrash and slow everything down with enough clients contending for the
disk. Remember, in relative time, if a CPU operation takes a second,
an operation on the disk takes one month under ideal conditions, and
you are giving nginx less than ideal conditions.

Stefan C.
http://scaleengine.com/contact

stefancaunter Wrote:


Thank you, Stefan.
I suspected the same thing you describe, so I tested with a 1 MB file.

client_body_buffer_size is set to 2M, and proxy_buffer_size = 2M as well.
But it doesn’t help.

As I understand it, nginx shouldn’t do any disk I/O for this
transaction because the buffers are large enough. Right?
Please correct me if I’m wrong.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,220778,220818#msg-220818

Hello!

On Wed, Jan 04, 2012 at 07:38:08PM -0500, ronjpark wrote:


As I understand it, nginx shouldn’t do any disk I/O for this
transaction because the buffers are large enough. Right?
Please correct me if I’m wrong.

Yes.

To limit disk I/O for big requests/responses, the following options
are available:

  1. Use larger buffers, notably client_body_buffer_size,
    proxy_buffer_size, and proxy_buffers. This is basically what you’ve
    already done.

  2. For responses, you may also completely disable disk buffering
    with “proxy_max_temp_file_size 0”. This implies that nginx won’t
    be able to read the full response from the backend ahead of the
    client, though, and the connection to the backend will stay busy
    for as long as the client takes to download the response.
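Combined, the two options might look like this in a location block
(buffer sizes are examples only; tune them to your memory budget):

```nginx
location / {
    # Option 1: larger buffers so requests/responses fit in memory.
    client_body_buffer_size 2m;
    proxy_buffer_size       1m;
    proxy_buffers           8 1m;

    # Option 2: never spill responses to disk temp files.
    # The backend connection then stays busy until the client
    # has downloaded the whole response.
    proxy_max_temp_file_size 0;

    proxy_pass http://backend;     # placeholder upstream
}
```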

The directive descriptions can be found here:
http://wiki.nginx.org/HttpProxyModule#proxy_max_temp_file_size

Maxim D.