Nginx flv stream gets too slow at 2000 concurrent connections

Hello,

    We are using nginx to serve large static files, i.e. jpg, flv and mp4. Nginx streaming works very well at 1000~1500 concurrent connections, but whenever connections exceed 2000~2200, the stream gets too slow. We have five content servers with the following specification:

Dual Quad Core (8 cores / 16 threads)
RAM = 32G
HDD = SAS, hardware RAID 10

My nginx.conf is given below:

user nginx;
worker_processes 16;
worker_rlimit_nofile 300000;  # 2 file handles for each connection

#pid logs/nginx.pid;

events {
    worker_connections 6000;
    use epoll;
}

http {
    include mime.types;
    default_type application/octet-stream;
    limit_rate 180k;
    client_body_buffer_size 128K;
    sendfile_max_chunk 128k;
    server_tokens off;  # conceals nginx version
    access_log off;
    sendfile on;
    client_header_timeout 3m;
    client_body_timeout 3m;
    send_timeout 3m;
    keepalive_timeout 0;

If somebody can help me improve this nginx config, it would be much appreciated. I apologize for my bad English. :smiley:
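
(As far as I understand, the configuration itself should allow far more than 2,000 connections on paper: 16 workers x 6,000 worker_connections gives a ceiling of 96,000 simultaneous connections, and worker_rlimit_nofile 300000 comfortably covers the roughly two file handles each connection needs. So I don't think nginx's own limits are what is capping us.)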

Have you checked HDD performance on the server during these periods with atop or iostat 1? It's very likely related to this, since I'd guess there's a lot of random reading going on at 2000 connections.
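
For example, something like:

iostat -x 1    # extended per-device stats every second; the await and %util columns show whether the disks are the bottleneck

Plain iostat 1 works too; -x just adds the utilisation and latency columns.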


Hello,

   Following is the output of vmstat 1 at 1000+ concurrent connections:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b swpd   free   buff    cache  si so   bi  bo    in    cs us sy  id wa st
 0  0    0 438364  43668 31548164   0  0   62  23     3     0  5  0  95  0  0
 0  0    0 437052  43668 31548520   0  0 1292   0 19763  1570  0  0  99  1  0
 0  1    0 435316  43668 31549940   0  0 1644   0 20034  1537  0  0  99  1  0
 1  0    0 434688  43676 31551388   0  0 1104  12 19816  1612  0  0 100  0  0
 0  0    0 434068  43676 31552304   0  0  512  24 20253  1541  0  0  99  0  0
 1  0    0 430844  43676 31553156   0  0 1304   0 19322  1636  0  0  99  1  0
 0  1    0 429480  43676 31554256   0  0  884   0 19993  1585  0  0  99  0  0
 0  0    0 428988  43676 31555020   0  0 1008   0 19244  1558  0  0  99  0  0
 0  0    0 416472  43676 31556368   0  0 1244   0 18752  1611  0  0  99  0  0
 2  0    0 425344  43676 31557552   0  0 1120   0 19039  1639  0  0  99  0  0
 0  0    0 421308  43676 31558212   0  0 1012   0 19921  1595  0  0  99  0  0

This might be a stupid question, but which part of the above output should I focus on to see whether the I/O is performing well or under heavy load?
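
(From what I've read, the columns to watch for I/O would be "b" (processes blocked waiting for I/O), "wa" (CPU time spent waiting for I/O) and "bi"/"bo" (blocks read/written per second), but please correct me if I've got that wrong.)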

Thanks

Sorry, the reply above used the wrong command. Following is the output of iostat 1:

Linux 2.6.32-279.19.1.el6.x86_64 (DNTX005.local)   01/23/2013   x86_64   (16 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
1.72 2.94 0.46 0.11 0.00 94.77

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 20.53 1958.91 719.38 477854182 175484870

avg-cpu: %user %nice %system %iowait %steal %idle
0.06 0.00 0.13 0.19 0.00 99.62

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 30.00 1040.00 5392.00 1040 5392

avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.19 0.25 0.00 99.56

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 24.00 1368.00 104.00 1368 104

Thanks

The average size of each flv video is 60 MB+. We have five content servers (nginx-1.2.1), each with a 1Gbps port and 100TB of bandwidth per month. Right now each server is consuming roughly 10~12 TB of bandwidth per day, and we're going to run out of bandwidth in the last days of the month. However, we limit every connection to 180k; you can see limit_rate 180k; in the nginx.conf file.

I am a newbie to this field. Please correct me if I didn't answer your question regarding bandwidth. :slight_smile:
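
For reference, our throttling is just the global limit_rate 180k shown earlier. I have read that a setup roughly like the following (purely illustrative, it is not what we run) lets the player buffer the start of a video at full speed and only throttles after that:

location ~ \.flv$ {
    flv;                   # flv pseudo-streaming (requires nginx built with the flv module)
    limit_rate_after 2m;   # serve the first 2 MB at full speed so playback starts quickly
    limit_rate 180k;       # then cap each connection at 180 KB/s
}

Would something like that make sense here, or is the global limit fine?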

On Wed, Jan 23, 2013 at 5:39 PM, Dennis J. wrote:

What's the required bandwidth for the flv files? What is the bandwidth of the connection of the system? What is the bandwidth of the uplink to the Internet?

Regards,
Dennis

Thanks for helping me out guys, but we're already dealing with a 3-app-server cluster behind an HAProxy load balancer (for this video streaming site) and can't afford another layer of clustering. You're talking about an HAProxy load balancer to split the flv file requests across different servers. We would have to buy 5 more content servers and mirror the data between every pair of servers to split the load-balancer requests, but the problem is we're running out of budget. Can you suggest an alternative to this?


It is not always nginx that needs tuning. What about the OS?

Have you changed the default file descriptor limits? Something like "Nginx streaming works very well at 1000~1500 concurrent connections" might indicate you are hitting the default limit of 1024.
While recent Linux kernels do auto-tuning, tweaking it a bit can still help a lot:
http://www.cyberciti.biz/faq/linux-tcp-tuning/ etc.
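
For example, these are the kind of knobs such guides walk through; the values below are only illustrative defaults of that sort, not something tuned for your boxes:

# /etc/sysctl.conf (illustrative values only)
fs.file-max = 700000
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 4096
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.ip_local_port_range = 1024 65535

Apply with sysctl -p after editing.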

rr

File descriptors are tweaked to 700000 in sysctl.conf. However, I'll follow the guide above to tune sysctl for better performance and see if it works. Thanks for guiding me, guys.
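
Just to double-check my own understanding: fs.file-max in sysctl.conf is the system-wide ceiling, while the per-worker limit is what worker_rlimit_nofile 300000 in nginx.conf sets, so with 700000 system-wide I don't think descriptor limits are what caps us around 2,000 connections. Please correct me if that's wrong.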

It will not actually help you, but I had a similar experience.

My issue was due to the system event handler (epoll, kqueue…).

I noticed poor speed when hitting 2000 connections with HAProxy, so I switched to nginx + the TCP proxy module. Same results…

But using HAProxy + nginx (with two different event handlers) I avoided the speed problem. In the end I preferred to use ESXi with 2-3 VMs and split the connections with DNS load balancing.

Maybe you should take a look at this event handler issue and do some tuning on the kernel/OS. Nginx (maybe) isn't the actual problem.
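
For what it's worth, in nginx that choice is made in the events block. Purely as an illustration of where to look (multi_accept is an optional extra, not something from your current config):

events {
    use epoll;               # pin the event method explicitly instead of letting nginx pick
    worker_connections 6000;
    multi_accept on;         # accept as many pending connections as possible when a worker wakes up
}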

Geoffrey HARTZ

50% of it has already been tweaked; I can send you my sysctl.conf if you ask for it. I think the problem is not the kernel, it is something else.

Sketchboy, I sent you the output for only 1000 concurrent connections because it wasn't peak traffic hours. I'll send you the output of iostat 1 when concurrent connections hit 2000+ in the next hour. Please keep in touch, because I need to resolve this issue :frowning:

From your output I can see that it isn't an IO issue; I wish I could help you more.


Following is the output of iostat 1 at 3000+ concurrent connections:

avg-cpu: %user %nice %system %iowait %steal %idle
1.72 2.96 0.47 0.12 0.00 94.73

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 22.47 1988.92 733.04 518332350 191037238

avg-cpu: %user %nice %system %iowait %steal %idle
0.39 0.00 0.91 0.20 0.00 98.50

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 22.00 2272.00 0.00 2272 0

avg-cpu: %user %nice %system %iowait %steal %idle
0.46 0.00 0.91 0.07 0.00 98.57

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 23.00 864.00 48.00 864 48

avg-cpu: %user %nice %system %iowait %steal %idle
0.39 0.00 0.72 0.33 0.00 98.56

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 60.00 3368.00 104.00 3368 104

avg-cpu: %user %nice %system %iowait %steal %idle
0.20 0.00 0.65 0.20 0.00 98.95

Can you send us 20+ lines of output from "vmstat 1" under this load?
Also, what exact Linux kernel are you running ("cat /proc/version")?


Following is the output at 2200+ concurrent connections; the kernel version is 2.6.32:

Linux 2.6.32-279.19.1.el6.x86_64 (DNTX005.local)   01/23/2013   x86_64   (16 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
1.75 3.01 0.49 0.13 0.00 94.63

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 23.27 2008.64 747.29 538482374 200334422

avg-cpu: %user %nice %system %iowait %steal %idle
0.97 0.00 1.10 0.19 0.00 97.74

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 30.00 2384.00 112.00 2384 112

avg-cpu: %user %nice %system %iowait %steal %idle
0.13 0.00 0.52 0.13 0.00 99.22

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 21.00 1600.00 8.00 1600 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.19 0.00 0.45 0.26 0.00 99.10

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 37.00 2176.00 8.00 2176 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.45 0.00 0.58 0.19 0.00 98.77

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 24.00 1192.00 8.00 1192 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 0.45 0.19 0.00 99.03

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 29.00 2560.00 8.00 2560 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 0.65 0.19 0.00 98.84

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 35.00 2584.00 152.00 2584 152

avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 0.39 0.39 0.00 98.96

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 25.00 1976.00 8.00 1976 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 0.52 0.39 0.00 98.77

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 33.00 1352.00 8.00 1352 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 0.58 0.26 0.00 98.90

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 28.00 2408.00 8.00 2408 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.45 0.00 0.65 0.06 0.00 98.84

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 37.00 1896.00 8.00 1896 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.71 0.00 0.97 0.13 0.00 98.19

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 33.00 2600.00 64.00 2600 64

avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 0.65 0.26 0.00 98.77

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 20.00 1520.00 8.00 1520 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.19 0.00 0.39 0.19 0.00 99.22

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 49.00 3088.00 80.00 3088 80

avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 0.91 0.26 0.00 98.58

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 48.00 1328.00 8.00 1328 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.32 0.00 0.32 0.26 0.00 99.09

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 32.00 1528.00 8.00 1528 8

avg-cpu: %user %nice %system %iowait %steal %idle
0.45 0.00 0.58 0.39 0.00 98.58

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 35.00 1624.00 72.00 1624 72

avg-cpu: %user %nice %system %iowait %steal %idle
0.39 0.00 0.58 0.19 0.00 98.84

The box doesn't seem to have problems with that kind of load; not even the IO side is struggling. I guess the 31GB of page cache is your life saver here.

I would check for recent events in the dmesg output. Then I would analyze the network side: how loaded is your eth0 (nload)? Perhaps you are simply saturating your Gbit/s pipe? What do you see with "ifconfig eth0"? Did you talk to the network operators about whether this kind of load can cause drops/packet loss?
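
As a rough sanity check, assuming every connection actually pulls the full limit_rate: 180 KB/s per stream is about 1.44 Mbit/s, so a single 1 Gbit/s port tops out at roughly 700 full-speed streams, while 2000 connections could ask for 2000 x 180 KB/s, about 360 MB/s or 2.9 Gbit/s, nearly three times what one port can deliver.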

On 23.01.2013 at 20:03, shahzaib shahzaib [email protected] wrote:

And also the 20+ lines of vmstat are given below with the 2.6.32 kernel:

There was a thread recently (well, sometime last year) with a link to a blog in Chinese with sysctl settings etc. It had tunings for 2k concurrent connections.

Maybe somebody can dig it out?

The load (nload) at 1500+ concurrent connections on the 1Gbps port is:

Curr: 988.95 MBit/s
Avg:  510.84 MBit/s
Min:  0.00 Bit/s
Max:  1005.17 MBit/s
Ttl:  10017.30 GByte
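
(If I am reading this right, 988.95 MBit/s current against a maximum of about 1005 MBit/s means the 1Gbps port is already effectively saturated at 1500+ connections, which would match the rough numbers above.)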

What should I look at in dmesg to analyze the problem? I'll also send you the nload output when the traffic hits its peak; right now this is average traffic. The following is the ifconfig eth0 output:

eth0 Link encap:Ethernet HWaddr X:X:X:X:X:X
inet addr:X.X.X.X Bcast:X.X.X.X Mask:255.255.255.192
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3713630148 errors:0 dropped:0 overruns:0 frame:0
TX packets:7281199166 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:260499010337 (242.6 GiB)  TX bytes:10767156835559 (9.7 TiB)
Memory:fbe60000-fbe80000

No, I haven't contacted the network operators yet. And it would be great if someone could find that Chinese blog with the sysctl settings for 2k concurrent connections. So far I have used this guide to tune the kernel:
http://www.cyberciti.biz/faq/linux-tcp-tuning/
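
In the meantime, these are the checks I understand you mean (please correct me if not):

dmesg | tail -50                           # recent kernel messages (link flaps, dropped packets, table-full warnings, etc.)
ifconfig eth0 | grep -E 'errors|dropped'   # interface-level errors and drops
ss -s                                      # socket summary, to see how many connections are really established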