Does Nginx block on file IO?

Hi.

I was always under the impression that Nginx is non-blocking for file
IO. Then I was told it wasn’t.

I’m considering using Nginx to serve static images. Pretty much every
connection will result in a file IO. If Nginx blocks for file IO, then
using Nginx here wouldn’t be any better than using Apache, right? Every
connection will lead to a file IO which blocks the entire Nginx process.
So to serve 500 concurrent connections I’ll need 500 Nginx processes.

Would Nginx work in such a use case? Any tips on how to use Nginx for
serving static media?

Thanks.

Posted at Nginx Forum:

Andy Wrote:

I was always under the impression that Nginx is
non-blocking for file IO. Then I was told it
wasn’t.

Actually AIO support was added not so long ago. It can be non-blocking
for file IO on systems with AIO.

If Nginx blocks for file IO, then
using Nginx here wouldn’t be any better than using
Apache, right?

How is it even related? There is a difference between getting a file
from disk and serving it to the client from buffer. You can get it
really fast and block for a very short time but then send it to the
client very slowly. So an asynchronous front end would be a really good
idea.

Other than that, you might find that neither nginx nor Apache can serve
your images fast enough from any common filesystem. You will need
something like Bitcask to go further. The idea is to get an entire image
into the buffer in a single disk seek.


Hi,

I was always under the impression that Nginx is non-blocking for file
IO. Then I was told it wasn’t.

That depends on your configuration and on the capabilities of your
operating system. Asynchronous file I/O is available on recent versions
of Linux, Solaris, DragonFly, FreeBSD, NetBSD, and Mac OS X.

I’m considering using Nginx to serve static images. Pretty much every
connection will result in a file IO.

No, it won’t. Recently served files should be in your OS’s buffer cache.

If Nginx blocks for file IO, then
using Nginx here wouldn’t be any better than using Apache, right? Every
connection will lead to a file IO which blocks the entire Nginx process.
So to serve 500 concurrent connections I’ll need 500 Nginx processes.

If that were the case (that is, the buffer cache hit rate was very low)
and you were using magnetic disks, then yes, you’d be better off with a
web server that uses per-client lightweight processes or threads.
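
To put rough numbers on the hit-rate argument above, here is a back-of-the-envelope sketch in Python. It assumes a worker that handles one read at a time, and the timings (0.1 ms for a cache hit, 10 ms for a seek on a magnetic disk) are illustrative assumptions, not measurements from this thread. The function name is made up for the example.

```python
def requests_per_second_per_worker(hit_rate, hit_time_s=0.0001, miss_time_s=0.010):
    """Estimate throughput of one worker that blocks on every read.

    hit_rate    -- fraction of reads served from the OS buffer cache (0.0 to 1.0)
    hit_time_s  -- assumed time for a cached read (illustrative default)
    miss_time_s -- assumed time for a read that goes to disk (illustrative default)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    # Mean blocking time per request is a weighted average of hits and misses.
    mean_time_s = hit_rate * hit_time_s + (1.0 - hit_rate) * miss_time_s
    return 1.0 / mean_time_s


if __name__ == "__main__":
    for rate in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(rate, round(requests_per_second_per_worker(rate)))
```

Under these assumptions the estimate is dominated by the misses until the hit rate gets very close to 1, which is why the cache hit rate matters more than the choice of server here.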

Best regards,
Piotr S.

Hi,

How is it even related? There is a difference between getting a file
from disk and serving it to the client from buffer. You can get it
really fast and block for a very short time but then send it to the
client very slowly.

It’s actually very related, because with event-driven programming
you’re blocking all of a worker’s clients on a single disk I/O.
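
The effect can be reproduced with a small Python asyncio sketch. This is not nginx code: `time.sleep` stands in for a synchronous disk read that misses the cache, and all function names are made up for the illustration. The second client needs no disk access at all, yet it cannot run until the first client's blocking call returns.

```python
import asyncio
import time


def blocking_disk_read(delay_s):
    # Stand-in for a synchronous read() that has to wait for the disk.
    time.sleep(delay_s)
    return b"image bytes"


async def slow_client(delay_s):
    # Calls the blocking function directly on the event loop thread.
    return blocking_disk_read(delay_s)


async def fast_client(started_at):
    # Does no I/O; reports how long it waited before it was scheduled.
    return time.monotonic() - started_at


async def _run(delay_s):
    started_at = time.monotonic()
    # Tasks start in creation order, so the slow client runs first.
    slow = asyncio.create_task(slow_client(delay_s))
    fast = asyncio.create_task(fast_client(started_at))
    await slow
    return await fast


def measure_fast_client_latency(delay_s=0.2):
    """Return how long the disk-free client was held up, in seconds."""
    return asyncio.run(_run(delay_s))


if __name__ == "__main__":
    print("fast client waited", measure_fast_client_latency(), "seconds")
```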

Best regards,
Piotr S.

Piotr S. Wrote:

It’s actually very related, because with
event-driven programming you’re blocking
all of a worker’s clients on a single disk
I/O.

I know, but imagine how much memory you can save and use for the file
system cache if you are not running Apache. That can have a far greater
impact on your performance. You can’t just say that nginx is the same as
Apache just because it blocks on file I/O.
So still, it depends on the type of load you have. And if you are not
serving 100 megabytes of data to 500 clients but a lot more, you are
most likely to get better results with an asynchronous server, even with
blocking file I/O.


On Mon, 2011-05-02 at 07:29 +0200, Piotr S. wrote:

Hi,

How is it even related? There is a difference between getting a file
from disk and serving it to the client from buffer. You can get it
really fast and block for a very short time but then send it to the
client very slowly.

It’s actually very related, because with event-driven programming you’re
blocking all of a worker’s clients on a single disk I/O.

If you expect file IO to block, use more nginx processes than cores. I
don’t think that async IO on Linux is really worth using, although I will
do some benchmarks. You might try switching to FreeBSD as well.

I think that node.js currently uses a thread pool for async IO on Linux
(possibly all platforms, except maybe Windows), with sync IO in the
background. Linux kernel AIO was designed for programs like databases
that manage their own buffer caches, not for normal file IO. If you don’t
cache the results yourself it will be slow (unless something has changed
recently; I will do some testing).
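
The thread-pool approach described above (synchronous reads running in background threads while the event loop stays free) can be sketched in Python with asyncio. This only illustrates the pattern; it is not how node.js or nginx implement it, and the helper names and pool size are assumptions made for the example.

```python
import asyncio
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file_blocking(path):
    # Ordinary synchronous read; it goes through the OS buffer cache.
    with open(path, "rb") as f:
        return f.read()


async def read_file_offloaded(pool, path):
    # Hand the blocking read to a pool thread so the event loop can
    # keep serving other clients while this one waits on the disk.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, read_file_blocking, path)


async def serve_many(paths, pool_size=4):
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        reads = (read_file_offloaded(pool, p) for p in paths)
        return await asyncio.gather(*reads)


def demo():
    """Write three small files, then read them back through the pool."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"img{i}.bin")
            with open(path, "wb") as f:
                f.write(bytes([i]) * 1024)
            paths.append(path)
        return asyncio.run(serve_many(paths))


if __name__ == "__main__":
    print([len(data) for data in demo()])
```

The trade-off is that the number of reads in flight is capped by the pool size, so a slow disk still limits throughput, but one slow read no longer stalls every other client on the loop.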

Justin

On Sun, 2011-05-01 at 20:26 -0400, Andy wrote:

Hi.

I was always under the impression that Nginx is non-blocking for file
IO. Then I was told it wasn’t.

I’m considering using Nginx to serve static images. Pretty much every
connection will result in a file IO. If Nginx blocks for file IO, then
using Nginx here wouldn’t be any better than using Apache, right? Every
connection will lead to a file IO which blocks the entire Nginx process.
So to serve 500 concurrent connections I’ll need 500 Nginx processes.

You cannot take a single measure such as concurrent requests independent
of requests per second and make a good prediction about performance
requirements. If you have 500 concurrent requests, and each request
takes an average of 1ms, then you could conceivably serve all 500
requests within 0.5 seconds with a single worker, which is quite
reasonable. People have reported serving 10K requests per second using
Nginx on rather modest hardware (laptops even). You’ll probably find
that your OS is the bottleneck, not Nginx (and you should focus your
performance tuning at that level - TCP buffers and the like).
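
The arithmetic in that example can be written out as a short Python sketch. It assumes each worker handles its share of requests strictly one after another and that every request takes the same time; the function name is made up for the illustration.

```python
import math


def drain_time_s(queued_requests, service_time_s, workers=1):
    """Time to finish a burst of requests when each worker serves them serially."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Each worker takes an equal share, rounded up for the leftover requests.
    rounds = math.ceil(queued_requests / workers)
    return rounds * service_time_s


if __name__ == "__main__":
    # 500 requests at 1 ms each on a single worker: about half a second.
    print(drain_time_s(500, 0.001))
    # The same burst spread over 4 workers.
    print(drain_time_s(500, 0.001, workers=4))
```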

It’s also worth pointing out that if you currently have 500 concurrent
requests on Apache, that number may actually decrease with Nginx,
assuming requests are finished faster (i.e. if Nginx finishes each
request twice as fast as Apache, you’d only see 250 concurrent requests,
while requests per second would double). Obviously that’s simplified,
but hopefully clarifies my point.
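
One way to make this relationship concrete is Little's law: average concurrency equals arrival rate times average time per request. The Python sketch below uses it with made-up numbers (the 100 ms and 50 ms request times are assumptions for illustration only). It shows two readings of the example: at a fixed arrival rate, halving the request time halves concurrency; at a fixed concurrency, it doubles the requests per second.

```python
def concurrent_requests(arrival_rate_per_s, mean_time_ms):
    """Little's law: average number of requests in flight."""
    return arrival_rate_per_s * mean_time_ms / 1000.0


def throughput_per_s(concurrency, mean_time_ms):
    """Little's law rearranged: requests per second at a given concurrency."""
    return concurrency * 1000.0 / mean_time_ms


if __name__ == "__main__":
    # Assumed baseline: 5000 requests/s at 100 ms each.
    print(concurrent_requests(5000, 100))
    # Same arrival rate, requests finish twice as fast.
    print(concurrent_requests(5000, 50))
    # Same 500 in flight, requests finish twice as fast.
    print(throughput_per_s(500, 50))
```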

In any case, no need to have a worker per request, just start with the
recommendation of 1 worker per core and tune from there. It’s true that
the number of requests being processed at any instant is limited by the
number of workers you have (the remaining requests will be served
serially as each worker is freed), but in practice, it becomes academic. Serialization is going
to happen at some level, even if you have a threaded server, so
basically software threads buy you little more than the feeling of
concurrency at the cost of massive amounts of memory.

Would Nginx work in such a use case? Any tips on how to use Nginx for
serving static media?

Serving static content is actually where Nginx has been demonstrated to
outperform Apache by a wide margin, in no small part because it leaves
lots of memory for filesystem caching, and also because it causes fewer
context switches than a threaded server.

At the end of the day, you should do some performance testing, since
performance is going to depend a lot on factors that will only be
revealed on your particular setup and your particular data (number and
size of files, cache misses, etc).

Regards,
Cliff

jjjx128 Wrote:

And if you are not serving 100 megabytes
of data to 500 clients but a lot more, you are most
likely to get better results with an asynchronous
server, even with blocking file I/O.

Sorry, bad example. In this case your data will fit in the file system
cache and you will still be better off with nginx because of all the
context switches you save. Anyway, there might be some very special case
where disks are overloaded and access patterns are truly random,
although this rarely happens in the real world.
