Nginx - Google Summer of Code ideas

Oh, and if Grzegorz decides to make his fastcgi-to-anything adapter,
perhaps it could handle the AJP13 stuff too, and could technically
replace the need for php-fpm?

Grzegorz and Andrei should get together. Maybe the mod_wsgi guy too.
Look at some sort of fastcgi-to-anything adapter so that any webserver
can benefit as long as it speaks FastCGI. I’m going to vote for CGI
first (since php-fpm exists) ;)

C. wrote:

We have about a day or so to collect ideas and add stuff to the
application if it makes sense. I’m on irc if anyone wants to say hi and
talk about ideas.

Cheers,

./Christopher

Hi C.

Participating in the SoC is definitely a good idea.

What about implementing async disk IO?

I still do not get this need. Use central session management; there are
many options for it…

On Thu, Mar 12, 2009 at 4:51 PM, Michael Baudino <
[email protected]> wrote:

What’s wrong with aio_write() and aio_read()?

  • Merlin

Hello!

On Thu, Mar 12, 2009 at 06:40:13PM -0700, Merlin wrote:

Michael Baudino

What’s wrong with aio_write() and aio_read()?

Well, do you really want to hear? There are a couple of issues:

  1. There is no good standard method of notification. Per POSIX,
    it uses signals to notify the process about completed operations.
    Under FreeBSD, notification is possible via kqueue, and probably
    other OSes have something too, but it will require some
    non-trivial porting anyway.

  2. It’s usually implemented as a thread pool within the OS kernel (at
    least the FreeBSD and Linux implementations, AFAIK), and usually has
    some limits the administrator should be aware of (including the
    maximum number of IO requests the system may queue).

  3. It’s not (yet) supported by nginx.

Personally I think that the AIO interface is weird and it would be
much better to support O_NONBLOCK on normal file descriptors. But
it’s not likely to happen in the near future. :)

So currently AIO is the only async disk IO interface we have, and
it would be a good thing to implement support for it in nginx.

Maxim D.

Hi,

but AIO could be a serious boost for large out-of-cache files, because it eliminates blocking.

As part of a project I’m working on, I’m going to implement a generic,
embeddable, lightweight, disk-and-memory object caching engine, that
would use disk AIO for the disk operations. It would basically be
similar to the role of memcached, but could be embedded (as well as
accessed via sockets) and would include disk caches too (if you want to
use that feature).

This could be used both for page caching and for storing other data
objects, and could be optimized to store the most frequently used
objects in memory.

For this, I’ll be using AIO for the disk accesses. I don’t know when
I’ll be doing this particular feature (some time within the next few
months probably), but if no-one else has already done so, I’m happy to
write some AIO code for Nginx in the process.

Here I disagree: the Linux kernel supports notification via eventfd since 2.6.18. The eventfd syscall is available as of glibc 2.8, which is part of e.g. the Ubuntu Intrepid distribution.

Valery, have you done performance comparisons between eventfd and
aio_read() etc? Which one fared better in your experience?

Cheers,

Marcus.

Hello!

On Fri, Mar 13, 2009 at 02:21:14PM +0000, Valery K. wrote:

Hi C.
What’s wrong with aio_write() and aio_read()?

Personally I’ve tested this interface and it looks fine to me.

Disagree with what? I’ve said there are interfaces, but they
aren’t standard. You’ve just added one more example of a
non-standard interface.

  2. It’s usually implemented as a thread pool within the OS kernel (at
    least the FreeBSD and Linux implementations, AFAIK), and usually has
    some limits the administrator should be aware of (including the
    maximum number of IO requests the system may queue).

It is unclear whether a kernel thread pool is a disadvantage or not. I don’t think it is reasonable to use disk AIO as the primary means to serve files; it won’t make the server faster than sendfile, but AIO could be a serious boost for large out-of-cache files, because it eliminates blocking.

With such considerations, the kernel thread pool doesn’t seem to be a big hassle.

Yes. I’ve mentioned it just to show that the AIO interface isn’t
perfect and should be used with care.

I actually think that AIO support in nginx would be really great.
I’m just here to say that it’s not a trivial task.

Maxim D.

p.s. As discussed some time ago on the Russian mailing list, probably
the best way will be to combine sendfile() with SF_NODISKIO and
an AIO read of 1 byte to bring some file pages into the cache (if
sendfile() returns EBUSY). Just mentioning it here to make sure
this idea (Igor’s, AFAIR) won’t be lost if someone finally
starts digging into AIO support.

----- “Maxim D.” [email protected] wrote:

  1. There is no good standard method of notification. Per POSIX,
    it uses signals to notify the process about completed operations.
    Under FreeBSD, notification is possible via kqueue, and probably
    other OSes have something too, but it will require some
    non-trivial porting anyway.

Here I disagree: the Linux kernel supports notification via eventfd since
2.6.18. The eventfd syscall is available as of glibc 2.8, which is part of
e.g. the Ubuntu Intrepid distribution.

Personally I’ve tested this interface and it looks fine to me.

  2. It’s usually implemented as a thread pool within the OS kernel (at
    least the FreeBSD and Linux implementations, AFAIK), and usually has
    some limits the administrator should be aware of (including the
    maximum number of IO requests the system may queue).

It is unclear whether a kernel thread pool is a disadvantage or not. I
don’t think it is reasonable to use disk AIO as the primary means to serve
files; it won’t make the server faster than sendfile, but AIO could be
a serious boost for large out-of-cache files, because it eliminates
blocking.

With such considerations, the kernel thread pool doesn’t seem to be a big
hassle.

----- “Maxim D.” [email protected] wrote:

aren’t standard. You’ve just added one more example of
non-standard interface.

I disagree that there is no good interface. I think they are all as good
as they could be at the moment. After all, kqueue notification of AIO
completion isn’t POSIX-compliant.

because it eliminates blocking.

With such considerations, the kernel thread pool doesn’t seem to be a big
hassle.

Yes. I’ve mentioned it just to show that the AIO interface isn’t
perfect and should be used with care.

I actually think that AIO support in nginx would be really great.
I’m just here to say that it’s not a trivial task.

I can imagine.

Maxim D.

p.s. As discussed some time ago on the Russian mailing list, probably
the best way will be to combine sendfile() with SF_NODISKIO and
an AIO read of 1 byte to bring some file pages into the cache (if
sendfile() returns EBUSY). Just mentioning it here to make sure
this idea (Igor’s, AFAIR) won’t be lost if someone finally
starts digging into AIO support.

I remember it.

Hello!

On Fri, Mar 13, 2009 at 03:36:44PM +0000, Valery K. wrote:

Personally I’ve tested this interface and it looks fine to me.

Disagree with what? I’ve said there are interfaces, but they
aren’t standard. You’ve just added one more example of a
non-standard interface.

I disagree that there is no good interface. I think they are all as good as they could be at the moment. After all, kqueue notification of AIO completion isn’t POSIX-compliant.

So it looks like you disagree with yourself, since I’ve never said
that there is no good interface. :)

I’ve said quite a different thing: the standard interface as defined
by POSIX isn’t good.

The FreeBSD interface with AIO notifications via kqueue is good,
and eventfd under Linux probably is too. But they are both
non-standard (and non-portable), so nginx would have to support many
notification interfaces for various OSes.

Maxim D.

Valery K. wrote:

Actually, I have a draft patch of AIO support for nginx with kqueue notifications, but I don’t have a FreeBSD box at the moment :(

I may send it to anyone who is interested.

Hello,

Actually, I am interested in such a patch (or “draft of a patch”).
I can’t promise I’ll work on it as my schedule is far too busy already,
but I’d love to see what’s going on with it.

Would you post it on this mailing list (as a new thread, maybe; we’re
hijacking the Google SoC thread here), or in the brand new forum? ;)

I agree; I am interested and will probably tinker with it, but I can’t
promise anything more than that. A new thread here would be perfect.

On Fri, Mar 13, 2009 at 11:04 AM, Michael Baudino <

----- “Maxim D.” [email protected] wrote:

I disagree that there is no good interface. I think they are all as
good as they could be at the moment. After all, kqueue notification of
AIO completion isn’t POSIX-compliant.

So it looks like you disagree with yourself, since I’ve never said
that there is no good interface. :)

I’ve said quite a different thing: the standard interface as defined
by POSIX isn’t good.

The only drawback I see is that it doesn’t include a queuing interface,
like epoll. This damages the whole idea.

The FreeBSD interface with AIO notifications via kqueue is good,
and eventfd under Linux probably is too. But they are both
non-standard (and non-portable), so nginx would have to support many
notification interfaces for various OSes.

Well, nginx already has to support several asynchronous socket IO
interfaces. And since the AIO interfaces are parallel to socket IO, this
doesn’t make things significantly more complicated.

Actually, I have a draft patch of AIO support for nginx with kqueue
notifications, but I don’t have a FreeBSD box at the moment :(

I may send it to anyone who is interested.

Michael Baudino wrote:

but I’d love to see what’s going on with it.

Would you post it on this mailing list (as a new thread, maybe; we’re
hijacking the Google SoC thread here), or in the brand new forum? ;)

Here is what I have.

mike wrote:

[…]

features:

  • mod_svn
    The very idea of implementing mod_svn from scratch in Nginx is crazy :).

Why is that? Isn’t it just DAV? It just needs to support some more
OPTIONS commands or something? (I could be off my rocker; I thought I
read that somewhere.)

It’s because DAV is not simple to implement.

And what, have an /etc/nginx/upstreams.conf file that I manually
update and kill -HUP nginx (or whatever the appropriate signal is to
reload) every time I notice an upstream going up or down?

Right.

Another issue is that nginx is not ‘smart’, so the healthchecking would
need to be more than just checking that TCP port 80 is open… I guess
that’s where external things come into play. But simplifying the software
stack would be amazing, and it could be an -optional- module in nginx :)

If this requires some internal rewrite of Nginx, then I’m not sure it is
a good idea.

[…]

Just use an efficient log parser; one, as an example, that is able to
parse multiple files at a time.

[…]

Regards Manlio

mike wrote:

Sorry for the late response.

I’d like to say first - during brainstorming there are no bad ideas :)

On Sat, Mar 14, 2009 at 8:00 AM, Manlio P.
[email protected] wrote:

It’s because DAV is not simple to implement.

But nginx already does DAV. I use it for mogilefs without a problem.

Nginx implements the “simple” part of DAV.

[…]

Regards Manlio

I’d like to say first - during brainstorming there are no bad ideas :)

On Sat, Mar 14, 2009 at 8:00 AM, Manlio P.
[email protected] wrote:

It’s because DAV is not simple to implement.

But nginx already does DAV. I use it for mogilefs without a problem.

What needs to be determined is what is missing to make this happen.
Obviously there are some limitations, but I am sure it can be done.

This does not need to be done within Nginx.
Just use pre-existing health-checking software with each of the upstream
servers.

And what, have an /etc/nginx/upstreams.conf file that I manually
update and kill -HUP nginx (or whatever the appropriate signal is to
reload) every time I notice an upstream going up or down?

Right.

That just seems … messy. Although if the avahi/zeroconf idea gets
mixed in, this might become a moot point; as I understand it, it would be
discovering and adding/removing upstreams automagically.

If this requires some internal rewrite of Nginx, then I’m not sure it is a
good idea.

That’s why it’s thrown out there as an idea.

I’m sure it can be done with a third party module. After all it’s just
issuing an HTTP request to an upstream and getting back a response,
and then removing it from the pool.

upstream backend {
    server backend1.example.com weight=5 check=tcp;
    server backend2.example.com:8080 check=response
        url=http://foo.com/health.php expect="hello world";
    server unix:/tmp/backend3;
}

etc?

check=tcp is simple; check=response would expect a plaintext response
back.

Or, those attributes could be cleaned up or put into their own type of
block:

response foo {
    url http://foo.com/health.php;
    expect "hello world";
}

and then referenced with check=response response=@foo;

It would need to be smart enough to pick out the host piece so it can
issue the proper Host: header when connecting to the internal IPs of the
servers.

On Sat, Mar 21, 2009 at 4:43 PM, Manlio P.
[email protected] wrote:

Nginx implements the “simple” part of DAV.

If it implemented full DAV, there might not be a request for mod_svn
then, right? :)

So… that’s the request: make it support more of DAV. It can be an
optional extension to enable (I think dav already is anyway).

:)