Multiplexing FastCGI module

Hi.

I’ve written a multiplexing (concurrent requests through a single
backend) version of the 0.8 fastcgi module along with a FastCGI server
reference implementation:

GitHub - rsms/afcgi: Asynchronous/multiplexing FastCGI for nginx (incl. ref server implementation)

Please let me know what you think and if you’re interested in merging
in this code upstream.

Thank you.


Rasmus Andersson

Hello!

On Sun, Dec 13, 2009 at 06:31:06PM +0100, Rasmus Andersson wrote:

Hi.

I’ve written a multiplexing (concurrent requests through a single
backend) version of the 0.8 fastcgi module along with a FastCGI server
reference implementation:

GitHub - rsms/afcgi: Asynchronous/multiplexing FastCGI for nginx (incl. ref server implementation)

Please let me know what you think and if you’re interested in merging
in this code upstream.

[Just a side note: there is a lot of whitespace damage and there are
style violations in your code. It’s believed that following the original
nginx style is a good idea; it makes reading diffs easier and improves
karma.]

I wasn’t able to find enough code there to keep backend
connections alive and share them between requests. It looks like the
code in question just assigns unique ids to requests sent to the
fastcgi backend, while each request is still sent in a separate
connection…

On the other hand, it sets the FASTCGI_KEEP_CONN flag and thus breaks
things, as nginx relies on the fastcgi application to close the connection.
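
For reference, these are the relevant pieces of the FastCGI 1.0 spec
(shown here for context, not taken from the patch): every record header
carries a request id, and keep-connection behaviour is a single flag bit
in the FCGI_BEGIN_REQUEST body:

    /* FastCGI 1.0: every record starts with this header; the request id
       is what allows several requests to share one connection. */
    typedef struct {
        unsigned char version;          /* FCGI_VERSION_1 */
        unsigned char type;             /* FCGI_BEGIN_REQUEST, FCGI_STDIN, ... */
        unsigned char requestIdB1;      /* request id, high byte */
        unsigned char requestIdB0;      /* request id, low byte */
        unsigned char contentLengthB1;
        unsigned char contentLengthB0;
        unsigned char paddingLength;
        unsigned char reserved;
    } FCGI_Header;

    /* Body of an FCGI_BEGIN_REQUEST record. */
    typedef struct {
        unsigned char roleB1;
        unsigned char roleB0;
        unsigned char flags;            /* FCGI_KEEP_CONN set: the web server,
                                           not the application, owns and closes
                                           the connection after the request */
        unsigned char reserved[5];
    } FCGI_BeginRequestBody;

    #define FCGI_KEEP_CONN  1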

So basically it’s not something useful right now. And it breaks
3 out of 4 fastcgi subtests in the test suite
(nginx-tests: log).

As to the idea in general, I’m not really sure that request
multiplexing is a great feature of the fastcgi protocol. It complicates
things and shouldn’t be noticeably faster than just using multiple
connections to the same fastcgi app (assuming the fastcgi app is able
to handle connections asynchronously). It would be interesting to
compare, though.

Maxim D.

p.s. As keeping fastcgi connections alive appears to be a
prerequisite for fastcgi multiplexing, here are some notes:

Keeping fastcgi connections alive isn’t really hard, but it requires
a lot more than setting the KEEP_CONN flag in the fastcgi request. Most
notably, the ngx_event_pipe code and the upstream module code should be
modified to make sure buffering is somewhat limited and does not
rely on connection close as the end signal. I posted some preview
patches a while ago which make the fastcgi module able to keep
connections alive (with the ngx_http_upstream_keepalive module); you
may want to take a look if you are going to continue your
multiplexing work.

Hello!

On Mon, Dec 14, 2009 at 12:42:41PM +0100, Rasmus Andersson wrote:

I’m well aware of this. The code (as you might have noticed) is in the
early stages. Cleanup etc will be done once I’ve got the general ideas
straight.

Yep. It’s probably a good idea to post it for review once you’re
done with cleanup? Unless you run out of karma points before that
happens…

can see the nginx log in debug mode – no “disconnect” or “connect”

I don’t see nginx running in debug mode in the screenshot in
question. Instead, you use ngx_log_error(NGX_LOG_DEBUG, …) for
your own messages, which produces similar-looking output (i.e. it logs
at the “[debug]” level) but is not actually debug logging. And the other
debug messages are obviously not there.

fcgi client 127.0.0.1 connected on fd 5

app_handle_beginrequest 0x100530

app_handle_beginrequest 0x100530

Multiple requests received and handled over a period of time over one
persistent connection from nginx.

But I guess I’m missing something?

Looks like it. Or you are not running the code you published.

On the other hand, it sets the FASTCGI_KEEP_CONN flag and thus breaks
things, as nginx relies on the fastcgi application to close the connection.

The FastCGI server will still be responsible for closing the connection.

No. Once nginx sets the FASTCGI_KEEP_CONN flag, it takes over
this responsibility according to the fastcgi spec.

But it looks like you missed the actual problem: nginx needs the
fastcgi application to close the connection after it has finished sending
the response. It uses connection close as a flush signal. Without this,
requests will hang.
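
For context, the only in-band end-of-response marker the FastCGI 1.0
spec provides is the FCGI_END_REQUEST record; with FCGI_KEEP_CONN set
it is this record, rather than connection close, that would have to act
as the “response finished” signal:

    /* FastCGI 1.0: the application finishes a request by sending an
       FCGI_END_REQUEST record with this body. With FCGI_KEEP_CONN set
       there is no connection close to act as a flush signal, so this
       record is the only end-of-response marker the web server gets. */
    typedef struct {
        unsigned char appStatusB3;      /* application exit status, big endian */
        unsigned char appStatusB2;
        unsigned char appStatusB1;
        unsigned char appStatusB0;
        unsigned char protocolStatus;   /* e.g. FCGI_REQUEST_COMPLETE */
        unsigned char reserved[3];
    } FCGI_EndRequestBody;

    #define FCGI_REQUEST_COMPLETE  0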

So basically it’s not something useful right now. And it breaks
3 out of 4 fastcgi subtests in the test suite
(nginx-tests: log).

I wasn’t aware of those tests. In what way does it break them?
Would you please help me solve the issues in my code that break these
tests?

As I don’t see how your code can work at all, it’s unlikely I’ll
be able to help. The tests just show that I’m right and it doesn’t
work at all (the only test that passes checks that a HEAD request
returns no body…).

website becomes very popular and a lot of your visitors have slow

Saving gigabytes of memory and tens of thousands of file descriptors.
Today, the only option is to build purpose-made nginx modules or whole
HTTP servers running separately from nginx.

I say this with real-world experience. It would be awesome to implement
the complete FastCGI 1.0 spec in nginx and be the first web server to
support long-lived and slow connections with rich web apps!

The difference between fastcgi multiplexing and multiple TCP
connections to the same fastcgi server isn’t that huge. Note
well: the same fastcgi server, not another one.

You may save two file descriptors per request (one in nginx, one
in the fastcgi app), and the associated TCP buffers. But all this isn’t
likely to be noticeable given the amount of resources you have already
spent on the request in question.

The only use case which appears to be somewhat valid is
long-polling apps which consume almost no resources. But even
here you aren’t likely to save more than half the resources.

may want to take a look if you are going to continue your
multiplexing work.

Thanks. Do you know where I can find those? Any hint as to where I
should start googling to find them in the archives?

Here is the initial post in the Russian mailing list:

http://nginx.org/pipermail/nginx-ru/2009-April/024101.html

Here is the update for the last patch:

http://nginx.org/pipermail/nginx-ru/2009-April/024379.html

Not sure the patches still apply cleanly; I haven’t touched this for a
while.

Maxim D.

On Mon, Dec 14, 2009 at 02:12, Maxim D. [email protected] wrote:

GitHub - rsms/afcgi: Asynchronous/multiplexing FastCGI for nginx (incl. ref server implementation)

Please let me know what you think and if you’re interested in merging
in this code upstream.

[Just a side note: there is a lot of whitespace damage and there are
style violations in your code. It’s believed that following the original
nginx style is a good idea; it makes reading diffs easier and improves
karma.]

I’m well aware of this. The code (as you might have noticed) is in the
early stages. Cleanup etc will be done once I’ve got the general ideas
straight.

I wasn’t able to find enough code there to keep backend
connections alive and share them between requests. It looks like the
code in question just assigns unique ids to requests sent to the
fastcgi backend, while each request is still sent in a separate
connection…

I’m far from experienced with the nginx codebase, so I’m employing
trial and error at large. In my tests nginx does keep the upstream peer
connection and shares it across HTTP requests. In this screenshot
http://hunch.se/s/cx/e1rg8i8bkgwks.png, in the upper-right terminal, you
can see the nginx log in debug mode – no “disconnect” or “connect”
messages (which nginx does log when connecting/disconnecting to/from
upstream peers). On the left-hand side of the screen you see two
FastCGI servers running. Nginx evenly distributes incoming requests to
the two backends over two persistent upstream peer connections (the
FastCGI server instances log “connect/disconnect” when nginx
creates or drops a connection).

I just re-ran the tests to confirm I wasn’t too tired yesterday.
Here’s the log from one FastCGI server:

fcgi client 127.0.0.1 connected on fd 5

app_handle_beginrequest 0x100530

app_handle_beginrequest 0x100530

Multiple requests received and handled over a period of time over one
persistent connection from nginx.

But I guess I’m missing something?

On the other hand, it sets the FASTCGI_KEEP_CONN flag and thus breaks
things, as nginx relies on the fastcgi application to close the connection.

The FastCGI server will still be responsible for closing the connection.

So basically it’s not something useful right now. And it breaks
3 out of 4 fastcgi subtests in the test suite
(nginx-tests: log).

I wasn’t aware of those tests. In what way does it break them?
Would you please help me solve the issues in my code that break these
tests?

As to the idea in general, I’m not really sure that request
multiplexing is a great feature of the fastcgi protocol. It complicates
things and shouldn’t be noticeably faster than just using multiple
connections to the same fastcgi app (assuming the fastcgi app is able
to handle connections asynchronously). It would be interesting to
compare, though.

Multiplexing in FastCGI is a HUGE DEAL. Imagine you run a website
(there are quite a few of those around) and you want to do something
fancy (for instance, run some Python or Ruby app). Now, let’s say your
website becomes very popular and a lot of your visitors have slow
connections. You also do stuff in your app which takes some time (be
it long-polling for a chat message or waiting for a slow I/O
operation).

This is a very common scenario nowadays (and the reason for nginx in
the first place – the c10k problem) which is very tough to satisfy
with non-multiplexing fastcgi setups.

Instead of this: http://hunch.se/s/ag/3h5dmvibcwk8k.png
You can have this: http://hunch.se/s/6b/jddgr2qk8wgg0.png

Saving gigabytes of memory and tens of thousands of file descriptors.
Today, the only option is to build purpose-made nginx modules or whole
HTTP servers running separately from nginx.

I say this with real-world experience. It would be awesome to implement
the complete FastCGI 1.0 spec in nginx and be the first web server to
support long-lived and slow connections with rich web apps!

rely on connection close as the end signal. I posted some preview
patches a while ago which make the fastcgi module able to keep
connections alive (with the ngx_http_upstream_keepalive module); you
may want to take a look if you are going to continue your
multiplexing work.

Thanks. Do you know where I can find those? Any hint as to where I
should start googling to find them in the archives?




Rasmus Andersson

2009/12/14 Maxim D. [email protected]:

On the other hand, it sets the FASTCGI_KEEP_CONN flag and thus breaks
things, as nginx relies on the fastcgi application to close the connection.

The FastCGI server will still be responsible for closing the connection.

No. Once nginx sets the FASTCGI_KEEP_CONN flag, it takes over
this responsibility according to the fastcgi spec.

Correct. My bad: when setting KEEP_CONN the app is relieved of
close() responsibility, but at the same time KEEP_CONN implies
the connection is not closed (it is kept open).

But it looks like you missed the actual problem: nginx needs the
fastcgi application to close the connection after it has finished sending
the response. It uses connection close as a flush signal. Without this,
requests will hang.

Then why does it work flawlessly for me? Are we maybe testing this
with different versions of nginx? My module is forked from the 0.8.29
module (and I’m testing on OS X 10.6 with the same version).

Sounds “broken” to rely on close as an implicit buffer flush signal.

In ngx_http_fastcgi_input_filter, upstream_done is set to 1 on
NGX_HTTP_FASTCGI_END_REQUEST and any buffer is recycled. But looking
at ngx_event_pipe.c, it doesn’t seem like buffers are flushed.

I found your patches (linked in your message further down), which seem
to include features similar to mine. Did this ever get finished or
reach a functional state?

Or are you saying nginx — at a very low level — does not have support
for persistent connections with upstream peers?

The difference between fastcgi multiplexing and multiple TCP
connections to the same fastcgi server isn’t that huge. Note well: the
same fastcgi server, not another one.

You may save two file descriptors per request (one in nginx, one in the
fastcgi app), and the associated TCP buffers. But all this isn’t likely
to be noticeable given the amount of resources you have already spent on
the request in question.

This solution would require you to have an nginx configuration which on
startup creates M connections to the backend(s), where M is
the maximum number of concurrent requests you will be able to handle.
I would like to set this to 10 000 (or even higher for some
applications), but that just seems like a REALLY ugly solution. Also,
at >10K the extra resources used are not negligible – for each fcgi
connection between nginx and the app there will be:

• Buffers (as you mentioned)
• Metadata
• FD pair

So you would basically end up with a configuration with a fixed upper
limit on concurrent requests, as well as a somewhat high baseline of
system resources used.

I would like to have a “pretty” and D.R.Y. setup where I only run one
FastCGI server process per CPU, connect each of those servers to the
HTTP front-end (nginx), and pass data and virtual requests around.
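
To make the “virtual requests” idea concrete, here is a minimal sketch
of how a multiplexing backend could demultiplex records arriving on a
single nginx connection by the request id in the record header (the
names and structure are illustrative only, not taken from the afcgi
code):

    /* Hypothetical sketch: route each incoming FastCGI record to
       per-request state using the request id from its header. */
    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        uint8_t  version;
        uint8_t  type;
        uint8_t  request_id_b1;
        uint8_t  request_id_b0;
        uint8_t  content_length_b1;
        uint8_t  content_length_b0;
        uint8_t  padding_length;
        uint8_t  reserved;
    } fcgi_header_t;

    #define MAX_REQUESTS  65536         /* request id is a 16-bit field */

    typedef struct {
        int active;
        /* ... per-request state: params, output buffers, etc. ... */
    } request_t;

    static request_t requests[MAX_REQUESTS];

    /* Return the request a record belongs to, creating state for new ids. */
    static request_t *
    dispatch_record(const fcgi_header_t *h)
    {
        uint16_t id = (uint16_t) ((h->request_id_b1 << 8) | h->request_id_b0);

        if (id == 0) {
            return NULL;                /* id 0 is reserved for management records */
        }

        if (!requests[id].active) {
            requests[id].active = 1;    /* new concurrent request on this connection */
        }

        return &requests[id];
    }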

The only use case which appears to be somewhat valid is
long-polling apps which consume almost no resources. But even
here you aren’t likely to save more than half the resources.

An increasingly common case today, often solved by running a separate
server in parallel to the regular HTTP server – a solution where
maintenance, development and hardware all become more expensive.

connections alive (with the ngx_http_upstream_keepalive module); you

Here is the update for the last patch:

http://nginx.org/pipermail/nginx-ru/2009-April/024379.html

Not sure the patches still apply cleanly; I haven’t touched this for a
while.

Thanks.


Rasmus Andersson