Forum: NGINX Event-driven handler

Marcus Clyne (Guest)
on 2009-02-18 00:43
(Received via mailing list)
Hi,

I'm trying to develop a handler for Nginx.  I want to do this:

(1) Request comes in to Nginx
(2) This is passed on to handler
(3) Handler sends request to other code (either a new thread or some
event-driven system) and tells Nginx to wait until output has been
generated
(4) Output is generated in 'background' whilst Nginx is handling other
requests
(5) Message is sent to Nginx that content has been generated through
some event (e.g. with a semaphore)
(6) Nginx retrieves content and sends to client

I already have a handler that does 1, 2, 4 (though not in the background) and
6. It's 3, 4 (in the background) and 5 that I don't know how to do yet. My
current implementation is a blocking one, which I don't want.

Could anyone please give me a simple example of how to do this, or at least
tell me which functions I need to use?

Thanks.
Ryan Dahl (Guest)
on 2009-02-18 01:01
(Received via mailing list)
> I'm trying to develop a handler for Nginx.  I want to do this:
>
> (1) Request comes in to Nginx
> (2) This is passed on to handler
> (3) Handler sends request to other code (either a new thread or some
> event-driven system) and tells Nginx to wait until output has been generated
> (4) Output is generated in 'background' whilst Nginx is handling other requests
> (5) Message is sent to Nginx that content has been generated through some event
> (e.g. with a semaphore)
> (6) Nginx retrieves content and sends to client


I think what you're describing is exactly what the upstream modules do.
You shouldn't need to change Nginx.
Marcus Clyne (Guest)
on 2009-02-18 01:29
(Received via mailing list)
Ryan Dahl <ry@...> writes:

> I think what you're describing is exactly what the upstream modules do.
> You shouldn't need to change Nginx.

Ryan, thanks for replying. You're right to mention upstream modules, and that
is indeed what they do.  However, I'm not passing off the request via a
socket; I'm running the code inside Nginx itself (it's written in C).

I have the code running in a customized compiled Nginx binary, but currently
the code is blocking (i.e. no other requests can be handled whilst my new code
is being processed).  I'm looking to create a non-blocking version of my code,
either through spawning off threads and handling each request in an isolated
thread, or through some event-driven process.  I've not decided on that part
of it yet, but I need to first get the code hooked into the event architecture
of Nginx.

I think that the solution to what I'm doing will use some of the functions
that the upstream handling module(s) use internally. I'm just not certain yet
which ones, though I'm in the process of trying to work it out from looking at
the source code.
Ryan Dahl (Guest)
on 2009-02-18 02:00
(Received via mailing list)
> architecture of Nginx.
It sounds like you should make it into its own process and
communicate with it via an upstream module. You're not gaining
anything by compiling something like that into Nginx.
Marcus Clyne (Guest)
on 2009-02-18 02:17
(Received via mailing list)
Ryan Dahl <ry@...> writes:

> > it yet, but I need to first get the code hooked into the event
> > architecture of Nginx.
>
> It sounds like you should make it into its own process and
> communicate with it via an upstream module. You're not gaining
> anything by compiling something like that into Nginx.
>
>

Maybe, but I'm not convinced that's correct. You are gaining, because you
don't then have to deal with the overhead of opening and sending information
over sockets.  I'm writing my code in C exactly because it's aimed towards
very high-load servers, and I'd rather avoid any extra overhead if possible.

The architecture is there for it to work just fine within Nginx, I just don't
know how yet.  Also, there's a lot of code within Nginx that I'd have to
reproduce in order to run it in its own binary.

If ultimately my code ends up destabilizing Nginx, then I would move it out
into its own process, but I want to try to get it working without doing that
first.
Dave Bailey (Guest)
on 2009-02-18 05:40
(Received via mailing list)
I have similar needs, and would also like to have my module install a
callback that gets executed once per second (like the "trigger"
callback for lighttpd modules).

I think src/event/ngx_event.h has some things that may be useful
(ngx_add_event, etc).

-dave
Marcus Clyne (Guest)
on 2009-02-18 15:23
(Received via mailing list)
Hi Dave,

> I have similar needs, and would also like to have my module install a
> callback that gets executed once per second (like the "trigger"
> callback for lighttpd modules).

If you want to have a callback that executes code regularly, you could start a
new thread during the startup process.  One example might be to create a
post-configuration 'init' function, which is called after the configurations
have been merged (in your ngx_http_<your-module>_module_ctx structure, it
would be the second one on the list).

In the function that you call, spawn off a new thread, which itself has some
kind of infinite loop.  You'd want to have either some kind of sleep call in
the function (e.g. every 1 sec as you suggest), or perhaps use a semaphore so
that your code was only woken up when needed.

Although I've not written it yet, this is what I'll probably be doing.

I'm not sure if this is what the 'trigger' callback does in Lighty, because
I'm not familiar with the internal workings of it.

> I think src/event/ngx_event.h has some things that may be useful
> (ngx_add_event, etc).

Thanks. It's on my list of places to investigate the code. :-)

I'll let you know if I work out how to do it.

Cheers,

Marcus.
Manlio Perillo (Guest)
on 2009-02-18 17:11
(Received via mailing list)
Marcus Clyne wrote:
> Hi,
>
> I'm trying to develop a handler for Nginx.  I want to do this:
>
> (1) Request comes in to Nginx
> (2) This is passed on to handler
> (3) Handler sends request to other code (either a new thread or some
> event-driven system)

Be aware that Nginx does not support threads, and it is not thread safe.
Of course you can create and manage your threads, but that's not easy.

What do you mean by "some event-driven system"?

> and tells Nginx to wait until output has been generated

This is easy.
You just have to return NGX_DONE from the handler, and call
ngx_http_finalize_request when done.

You may be interested in my mod_wsgi code (ngx_http_wsgix_handler.c)
for more details:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/...

> (4) Output is generated in 'background' whilst Nginx is handling other requests

This, again, *may* be easy (if you don't use threads, of course).

> (5) Message is sent to Nginx that content has been generated through some event
> (e.g. with a semaphore)

No need for this, if you avoid threads.
And if you use threads, this is not that simple.

> (6) Nginx retrieves content and sends to client
>
> I already have a handler that does 1,2, 4 (not in background though) and 6 -
> it's the 3, 4 (in background) and 5 that I don't know how to do yet. My current
> implementation is a blocking one, which I don't want.
>
> Could anyone please give me a simple example of how to do this, or at least tell
> me which functions I need to use?
>

If you can refactor your code, I suggest executing the computation in
steps.

Set a timer in Nginx, say to 0.001 seconds, so that at each event loop
cycle your code is executed.

The code should execute some piece of computation quickly (storing state
in an Nginx request context), and return.

Of course, if the code involves some I/O, this may be tricky.
But you can use the Nginx event module to receive I/O notifications.
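
For illustration, here is a minimal sketch of the step-wise approach in plain
C, not written against the Nginx API. The job_t context, the job_step function,
the JOB_* codes and the STEP_BUDGET value are all hypothetical names chosen for
this example. In a real module the state would presumably live in the request
context, and each step would be triggered by an Nginx timer or event.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes, loosely modelled on NGX_OK / NGX_AGAIN. */
enum { JOB_DONE = 0, JOB_AGAIN = 1 };

/* Hypothetical per-request context: everything needed to resume the job. */
typedef struct {
    const uint8_t *data;   /* input to process */
    size_t         len;    /* total input length */
    size_t         pos;    /* how far the job has got so far */
    uint32_t       sum;    /* partial result carried between steps */
} job_t;

#define STEP_BUDGET 4096   /* max bytes handled per call (assumed value) */

/* Do a bounded amount of work, then return so the event loop can run. */
int job_step(job_t *job)
{
    size_t end = job->pos + STEP_BUDGET;

    if (end > job->len)
        end = job->len;

    for (; job->pos < end; job->pos++)
        job->sum += job->data[job->pos];

    /* JOB_AGAIN asks the caller to schedule another step later. */
    return job->pos < job->len ? JOB_AGAIN : JOB_DONE;
}
```

The caller re-arms its timer while job_step returns JOB_AGAIN and finishes the
request once it returns JOB_DONE. The per-call budget bounds how long any
single step can hold up the event loop.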


> Thanks.
>


Regards  Manlio Perillo
Eugaia (Guest)
on 2009-02-18 18:36
(Received via mailing list)
> Be aware that Nginx does not support threads, and it is not thread safe.
> Of course you can create and manage your threads, but that's not easy.
Thanks, yes I am aware - and I'll be handling the thread stuff myself.
> What do you mean by "some event-driven system"?
I'm talking about how I'll generate my content here - it's irrelevant to
my question about Nginx.
>> and tells Nginx to wait until output has been generated
>
> This is easy.
> You just have to return NGX_DONE from the handler, and call
> ngx_http_finalize_request when done.
This may be what I need to know.  If I return NGX_DONE, is the response then
not automatically sent to the client?  In order to do that, you need to call
ngx_http_finalize_request. Is that right?

What about if I want to use filters after my content has been generated, e.g.
gzip?  At the moment, my code calls ngx_http_output_filter in my handler and
returns the response from that.

Should I instead do something like:

[my handler]
- sends request to my output-generating code (which may be in its own new
  thread)
- immediately returns NGX_DONE (which then sends no response to the client)

[my output-generating code]
- generates output (e.g. in its own thread)
- returns ngx_http_output_filter(...) (if wanting to continue using filters)
- returns ngx_http_finalize_request(...) (if I don't want to use any filters)

Does this sound right, or have I misunderstood what you wrote?
> You may be interested in my mod_wsgi code (ngx_http_wsgix_handler.c)
> for more details
> http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/...
>
Thanks, I'll check it out.
>> (4) Output is generated in 'background' whilst Nginx is handling
>> other requests
>
> This, again, *may* be easy (if you don't use threads, of course).
>
>> (5) Message is sent to Nginx that content has been generated through
>> some event
>> (e.g. with a semaphore)
If I've understood your comments above, then this notification actually isn't
necessary: you just call ngx_http_finalize_request (or ngx_http_output_filter?)
when you're done.

I was thinking that a response was automatically sent when you returned from
your handler.
> If you can refactor your code, I suggest executing the computation in
> steps.
That's not really practical for what I'm doing. There's no guarantee that the
code can be executed quickly.
> Of course, if the code involves some I/O, this may be tricky.
> But you can use the Nginx event module to receive I/O notifications.
Could you possibly give me an example?

Thanks,

Marcus.
Manlio Perillo (Guest)
on 2009-02-18 19:48
(Received via mailing list)
Eugaia wrote:
>
> [...]
>
>> You just have to return NGX_DONE from the handler, and call
>> ngx_http_finalize_request when done.
> This may be what I need to know.  If I return NGX_DONE, is the
> response then not automatically
> sent to the client?

Nginx handles the response body in chunks.
Each time you call ngx_http_output_filter, your chain buffer is sent to
the Nginx filters and *may* be sent to the OS socket buffer.

> In order to do that, you need to call
> ngx_http_finalize_request. Is that right?
>

Calling ngx_http_finalize_request simply, as the name suggests, instructs
Nginx to finalize the current request.

> What about if I want to use filters after my content has been
> generated?  E.g. gzip.  At the moment, my
> code calls ngx_http_output_filter in my handler and returns the
> response from that.
>

See my previous response.
You need to call ngx_http_output_filter as usual, but there are some
things you have to check:

1) If an error is returned, then you need to call
    ngx_http_finalize_request with that error code.
2) If NGX_AGAIN is returned, then you need to set up the code
    so that your handler is called again when Nginx knows that the socket
    is writable again.

    This is one of the most "complex" parts.
    Again, the source code from mod_wsgi may help
    (although I cannot guarantee it is correct).

> Should I instead do something like:
>
> [my handler]
> - sends request to my output-generating code (which may be in its own
> new thread)

   ok

> - immediately returns NGX_DONE (which then sends no response to the client)
>

   ok

> [my output-generating code]
> - generates output (e.g. in its own thread)

   ok

> - returns ngx_http_output_filter(...) (if wanting to continue using
> filters)
> - returns ngx_http_finalize_request(...) (if don't want to use any filters)
>

   WARNING!

   As I have said, Nginx *is not* thread safe.
   You MUST not call ngx_http_output_filter, ngx_http_finalize_request,
   or any other Nginx function (with some exceptions) from your thread.

> [...]
>>> (5) Message is sent to Nginx that content has been generated through
>>> some event
>>> (e.g. with a semaphore)
> If I've understood your comments above, then this notification actually
> isn't necessary -
> you just call ngx_http_finalize_request (or ngx_http_output_filter?)
> when you're done.
>

No, don't do that if you use a separate thread.

> I was thinking that a response was automatically sent when you returned
> from your handler.

No.
When you call ngx_http_output_filter, content may be sent to the client.
But control *must* return to Nginx as soon as possible (Nginx uses
so-called cooperative multitasking).

>> If you can refactor your code, I suggest executing the computation in
>> steps.
> That's not really practical for what I'm doing - there's no guarantee
> that the code can be executed quickly.

What do you need to do?

>> Of course, if the code involves some I/O, this may be tricky.
>> But you can use the Nginx event module to receive IO notifications.
> Could you possibly give me an example?
>

Again, see the code of mod_wsgi:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/...

See the function State_register.

As I said, this is rather complex.
You should also read the Nginx source code to understand how things work.

> Thanks,
>
> Marcus.
>
>


Manlio
Marcus Clyne (Guest)
on 2009-02-18 20:49
(Received via mailing list)
> What do you need to do?

I'm generating pages based on simple templates, but which could do all
manner of things, including getting data from databases. I don't want to
use PHP/Python etc because the overhead would be very high and not very
practical for what I'm doing.

> You should also read Nginx source code to understand how things work.

Yep, been doing that. :-)

Ok, I understand that I shouldn't call ngx_http_output_filter etc from
within my thread.

You mentioned setting up a loop that checked every 0.001 secs.  How
about a solution like this:

[handler] (as mentioned before)
- passes off request to content generator
- returns immediately to Nginx with NGX_DONE

[content generator]
- generates content
- in threadsafe fashion, puts the output content in a queue of responses to be
  returned

[in Nginx]
- loop that checks every e.g. 0.001 secs to see if there are any responses
  (headers and/or content)
- checks for responses in threadsafe fashion, and if there are, calls
  ngx_http_output_filter etc

This last loop is integrated into the co-operative multi-tasking architecture
of Nginx, so we don't need to worry about calls to ngx_http_output_filter not
being threadsafe.

Is it possible to set up such a loop in Nginx?  How would I set it up?

Alternatively, I would prefer to have an event-driven system whereby I
wouldn't need a loop to check whether my responses were ready.  Is it possible
to trigger non-IO events in Nginx, such that once triggered, Nginx would handle
them properly (even if they were called from within a different thread)?


I've looked at your WSGI code a bit as well as the Nginx source.  I will be
looking at it some more too.

Thanks,

Marcus.
Ryan Dahl (Guest)
on 2009-02-18 21:44
(Received via mailing list)
> [handler] (as mentioned before)
> - passes off request to content generator
> - returns immediately to Nginx with NGX_DONE
>
> [content generator]
> - generates content
> - in threadsafe fashion, puts the output content in a queue of responses to
> be returned

What is the content generator? Are you writing it from scratch? If it
connects to external sockets and doesn't use Nginx's event loop, or is
blocking, it should be an external process and you should use an
upstream module. If you are writing it from scratch (that is, all of the
socket code too), you are possibly not on the wrong path (though that's
unlikely).

> [in Nginx]
> - loop that checks every e.g. 0.001 secs to see if there are any responses

this is going to fail spectacularly.

> (headers and/or content)
> - checks for responses in threadsafe fashion, and if there are, calls
> ngx_http_output_filter etc

You should be talking via pipes/sockets to your content generator and
getting callbacks on those via the event loop.    And if you're
talking to another thread via pipes already you might as well save the
trouble and move it into its own process and use the already written
mature upstream modules. Just compiling a bunch of random code into
Nginx is not going to make it fast.  You either design it from the
ground up very carefully or use external processes.
Marcus Clyne (Guest)
on 2009-02-18 22:17
(Received via mailing list)
>> [in Nginx]
>> - loop that checks every e.g. 0.001 secs to see if there are any responses
>>
>
> this is going to fail spectacularly.
>
>
This was my gut instinct (it sounds a horrible idea) and is something I wanted
to avoid at all costs. It was more of a desperation attempt to not put things
in an external process.

> what is the content generator? are you writing it from scratch? if it
> connects to external sockets and doesn't use nginx's event loop or is
> blocking  - it should be an external process and you should use an
> upstream module. if you are writing from scratch (that is all of the
> socket code too) you are possibly not on the wrong path (unlikely)
>
Yes, as my understanding of the internals of Nginx increases, I think I agree
with you on this one. I am writing the content generator from scratch.
> You should be talking via pipes/sockets to your content generator and
> getting callbacks on those via the event loop.    And if you're
> talking to another thread via pipes already you might as well save the
> trouble and move it into its own process and use the already written
> mature upstream modules. Just compiling a bunch of random code into
> Nginx is not going to make it fast.  You either design it from the
> ground up very carefully or use external processes.
>
>
I think I've come round to the idea that you're right on this one.  I think
what I'm doing will be too messy to try to get it working inside Nginx, even
though I'd prefer to do it that way. I'm looking at both FastCGI and a
customized format which I'd use with the upstream module as possible
solutions.
Manlio Perillo (Guest)
on 2009-02-19 12:41
(Received via mailing list)
Marcus Clyne wrote:
>
>  > What do you need to do?
>
> I'm generating pages based on simple templates, but which could do all
> manner of things, including getting data from databases. I don't want to
> use PHP/Python etc because the overhead would be very high and not very
> practical for what I'm doing.
>

This is an incorrect generalization!

If your content generation is computationally expensive (both CPU and IO
bound), then the overhead of the Python interpreter is usually much
smaller compared to the overhead of your computation; and you can always
code your computationally expensive routine as a C extension.

However, for what you want to do, Nginx is the wrong choice.
You should use something else like:

1) CGI, if your computation is CPU bound, and you don't need a
persistent connection to the database
2) Apache, if you want flexibility
3) Erlang, if you want a complete environment for writing both CPU and
IO bound scalable web applications



> - generates content
> - in threadsafe fashion, puts the output content in a queue of responses
> to be returned
>
> [in Nginx]
> - loop that checks every e.g. 0.001 secs to see if there are any
> responses (headers and/or content)
> - checks for responses in threadsafe fashion, and if there are, calls
> ngx_http_output_filter etc
>

This is called polling, and is generally the wrong solution.

> [...]


> Alternatively, I would prefer to have an event-driven system whereby I
> wouldn't need a loop to check whether
> my responses were ready.

Note that you can't have a second "loop" in Nginx (unless this is in a
separate thread).


> Is it possible to trigger non-IO events in
> Nginx, such that once triggered, Nginx
> would handle them properly (even if they were called from within a
> different thread)?
>

Nginx is event based.
Its event loop uses epoll or kqueue.

With kqueue or epoll it is possible to receive notifications for non-I/O
events, like signals or filesystem events.

But in your case, you can use a simple pipe.
In the mailing list archive you can find an old discussion between Igor
and me about this topic.
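
A minimal sketch of the pipe idea, in plain POSIX C rather than Nginx code: a
worker thread does the slow work and writes its result to a pipe, while the
event-loop side only waits for the read end to become readable. The names
bg_job_t, worker_main and run_job are hypothetical, poll() stands in for the
epoll/kqueue loop, and the doubling is a placeholder for real content
generation. Because the result travels through the pipe itself, the two
threads share no mutable state that would need locking.

```c
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical job description handed to the worker thread. */
typedef struct {
    int notify_fd;   /* write end of the pipe */
    int input;       /* stand-in for the request data */
} bg_job_t;

/* Worker thread: compute, then wake the event loop by writing the result. */
static void *worker_main(void *arg)
{
    bg_job_t *job = arg;
    int result = job->input * 2;          /* placeholder for slow work */
    ssize_t n = write(job->notify_fd, &result, sizeof result);

    (void) n;                             /* nothing useful to do on failure */
    return NULL;
}

/* Event-loop side: start the job, wait for readability, collect the result.
 * Returns 0 on success and -1 on failure. */
int run_job(int input, int *result)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    bg_job_t job = { fds[1], input };
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker_main, &job) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* A real event loop would watch many descriptors here, not just one. */
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int rc = -1;

    if (poll(&pfd, 1, 5000) == 1 && (pfd.revents & POLLIN)
        && read(fds[0], result, sizeof *result) == (ssize_t) sizeof *result)
        rc = 0;

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Only the thread that owns the event loop touches the read end, so any
follow-up calls (in Nginx, functions such as ngx_http_output_filter) would
stay on that thread.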

 > [...]



Manlio
Marcus Clyne (Guest)
on 2009-02-19 14:13
(Received via mailing list)
> However, for what you want to do, Nginx is the wrong choice.
> You should use something else like:
>
> 1) CGI, if your computation is CPU bound, and you don't need a
> persistent connection to the database
> 2) Apache, if you want flexibility
> 3) Erlang, if you want a complete environment for writing both CPU and
> IO bound scalable web applications
I'm looking into using libevent's HTTP interface. It will be much easier
to plug my threaded solution into that.
> This is called polling, and is generally the wrong solution.
Yes, and I agree. ;-)
> With kqueue or epoll it is possible to receive notification for non IO
> events, like signals or filesystem events.
I'm not very familiar with these yet, though I'm reading up on them
now (I'll probably use libevent's functions).
> But in your case, you can use a simple pipe.
> In the mailing list archive you can find an old discussion between Igor
> and me about this topic.
>
I'll check these out.

Thanks for your help.