Strange behavior with custom module and http proxy

Hi all,

I have a module I’ve written which, as part of its processing,
fork/execs another process. The child process forks again and exits,
and the grandchild process is reparented to init and stays active for
60 seconds or longer, depending on whether it gets reused.
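
Roughly, the spawning side follows the classic double-fork pattern.
Here is a trimmed-down sketch, not the actual module code
(spawn_detached() and the argv handling are simplified placeholders):

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int
spawn_detached(char *const argv[])
{
    pid_t  child, grandchild;
    int    status;

    child = fork();
    if (child < 0) {
        return -1;
    }

    if (child == 0) {
        /* intermediate child: fork the grandchild, then exit right away */
        grandchild = fork();

        if (grandchild == 0) {
            execv(argv[0], argv);   /* grandchild becomes the helper */
            _exit(127);             /* only reached if execv() fails */
        }

        _exit(grandchild < 0 ? 1 : 0);
    }

    /* parent (the nginx worker): reap the short-lived child right away;
       the grandchild is now a child of init */
    waitpid(child, &status, 0);

    return 0;
}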

This works just fine from the browser, scripts, etc. The problem
comes when I front the module with the http proxy module for caching.
On the first (non-cached) call, the request will not return until the
grandchild process dies (generally 60 seconds).

I’m at a loss as to what is causing this behavior, since, as
mentioned, the grandchild has been reparented to init and the child
process exited and was successfully reaped by nginx. I’ve put some
logging statements in ngx_http_finalize_request() and other places,
and it seems like my module is finishing immediately and the delay
occurs somewhere in the proxy module. I was hoping for some pointers
as to what might be happening architecturally, or where good places
to dig further might be.

Thanks,

Brian

Hi Brian,
it’s hard to tell what goes wrong just from your description, but
could you explain the reason behind the double fork() and reparenting
of the child process?

I’ve recently made a module (to be released soon) which also fork()s,
and I didn’t notice such behavior.

Best regards,
Piotr S. < [email protected] >

Hi Piotr,

The (grand)child process does some things asynchronously from the
module, and I’m basically using the double fork to know when the
grandchild is up and running (when execv() returns, I know it’s
initialized and running). I then make a unix socket connection to it
and pass some data back and forth. The (grand)child SIGALRMs and
exits after a timeout period (60 seconds) unless it’s reattached to
and reused. Curiously, if I bring up the child manually (the nginx
module not doing the fork/exec), then everything works fine. Also, if
a second nginx worker attaches subsequently, it works. It is only the
act of the fork which seems to trigger the problem. It’s almost as if
nginx is doing a wait() for the child before closing the request,
although I know the child has been reaped earlier and the grandchild
is now a child of init.
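
For illustration, the timeout side of the helper behaves roughly like
this (a sketch of the behavior described above, not the actual helper
source):

#include <signal.h>
#include <unistd.h>

static void
on_alarm(int sig)
{
    (void) sig;
    _exit(0);            /* idle timeout reached: go away quietly */
}

int
main(void)
{
    signal(SIGALRM, on_alarm);
    alarm(60);           /* exit after 60 seconds of inactivity */

    for ( ;; ) {
        /* the accept()/read() loop on the unix socket would go here;
           each time a worker reattaches, alarm(60) resets the timeout */
        pause();
    }
}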

Like I said, I only noticed the behavior when using http proxy
caching, so the problem has existed for me all along without my being
aware of it.

Brian

2009/12/21 Piotr S. [email protected]:

It’s almost as if nginx
is doing a wait() for the child before closing the request, although I
know the child has been reaped earlier and the grandchild is now a
child of init.

It is indeed calling waitpid() after receiving SIGCHLD, but this
shouldn’t be a problem. Could you check whether this also happens
with a single fork() (without spawning the grandchild)?
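
For what it’s worth, the reaping is just a non-blocking waitpid()
loop, roughly along these lines (a simplified sketch, not the actual
nginx source), so a reaped child on its own shouldn’t delay the
request:

#include <sys/types.h>
#include <sys/wait.h>

static void
sigchld_handler(int signo)
{
    int    status;
    pid_t  pid;

    (void) signo;

    /* WNOHANG: collect every exited child without ever blocking */
    do {
        pid = waitpid(-1, &status, WNOHANG);
    } while (pid > 0);
}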

Like I said, I only noticed the behavior when using http proxy
caching, so the problem has existed for me all along without my being
aware of it.

If you don’t want to share your module, could you make a module with
minimal functionality which reproduces this behavior?

Best regards,
Piotr S. < [email protected] >

Hi Piotr,

I have created a (very) small module that demonstrates the problem.
It can be downloaded here:

http://www.mediafire.com/?yij1tmyjzyz

I’ve included a README on how to set up the module, and how to
reproduce the problem.

So, if I open the URL http://localhost:4242/bogus/a, it returns right
away.

If I open http://localhost:4242/cache/bogus/b, which goes through the
http proxy caching, it hangs for 30 seconds until the child process
(bogochild) exits after sleeping.

Also, I did test not forking in the child (which basically makes the
child a “sleep 30”) with a small adjustment to the module (removing
the waitpid()), and the problem still persists.

Thanks again for the help!

Brian

2009/12/21 Brian B. [email protected]:

Hi Brian,

I have created a (very) small module that demonstrates the problem.
It can be downloaded here:

http://www.mediafire.com/?yij1tmyjzyz

A few notes:

  1. This problem isn’t related to caching at all; a simple “proxy_pass”
    also produces this behavior.
  2. I believe that execv() is the source of this problem, not fork().

Anyway, it seems that calling execv() with open sockets copies all
file descriptors into the new process, and nginx is expecting some
action to happen on them. Since there is no action, nginx waits until
your process exits (and the OS closes all of its file descriptors for
you).
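
The general idea (just a sketch of the approach, not the attached
patch itself) is to keep the request’s connection descriptor from
being carried across execv(), for example by marking it close-on-exec
before the fork; set_cloexec() here is only an illustrative helper:

#include <fcntl.h>

static void
set_cloexec(int fd)
{
    int  flags;

    flags = fcntl(fd, F_GETFD);
    if (flags != -1) {
        (void) fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
    }
}

/* e.g. in the module handler, before the fork()/execv():
 *
 *     set_cloexec(r->connection->fd);
 */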

I’ve attached a simple patch, which fixes this problem.

There are a few things to consider:

  1. This works well with kevent (*BSD); I’m not sure what the effect of
    closing sockets registered with epoll (Linux) will be, because when I
    was developing ngx_slowfs_cache, nginx crashed when listening sockets
    were closed after fork() under Linux.
  2. This works as expected for a single request. Under high load you
    might need to close the sockets for all requests… but I’m not really
    sure about this, and it’s up to you to test it.

Best regards,
Piotr S. < [email protected] >

That is the confusing part to me. SIGCHLD is delivered on the death
of the intermediate child (the grandchild’s parent) but not on the
death of the grandchild (correctly so), and the signal handler with
waitpid() is called immediately, not after the 60 second delay.

The module is destined to be released but I’m not quite ready for it
just yet. I will roll up a minimal module that replicates the
behavior and post that.

Thank you for the help.

Brian

2009/12/21 Piotr S. [email protected]:

Hi Piotr,

That makes perfect sense now that I think about it. The patch worked
perfectly.

Thank you for all your help!

Brian

2009/12/22 Piotr S. [email protected]: