post_action: just send an HTTP request, not FCGI

I’m still battling with my post_action handler. I’m having a painful time trying to use an FCGI handler, mostly because the Perl implementation of FCGI doesn’t work nicely with our daemon framework, which is based on Net::Server.

I could keep digging down that path, but I decided a MUCH easier way would be if post_action could just send an HTTP request to my handler and put the values I want in the headers; it would then be easy to pull the stuff back out of the headers.

I thought something hacky like this might work.

location @done {
  set $rateuser $upstream_http_x_rate_user;

  proxy_set_header RateUser  $rateuser;
  proxy_set_header RateURI   $request_uri;
  proxy_set_header RateBytes $body_bytes_sent;
  proxy_pass http://127.0.0.1:2350;
}

And indeed for GET requests it works nicely; I get the headers I want and can quickly and easily decode them. In fact I don’t really need to set RateURI, since the first line of the request gives me the URI.
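
Decoding them on the handler side really is trivial. Here’s a sketch of what I mean (Python for illustration only — the real daemon is Perl on Net::Server, and the header names are just the ones set in the config above):

```python
# Sketch: pull the Rate* headers out of a raw post_action request.
# Illustrative only -- the real handler lives in our Perl daemon.

def parse_rate_request(raw: bytes) -> dict:
    """Parse the request line and Rate* headers from a raw HTTP request."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "user": headers.get("rateuser"),
        "uri": headers.get("rateuri", uri),  # request line carries it anyway
        "bytes": int(headers.get("ratebytes", 0)),
    }

raw = (b"GET / HTTP/1.0\r\n"
       b"RateUser: testuser\r\n"
       b"RateURI: /testdir/\r\n"
       b"RateBytes: 739\r\n"
       b"Host: localhost\r\n"
       b"Connection: close\r\n\r\n")
print(parse_rate_request(raw))
# {'user': 'testuser', 'uri': '/testdir/', 'bytes': 739}
```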

If I do a POST though, nginx isn’t happy.

root@robmlinux:/home/mod_perl/hm#
2008/03/07 13:20:46 [warn] 23198#0: *352 a client request body is buffered to a temporary file /var/accelcache/0/00/0000000000, client: 192.168.110.1, server: xyz, request: "POST /testdir/ HTTP/1.1", host: "www.testmachine.com", referrer: "http://www.testmachine.com/testdir/"
2008/03/07 13:20:52 [crit] 23198#0: *352 pread() failed, file "/var/accelcache/0/00/0000000000" (9: Bad file descriptor) while sending request to upstream, client: 192.168.110.1, server: xyz, request: "POST /testdir/ HTTP/1.1", upstream: "http://127.0.0.1:2350/testdir/", host: "www.testmachine.com", referrer: "http://www.testmachine.com/testdir/"
2008/03/07 13:20:52 [crit] 23198#0: *352 pread() failed, file "/var/accelcache/0/00/0000000000" (9: Bad file descriptor) while sending request to upstream, client: 192.168.110.1, server: xyz, request: "POST /testdir/ HTTP/1.1", upstream: "http://127.0.0.1:2350/testdir/", host: "www.testmachine.com", referrer: "http://www.testmachine.com/testdir/"
2008/03/07 13:20:52 [crit] 23198#0: *352 pread() failed, file "/var/accelcache/0/00/0000000000" (9: Bad file descriptor) while sending request to upstream, client: 192.168.110.1, server: xyz, request: "POST /testdir/ HTTP/1.1", upstream: "http://127.0.0.1:2350/testdir/", host: "www.testmachine.com", referrer: "http://www.testmachine.com/testdir/"

Using “proxy_pass” is clearly the wrong thing here because I don’t actually want to proxy the data again; it was just a hack to try. Really what I want is a simple way to say “just send a GET request to this server with these headers and ignore the result, since there shouldn’t be any”. Is there any way to do that so I can skip FCGI totally in the post_action?

Rob

> And indeed for GET requests it works nicely; I get the headers I want
> and can quickly and easily decode them. In fact I don’t really need to
> set RateURI, since the first line of the request gives me the URI.
>
> If I do a POST though, nginx isn’t happy.

Playing around some more, I found I could do this:

location @done {
  set $rateuser $upstream_http_x_rate_user;

  proxy_set_header RateUser  $rateuser;
  proxy_set_header RateURI   $request_uri;
  proxy_set_header RateBytes $body_bytes_sent;

  proxy_pass_request_body off;
  proxy_pass_request_headers off;
  proxy_pass http://unix:/var/state/ratetrack/ratepostaction:/;
}

Adding the “proxy_pass_request_body off” makes everything work nicely, and I added “proxy_pass_request_headers off” because I don’t really need that information, so I might as well not send it. So this all seems to work.

So is this the best/right way of doing this? I mean it currently seems to work, and to do what I want. However I realise that I’m abusing the “proxy” system for something it wasn’t designed to do, so I don’t know if this will break with future versions, or if there is a better way of doing this?

Also, what actually happens to the content that proxy_pass returns in this case? I really want it to just “disappear”, and for the moment it does seem to do that, but is that guaranteed?

However on top of that, I’m also worried about something else. I did some testing where I set up the following situation:

  1. Only 1 nginx worker process
  2. Many backend post_action “http done handler” processes
  3. The “http done handler” would sleep 30 seconds before returning the HTTP response

I wanted to test that if the “http done handler” code I wrote got slow for some reason, it didn’t affect all of nginx.
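
For reference, the slow “http done handler” in step 3 can be simulated with something like this (a Python sketch, illustrative only — the real handler is part of our Perl daemon):

```python
# Sketch of a deliberately slow "http done handler": accept post_action
# requests and sleep before answering. Illustrative only -- the real
# handler is part of our Perl/Net::Server daemon.
import socket
import threading
import time

def handle(conn: socket.socket, delay: float) -> None:
    conn.recv(65536)                      # read (and ignore) the request
    time.sleep(delay)                     # simulate a slow handler
    conn.sendall(b"HTTP/1.0 204 No Content\r\n\r\n")
    conn.close()

def slow_done_handler(port: int, delay: float = 30.0) -> None:
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(16)
    while True:
        conn, _addr = srv.accept()
        # One thread per request, so the backend itself never serialises
        # anything -- any blocking observed is happening inside nginx.
        threading.Thread(target=handle, args=(conn, delay)).start()
```

Since each request gets its own thread, the backend always has capacity; any serialisation has to be coming from nginx itself.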

What I found was that the first request worked, but if I did a second request within 30 seconds, it would block (browser just spinning) until the 30 seconds expired. It seems anything that happens in the post_action handler blocks any new connections from being processed. And as mentioned, it’s not because there weren’t any other free “http done handler” processes.

I repeated the process, and straced nginx to see what was happening.

about to do a request

epoll_wait(11, {{EPOLLIN, {u32=3081252872, u64=3081252872}}}, 512, -1) = 1
gettimeofday({1204859394, 326281}, NULL) = 0
accept(5, {sa_family=AF_INET, sin_port=htons(2934), sin_addr=inet_addr("192.168.110.1")}, [16]) = 3

all the proxying to/from the backend to the client …

connect(4, {sa_family=AF_FILE, path="/var/state/ratetrack/ratepostaction"}, 110) = 0
getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
writev(4, [{"GET / HTTP/1.0\r\nRateUser: testuser\r\nRateURI: /testdir/\r\nRateBytes: 739\r\nHost: localhost\r\nConnection: close\r\n\r\n", xyz}], 1) = xyz
epoll_wait(11, {{EPOLLOUT, {u32=3081253209, u64=13233881762136338777}}}, 512, 60000) = 1
gettimeofday({1204859394, 498342}, NULL) = 0
epoll_wait(11,

got here and would have waited 30 seconds for the backend to respond, but before that happened I did another web request with the browser to the same uri…

{{EPOLLIN|EPOLLOUT, {u32=3081253125, u64=13831246206168740101}}}, 512, 59972) = 1
gettimeofday({1204859397, 533816}, NULL) = 0
epoll_wait(11,

nginx in no way attempts to handle the new request. It just keeps waiting for the backend post_action handler to respond and finish; only then does it handle the new request.

This seems like a bug to me. A post_action handler shouldn’t be able to block the handling of new connections to nginx.

Rob

On Fri, Mar 07, 2008 at 02:17:37PM +1100, Rob M. wrote:

> And indeed for GET requests it does nicely, I get the headers I want
>   proxy_set_header RateURI  $request_uri;
>
> So is this the best/right way of doing this? I mean it currently seems to
> work, and to do what I want. However I realise that I’m abusing the “proxy”
> system for something it wasn’t designed to do, so I don’t know if this will
> break with future versions, or if there is a better way of doing this?

“proxy_pass_request_body off” is the right way to do this. Like proxy_pass_request_headers, it was created for including banners via SSI.

> Also what actually happens to the content that proxy_pass returns in this
> case? I really want it to just “disappear”, and for the moment it does seem
> to do that, but is that guaranteed?

Yes, nginx does not send post_action content to a client.

> writev(4, [{"GET / HTTP/1.0\r\nRateUser: testuser\r\nRateURI:

> the handling of new connections to nginx.
post_action does not block new connections, but it blocks the current connection. nginx handles post_action in the context of the request and connection, so it does not close the connection to the client before going to post_action. And “keepalive_timeout 0” will not help.

If you run another browser, you will get the response immediately.

> But rather than actually keeping the connection open, it immediately
> closes it. This seems to be annoying some clients.
>
> Again, if you remove the post_action call, it does keep the connection
> open fine.

Not being an expert on nginx code, for now I’ve added the patch below to my nginx install. Basically it just stops the post_action handler being called for anything but 200 or 206 responses, which is a bit evil, but seems to fix the disconnect issue. Hmmm, I didn’t really look into this very hard though; it might be causing a memory leak as well, I think I need to double check that…

If you have a moment Igor, some idea of what the proper fix for this would be would be appreciated.

Rob


--- nginx-0.5.35.orig/src/http/ngx_http_request.c	2008-03-18 01:39:29.000000000 +0000
+++ nginx-0.5.35/src/http/ngx_http_request.c	2008-03-18 01:40:28.000000000 +0000
@@ -1710,7 +1710,7 @@ ngx_http_finalize_request(ngx_http_reque
         r->request_complete = 1;
     }
 
-    if (ngx_http_post_action(r) == NGX_OK) {
+    if ((r->headers_out.status == NGX_HTTP_OK || r->headers_out.status == NGX_HTTP_PARTIAL_CONTENT) && ngx_http_post_action(r) == NGX_OK) {
         return;
     }

Hi Igor

> > Also what actually happens to the content that proxy_pass returns in
> > this case? I really want it to just “disappear”, and for the moment it
> > does seem to do that, but is that guaranteed?
>
> Yes, nginx does not send post_action content to a client.

I’ve found two issues with using this setup, both of which are rather
annoying. As a reminder, here’s the setup.


location / {
  proxy_pass                  http://backend/dav/;
  proxy_intercept_errors      off;
  proxy_next_upstream         off;

  post_action @ratepostaction;
}

location @ratepostaction {
  proxy_pass_request_body off;
  proxy_pass_request_headers off;
  proxy_pass http://unix:/var/state/ratetrack/ratepostaction:;
}

Problem 1

If the upstream returns an error (eg a 404 Not Found response), then the access log seems to log the result of the @ratepostaction proxy response, rather than the 404 response.

For example, here’s what a log line looks like if the backend returns a 404 response.

127.0.0.2 [18/Mar/2008:10:18:57 +1100] dav.messagingengine.com/Contents 200 489 dav.messagingengine.com "-" "WebDAVFS/1.4.1 (01418000) Darwin/8.11.1 (i386)" TIME=0.607 GZIP=-

If I comment out the post_action line above and rerun the same request:

127.0.0.2 [18/Mar/2008:10:20:05 +1100] dav.messagingengine.com/Contents 404 489 dav.messagingengine.com "-" "WebDAVFS/1.4.1 (01418000) Darwin/8.11.1 (i386)" TIME=0.049 GZIP=-

Problem 2

If the upstream returns an error (eg 404), then nginx returns that message to the client as follows:


HTTP/1.1 404 Not Found
Server: nginx/0.5.35
Date: Mon, 17 Mar 2008 22:48:13 GMT
Content-Type: text/html; charset=iso-8859-1
Transfer-Encoding: chunked
Connection: keep-alive

11e
<HTML><HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD><BODY>
<H1>Not Found</H1>
The requested URL /dav/Contents was not found on this server.
<HR>
<ADDRESS>Apache/1.3.34 Server at dav.messagingengine.com Port 80</ADDRESS>
</BODY></HTML>

0


But rather than actually keeping the connection open, it immediately closes it. This seems to be annoying some clients.

Again, if you remove the post_action call, it does keep the connection open fine.

Rob