Errorlog shows GET timeout; Accesslog shows POST

Hello!

Background

We’ve implemented a lightweight APIGW on nginx 1.9.9 using openresty
packages to customize the handling/proxying logic. We have dozens
clients
that are able to leverage this implementation just fine.

The Issue

We have one client that is experiencing some weird issues when trying to
access our APIGW. This is only for certain requests coming from their
app.
We strongly believe it’s a malformed request from their side but we’re
trying to help them find the issue. The behavior we see on our side is
quite strange as well. The error.log will show a GET request that timed
out. The access.log shows a POST request for the same request. Below
are
the two entries; the response time always seems to be around 5 seconds;
but
lowest timeout explicitly set in our configuration is 10 seconds.

Error log entry

2016/05/18 11:34:51 [info] 13516#0: *304290 client prematurely closed
connection, client: xx.xxx.xx.xxx, server: , request: “GET /hidden_uri
HTTP/1.1”, host: “blahblahblah.com

Access log entry

|LOCATION:GATEWAY|CN:NA|SSLPROTOCOL:TLSv1|SSLCIPHER:ECDHE-RSA-AES256-SHA|SERVICE:NA|VERSION:1|CLIENT:NA|BACKEND:|HTTPMETHOD:POST|ACCEPT:application/json;
v=4|OPERATION:NA|RESPONSETIME:5005|STATUS:400|SEVERITY:NA|STATUSCODE:NA|STATUSMESSAGE:NA|CLIENTIPADDRESS:xx.xxx.xx.xxx|CLIENTMESSAGEID:NA|MESSAGEID:|REQUESTBODYSIZE:0|RESPONSEBODYSIZE:0

Configuration for access log

‘$timestamp|LOCATION:$location_name|CN:$cn|SSLPROTOCOL:$ssl_protocol|SSLCIPHER:$ssl_cipher|SERVICE:$service_name|VERSION:$version|CLIENT:$upstream_api_key|BACKEND:$route_location|HTTPMETHOD:$upstream_method|ACCEPT:$accept|OPERATION:$operation|RESPONSETIME:$request_time_ms|STATUS:$status|SEVERITY:$severity|STATUSCODE:$status_code|STATUSMESSAGE:$status_message|CLIENTIPADDRESS:$clientip|CLIENTMESSAGEID:$clientmessageid|MESSAGEID:$messageid|REQUESTBODYSIZE:$content_length|RESPONSEBODYSIZE:$body_bytes_sent’;

Posted at Nginx Forum:

On Fri, May 20, 2016 at 03:45:15PM -0400, bclod wrote:

Hi there,

We’ve implemented a lightweight APIGW on nginx 1.9.9 using openresty
packages to customize the handling/proxying logic. We have dozens clients
that are able to leverage this implementation just fine.

I suspect that the details of the request/response, or at least the
details of what the request-data flow-response is supposed to be, will
be useful in finding and fixing the problem.

We have one client that is experiencing some weird issues when trying to
access our APIGW. This is only for certain requests coming from their app.
We strongly believe it’s a malformed request from their side but we’re
trying to help them find the issue.

Do you know what your side expects the request to be? (your API design.)

Can you see what request your side receives? (tcpdump/ssldump, or
perhaps
nginx-side extra logging.)

If they differ, that points at where the problem is.

The behavior we see on our side is
quite strange as well. The error.log will show a GET request that timed
out.

The error log entry shows that nginx thinks that the requesting client
closed the connection before nginx was ready to close the connection.

That might be “timed out”, or might be “error”.

The access.log shows a POST request for the same request. Below are
the two entries; the response time always seems to be around 5 seconds; but
lowest timeout explicitly set in our configuration is 10 seconds.

The access log seems very customised to your site; and includes
some variables that are not nginx-default. Perhaps they are
openresty-default? (Google doesn’t make it look like they obviously
are.)

You say “timeout explicitly set in our configuration”. If it was the
client that closed the connection because its idea of a timeout passed,
then your configuration is irrelevant unless it sets the client-side
timeout.

|LOCATION:GATEWAY|CN:NA|SSLPROTOCOL:TLSv1|SSLCIPHER:ECDHE-RSA-AES256-SHA|SERVICE:NA|VERSION:1|CLIENT:NA|BACKEND:|HTTPMETHOD:POST|ACCEPT:application/json;

HTTPMETHOD there is $upstream_method, which is not (by default) what
the client sent to nginx.

I think that there’s not enough information here to allow someone else
discover the problem. If you can make a reproducible failure case,
that will probably make it easier to discover.

Good luck with it,

f

Francis D. [email protected]

Francis,

You are totally right! I forgot that that we were using some crap logic
for
the variables; we are switching this to $request_method (the actual
built in
variable). Your notes confirm to me what I was thinking all along: that
the
client is likely closing the connection after 5 seconds. We are doing a
debug session today with tcpdump to get more details.

Thanks for setting me straight!

Posted at Nginx Forum: