Filtering out non-HTTP requests (400 errors)

gnosek · January 11, 2010, 5:29pm

Hi all,

I’m having a silly problem I’d rather ask about before spending hours.

I have a bunch of Nginxes serving a webservice behind a haproxy, whose
TCP checks (quite naturally) cause lots of 400 errors in access.log and
I’d like to filter them out. However, I cannot simply drop e.g. requests
without a Host: header as the header is universally ignored both by the
clients and the servers, so I can’t really tell what’s in it (and
frankly I don’t care too much as there are no vhosts there).

So, is there a simple way to filter out non-HTTP requests from the
access log? http://wiki.nginx.org/HWLoadbalancerCheckErrors doesn’t seem
to work unfortunately (I tried something along this way already and
checked this exact config now).

The machines are running Nginx 0.6.39-2.el5 (straight from EPEL) and I’d
rather keep it that way (otherwise I’d probably have hacked it out of
the source by now ;))

Best regards,
Grzegorz N.

gnosek · January 11, 2010, 5:53pm

Hi Grzegorz,
this isn’t the answer you were looking for, but why don’t you simply use
“HTTP health-checks” (HAProxy: option httpchk) instead of useless “TCP
health-checks”?
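
For the nginx side of such an HTTP check, a minimal sketch might look like the following. The /haproxy-check path is a made-up name, and the matching HAProxy line (something like option httpchk GET /haproxy-check) is an assumption about the setup, not taken from any config in this thread.

```nginx
server {
    listen 80;

    # Hypothetical endpoint reserved for the load balancer's HTTP check.
    location = /haproxy-check {
        access_log off;   # keep health checks out of access.log
        return 204;       # empty success response, no body needed
    }
}
```

Since the check then arrives as a well-formed request to a known URI, it no longer shows up as a 400 and can be silenced per location.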

Best regards,
Piotr S.

gnosek · January 11, 2010, 6:23pm

On Mon, Jan 11, 2010 at 8:28 AM, Grzegorz N. wrote:

So, is there a simple way to filter out non-HTTP requests from the
access log? http://wiki.nginx.org/HWLoadbalancerCheckErrors doesn’t seem
to work unfortunately (I tried something along this way already and
checked this exact config now).

I have this as well, but from some hardware load balancer our provider
uses.

I want to have it basically say:

    if ($remote_addr ~* 10.2.34.3) {
        access_log off;
    }

Or something of that nature. Note: that is pseudocode, I probably
messed up the ~* but I just woke up. It was an example.

I think I tried this and it didn’t work (and I’m sure Grzegorz tried it
too)
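
On much newer nginx releases (1.7.0 and later, so not the 0.6.39 discussed in this thread), the same intent can be expressed without if, using the if= parameter of access_log. A sketch, assuming the load balancer address 10.2.34.3 from the example above; the $loggable variable name and the log path are illustrative:

```nginx
# http context
geo $loggable {
    default    1;
    10.2.34.3  0;   # load balancer address: do not log
}

server {
    listen 80;

    # A request is skipped when $loggable evaluates to "0" or an empty string.
    access_log /var/log/nginx/access.log combined if=$loggable;
}
```

This only helps when the checks come from an address that real clients never use.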

gnosek · January 11, 2010, 8:47pm

On Mon, Jan 11, 2010 at 9:22 AM, Michael S. wrote:

I want to have it basically say:

Not many directives work well inside if, aside from those defined in the
same module. The only way I can think of to do this without hacking code
is with some internal servers that you proxy to conditionally, based on
IP address. It’s dirty as hell, though XD – example below (untested, so
consider it pseudoconfig):

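    # Internal backend that logs normally.
    upstream good {
        server 127.0.0.1:1234;
    }

    # Internal backend for load balancer checks (no logging).
    upstream bad {
        server 127.0.0.1:1235;
    }

    # Public-facing server: only picks a backend, logs nothing itself.
    server {
        listen 80;
        server_name main_server;
        server_name_in_redirect off;
        access_log off;
        set $backend good;
        # Exact match on the load balancer address; an unanchored regex
        # with unescaped dots would also match other addresses.
        if ($remote_addr = 10.2.34.3) {
            set $backend bad;
        }
        location / {
            proxy_pass http://$backend;
        }
    }

    server {
        listen 127.0.0.1:1234;
        server_name good;
        … good config …
    }

    server {
        listen 127.0.0.1:1235;
        server_name bad;
        … bad config …
    }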

– Merlin

gnosek · January 12, 2010, 10:14am

On Mon, Jan 11, 2010 at 05:52:29PM +0100, Piotr S. wrote:

Hi Grzegorz,
this isn’t the answer you were looking for, but why don’t you simply use
“HTTP health-checks” (HAProxy: option httpchk
) instead of useless “TCP health-checks”?

Basically due to development by copy-paste (we simply extended the
haproxy config by another port). I guess moving to http checks is
the way to go in the long run. OTOH, I’m curious about what people will
come up with.

Best regards,
Grzegorz N.

gnosek · January 12, 2010, 5:13pm

I guess moving to http checks is the way to go in the long run.

Yes it is. IMHO “TCP checks” give you a false sense of safety, because you
think that everything works fine, while they only verify that the
connection is accepted (on the TCP level) by the server… Proper checks
must issue a command at the application level and parse the server’s
response, otherwise they are useless.

BTW, I tried messing with $request and $request_uri but without success
(I expected either an empty string or a dash, as in access.log).

A dash in the logs means that the given value wasn’t present
(v->not_found = 1). Also, since we are talking about 400 Bad Request,
you can’t really expect $request_uri to be set.
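
Because $status is still set for such requests, newer nginx versions (1.7.0 and later, which rules out 0.6.39) could key the filter on it instead. A rough sketch; $loggable and the log path are illustrative names, and this has not been tested against HAProxy’s TCP checks:

```nginx
# http context
map $status $loggable {
    default  1;
    400      0;   # drop all 400 Bad Request entries
}

server {
    listen 80;
    access_log /var/log/nginx/access.log combined if=$loggable;
}
```

Note that this hides every 400, including ones caused by genuinely broken clients.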

Best regards,
Piotr S.

gnosek · January 12, 2010, 10:19am

On Mon, Jan 11, 2010 at 09:22:32AM -0800, Michael S. wrote:

I want to have it basically say:

    if ($remote_addr ~* 10.2.34.3) {
        access_log off;
    }

Or something of that nature. Note: that is pseudocode, I probably
messed up the ~* but I just woke up. It was an example.

I think I tried this and it didn’t work (and I’m sure Grzegorz tried it too)

Actually I didn’t because all the clients are running on the same
machines as the haproxy servers. haproxy is used mostly for HA and
abstracting the configuration somewhat (every machine has its private
haproxy and clients there connect to 127.0.0.1). So I guess I can’t use
any techniques that usually accompany load balancer detection due to
100% false positive rate.

BTW, I tried messing with $request and $request_uri but without success
(I expected either an empty string or a dash, as in access.log).

Best regards,
Grzegorz N.