Collapsing Empty Variables in Nginx Access Log When Using TAB Delimiter

I’m noticing that Nginx’s log module is “collapsing” empty log variables
primarily when using a TAB (\t) as a delimiter in the access log.

For example, if I use the following:

log_format main
'$server_name $msec $remote_addr $cookie_x $cookie_y $cookie_z $status';

And a request does not have any cookies set (so $cookie_x, $cookie_y,
and $cookie_z would be empty), then I would get a log line like this:

myserver.com 1331049809.478 10.10.15.50 - 200

I was expecting:

myserver.com 1331049809.478 10.10.15.50 - - - 200

The same thing happens with the $http_referer (when not set by the UA)
and $upstream_addr (when serving static files).

Here’s a link to my access log configuration:

I’m using Nginx v1.1.16.

Has anyone experienced this before?

Posted at Nginx Forum:

Hello!

On Tue, Mar 06, 2012 at 02:00:45PM -0500, adamchal wrote:

> Nginx Collapsing Empty Variables in Access Log · GitHub
>
> I’m using Nginx v1.1.16.
>
> Has anyone experienced this before?

The number of tabs should match your log_format exactly, though some
variables may be logged as an empty string (“”), not as a dash
(“-”), if they are set but empty. This may happen, e.g., if
requests have empty cookies, i.e. “Cookie: x=; y=; z=” in the request
headers.

This shouldn’t happen with $upstream_addr though, and if you are
sure you see this happening with $upstream_addr and/or there are
missing tabs, you may want to debug this further.

Maxim D.

OK, I just knocked out 50 push-ups as punishment for this. There’s
absolutely nothing wrong with the logging module. Apparently, the Mac
OS X Terminal was messing this up while I was copying and pasting into
TextMate. I ran the logs through cut as well as a custom line parser
and it’s working perfectly fine. I apologize for the confusion, but
trust me that I actually blew a day and a half wrestling with this
before I figured out that it was an issue with the copying and pasting
from Terminal.
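
For anyone who wants to run the same kind of check without going through a terminal copy and paste, here is a minimal sketch of a line parser in Python. It is not the parser mentioned above; the function names and sample lines are made up for illustration, and it assumes the seven fields of the log_format are separated by literal TAB characters.

```python
# Assumption: seven fields, one per variable in the log_format above,
# separated by literal TAB characters.
EXPECTED_FIELDS = 7


def split_fields(line):
    """Split one log line on TAB only, so empty fields are preserved."""
    return line.rstrip("\n").split("\t")


def has_expected_fields(line, expected=EXPECTED_FIELDS):
    """Return True if the line has exactly the expected number of fields."""
    return len(split_fields(line)) == expected


# Hypothetical sample lines: cookies not sent ("-") and cookies sent but empty ("").
no_cookies = "myserver.com\t1331049809.478\t10.10.15.50\t-\t-\t-\t200\n"
empty_cookies = "myserver.com\t1331049809.478\t10.10.15.50\t\t\t\t200\n"

for sample in (no_cookies, empty_cookies):
    print(len(split_fields(sample)), has_expected_fields(sample))
```

Splitting on TAB explicitly matters here: splitting the second sample on any whitespace (line.split() with no argument) merges the consecutive tabs and yields only four fields, which looks just like the "collapsing" described at the top of the thread.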

On another note, it would be cool if we could set a default character
for NULL/empty variables. For example, instead of the ‘-’ character, I
would prefer that nothing at all be written, since I’m using a TSV
format. This way, I wouldn’t have to preprocess the logs and strip out
the ‘-’ when importing to a DB store, etc.
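
Until something like that exists, the preprocessing step can be kept small. Below is a rough sketch in Python, assuming TAB-separated lines and that a field consisting of exactly "-" always means "not set". The function name is hypothetical, and a field whose real value happens to be "-" would be blanked as well.

```python
def strip_placeholders(line, placeholder="-"):
    """Replace fields that are exactly the placeholder with an empty string.

    Assumption: fields are TAB-separated and a lone "-" never occurs as
    a real value.
    """
    fields = line.rstrip("\n").split("\t")
    cleaned = ["" if field == placeholder else field for field in fields]
    return "\t".join(cleaned)


# Hypothetical sample line with three unset cookies.
sample = "myserver.com\t1331049809.478\t10.10.15.50\t-\t-\t-\t200\n"
print(repr(strip_placeholders(sample)))
```

Only whole fields are compared, so a "-" inside a value (in a hostname, for example) is left alone and the number of tabs per line does not change.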

Yeah, it would be awesome if I could replace the ‘-’ with just an empty
string. Does anyone have a good idea of how to implement this? The line
that currently sets ‘-’ as the NULL character in the logs is:

src/http/modules/ngx_http_log_module.c:675

I thought about just writing a patch for it, but I’m not sure if that
would mess things up in the future. Even still, what’s the best way of
doing this?

What I’ve done for now is replace lines 675 and 676 of
src/http/modules/ngx_http_log_module.c:

674:     if (value == NULL || value->not_found) {
675:         *buf = '-';
676:         return buf + 1;
677:     }

with:

674:     if (value == NULL || value->not_found) {
675:         *buf = 0;
676:         return buf;
677:     }

I hope this won’t cause any issues. It seems to be working very nicely.
Here’s a sed one-liner if anyone is interested in doing the same
thing:

sed -n '1h;1!H;${;g;s/buf = \W-\W;\(\s*return buf\) + 1;/buf = 0;\1;/g;p;}' \
    src/http/modules/ngx_http_log_module.c > \
    src/http/modules/ngx_http_log_module.c.new && \
    mv -f src/http/modules/ngx_http_log_module.c.new \
    src/http/modules/ngx_http_log_module.c

It would be nice in the future to be able to set the “NULL” character
for the log module.
