Nginx Log Parser

Hello Everyone,

First, pardon me if this has already been discussed in Russian.
Second, thank you Igor for this wonderful software and its amazing speed
and stability.

After googling and searching around, I have not found a single page on
how to take advantage of nginx cache logging, or on tools to parse that
data. So I decided to create this topic, first to check what everyone
else is using and then, if there is nothing ready for production, to
build it myself and share it here :-).

  1. Common basic web stats

By default the nginx combined log format is identical to Apache’s, so out
of the box anyone can generate graphs of nginx traffic with any software
that supports Apache logs: AWStats, Webalizer, etc. This is not a problem
in nginx; setting up these tools for nginx logs is no different from
setting them up for Apache logs.

  2. Web stats of cache usage

The wiki page for ngx_http_upstream_module documents how to add useful
caching information to the log:
upstream_cache_status (hit, miss, expired, stale, updating, etc.)
upstream_status (status returned by Apache)
upstream_response_time (time taken by the backend to process the request)
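
To make the idea concrete, here is a rough Python sketch of tallying cache statuses from an access log. It assumes (this is my assumption, not something from the wiki) that the log format is the standard combined format with $upstream_cache_status, $upstream_status and $upstream_response_time appended as three space-separated fields at the end of each line. The sample lines are made up. Requests that pass through several upstreams log comma-separated values in the upstream fields, which this simple pattern does not handle.

```python
import re
from collections import Counter

# Assumed layout: nginx "combined" plus three trailing fields:
#   $upstream_cache_status $upstream_status $upstream_response_time
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<cache>\S+) (?P<upstream_status>\S+) (?P<upstream_time>\S+)$'
)


def cache_status_counts(lines):
    """Count occurrences of each cache status; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            counts[match.group("cache")] += 1
    return counts


# Made-up sample lines in the assumed format.
sample = [
    '203.0.113.7 - - [12/Oct/2010:10:00:01 -0300] "GET / HTTP/1.1" 200 512 "-" "curl/7.21" HIT - -',
    '203.0.113.8 - - [12/Oct/2010:10:00:02 -0300] "GET /a HTTP/1.1" 200 1024 "-" "curl/7.21" MISS 200 0.123',
    '203.0.113.9 - - [12/Oct/2010:10:00:03 -0300] "GET / HTTP/1.1" 200 512 "-" "curl/7.21" HIT - -',
]

if __name__ == "__main__":
    print(cache_status_counts(sample))
```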

However, I’ve been completely unable to find a single reference to
anyone having this information processed nicely by log analyzers
allowing administrators to see nginx cache stats.

What are other people using for this? I could not find any hack to show
this info in AWStats, Webalizer or Visitors (hping). The only thing I
found is a script made by a guy named Eliot from saltycrane, which uses
Python to export log files into MongoDB and then uses
Orbited + STOMP + jQuery to plot that data as graphs.
(There is a presentation about it titled "Live Log Analyzer".)

  3. Live status of nginx for server monitoring/alert software like

Another thing I’m looking for is a way to get live data from nginx to
pipe into my monitoring software (Zabbix).
I’m currently using 0.7.67, so I can’t use the new status module. However,
AFAIK the status module does not provide information per virtual host, only
nginx-wide stats, so even by parsing that status I won’t be able to find
the culprit domain for a traffic peak until I parse that domain’s log
offline with other software.

One way to do this would be to parse the log files every minute, reading
only the last 60 seconds of activity. However, I’m afraid the solution may
turn out worse than the problem, as it will generate load on the server
every minute. Has anyone already done something like this for nginx cache
logs?
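
One way to keep the per-minute cost low would be to remember how far the previous run got and read only what was appended since then, instead of rescanning the whole file. Below is a minimal Python sketch of that idea. The function name and the offset file are hypothetical, and the rotation check (treating a file smaller than the saved offset as rotated) is a simplifying assumption.

```python
import os


def read_new_lines(log_path, offset_path):
    """Return complete lines appended to log_path since the previous call."""
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0  # first run, or unreadable offset file

    # If the log is now smaller than the saved offset, assume it was
    # rotated and start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Keep only complete lines; a partially written last line is left
    # in place and picked up on the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()

    with open(offset_path, "w") as f:
        f.write(str(offset + end))
    return lines
```

Run from cron once a minute, each call only touches the bytes written since the last call, so the load should scale with the traffic of that minute rather than with the size of the log.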

Well… I hope this will be a very long and productive topic :-)

Best Regards from Uruguay

Guzmán Brasó

Posted at Nginx Forum:

I implemented cache stats in Zabbix by parsing the access logs every
minute using logtail with an offset file, and submitting value counts
for $upstream_cache_status to the Zabbix server trapper via
zabbix_sender. Our RPS is 50 or less, so processing this log every
minute isn’t really a big deal yet.
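
For anyone wanting to try the submission step, here is a small Python sketch that turns a set of status counts into the "host key value" lines that zabbix_sender accepts through its -i input-file option. The host name, the item key pattern nginx.cache[...] and the function name are hypothetical. Matching trapper items would have to exist on the Zabbix server, and this is not the exact script described above.

```python
def zabbix_sender_lines(host, counts, key_template="nginx.cache[{status}]"):
    """Format per-status counts as 'host key value' lines, sorted by status."""
    return [
        f"{host} {key_template.format(status=status)} {value}"
        for status, value in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Made-up counts for one minute of traffic.
    for line in zabbix_sender_lines("web01", {"HIT": 42, "MISS": 7, "EXPIRED": 1}):
        print(line)
```

The output could be written to a file and passed along as something like zabbix_sender -z <server> -i <file>.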
