Question about traffic statistics on each vhost: is it possible?

Hi, I want to write a module to collect traffic statistics for every
vhost, but after reading the "Guide to Nginx Module Development"
(http://www.riceonfire.org/emiller/nginx-modules-guide.html) and some
code in /nginx-0.6.13/src/http/modules/, I still have no idea…

Is it possible to write a new module to do this work? Or is this feature
not supported by current nginx?

sinnis r wrote:

Hi, I want to write a module to collect traffic statistics for every
vhost, but after reading the "Guide to Nginx Module Development"
(http://www.riceonfire.org/emiller/nginx-modules-guide.html) and some
code in /nginx-0.6.13/src/http/modules/, I still have no idea…

Is it possible to write a new module to do this work? Or is this feature
not supported by current nginx?

Not sure, but you should be able to do this by writing a header filter
(for any per-Host setup) and a body filter to count the data
written.

See the ngx_http_chunked_filter module.

A possible “complication” is the fact that Nginx uses multiple worker
processes.

Regards Manlio P.

Any update on that? That would be a really good feature to have for
reselling bandwidth usage to customers.

It is in the feature request page, but doesn’t have any info:
http://wiki.codemongers.com/NginxFeatureRequests

Still no news on this module for traffic statistics? How do you guys
do it currently? Could I use Squid to monitor the traffic? I want to
be able to resell traffic usage.

Hello!

On Wed, Oct 01, 2008 at 06:21:11PM +0200, Thomas wrote:

Still no news on this module for traffic statistics? How do you guys
do it currently? Could I use Squid to monitor the traffic? I want to
be able to resell traffic usage.

Why not just log needed numbers ($bytes_sent) to access_log and
post-process it with anything you want?

Maxim D.
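
For anyone who wants to try the access_log route, here is a minimal sketch of the post-processing side in Python. It assumes a dedicated log whose lines hold only `$host` and `$bytes_sent`, separated by a space. The format name `vhost_traffic`, the log path and the function name `summarize` are made up for illustration and do not come from this thread.

```python
from collections import defaultdict

# Assumed (hypothetical) nginx setup, not taken from the thread:
#   log_format vhost_traffic '$host $bytes_sent';
#   access_log /var/log/nginx/traffic.log vhost_traffic;

def summarize(lines):
    """Return {host: [request_count, bytes_sent_total]} for the given log lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        host, sent = parts[0].lower(), int(parts[1])
        totals[host][0] += 1      # one more request for this host
        totals[host][1] += sent   # add the bytes sent for this request
    return dict(totals)

# Example with in-memory lines; a real run would pass an open log file instead.
sample = ["example.com 1200", "example.com 300", "other.example.org 50", "garbage"]
for host, (requests, sent) in sorted(summarize(sample).items()):
    print(host, requests, sent)
```

Since the function only iterates over lines, it never holds the whole log in memory, so it should cope with a few million lines per day.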

On Wed, Oct 1, 2008 at 9:43 AM, Maxim D. [email protected] wrote:

Hello!
Why not just log needed numbers ($bytes_sent) to access_log and post-process
it with anything you want?

I might be forced to do this right now. That means millions of lines
per day on 3 servers to combine, merge and calculate…

I’m sure it would be simple for someone to write a module that dumps
out Host: header stats to a file every so often (one per day) …
fingers crossed
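
On the "combine, merge and calculate" part: if each server first reduces its own log to a small per-host totals mapping, combining the three servers becomes cheap. A sketch of that merge step in Python, assuming the per-server totals already exist as plain `{host: bytes}` dictionaries (the name `merge_totals` and the sample hosts are illustrative only):

```python
from collections import Counter

def merge_totals(per_server_totals):
    """Sum {host: bytes} mappings that were produced independently on each server."""
    combined = Counter()
    for totals in per_server_totals:
        combined.update(totals)  # Counter.update adds to existing counts
    return combined

# Hypothetical daily totals from three servers.
server_a = {"example.com": 1000, "shop.example.com": 200}
server_b = {"example.com": 500}
server_c = {"shop.example.com": 50, "blog.example.com": 7}
for host, total in sorted(merge_totals([server_a, server_b, server_c]).items()):
    print(host, total)
```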

this is one thing i am hoping for

it would be nice to have number of bytes and number of http requests
per each Host: header

beefing up the options for monitoring and statistics has been
requested by many people… so hopefully someone can jump on that soon
:)

Hello!

On Wed, Oct 01, 2008 at 12:06:11PM -0700, mike wrote:

fingers crossed
It's not a simple "dump out Host: header": it's shared memory,
a mutex lock for each request, losing stats on binary upgrade,
etc, etc.

I see no reason why this can't be done from access_logs by a simple
Perl script, either running side-by-side with nginx but perfectly
restartable whenever you want, or run periodically at low-load
periods.

We used to run a similar script to calculate the average
$upstream_response_time for monitoring purposes, and as far as I
recall it took less than 1% CPU on a not-really-fast machine with
more than 10 million log lines per day.

Maxim D.

Why not just log needed numbers ($bytes_sent) to access_log and post-process
it with anything you want?

Maxim D.

Clever idea, I never thought that $bytes_sent + a custom script could do
the trick. Thanks a lot.

mike wrote:

script - either running side-by-side with nginx but perfectly restartable
customer’s bandwidth usage properly.

Previously there was a syslog patch sent to the list (which I hope might
one day get committed… here’s hoping…).

With a bit of ingenuity you can syslog to some other machine which
simply eats the syslog and computes some stats.

Alternatively it's not rocket science to write a little Perl daemon which
just tails your log file and generates stats. It might even work to
have nginx write to some kind of named pipe and have your Perl daemon
consume the log without spooling to disk at all (what happens if
something falls over, though… Writing to disk gives you some soft
coupling of systems).

Good luck

Ed W
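
To make the "tail the log file" idea concrete, here is a rough Python sketch of the one piece such a daemon needs: picking up only the lines appended since the last read. It is an illustration under assumptions, not a finished daemon. The function name `read_new_lines` is made up, and treating a shrinking file as "rotated" is a simplification.

```python
import os

def read_new_lines(path, offset):
    """Return (complete lines appended since byte offset, new offset)."""
    if os.path.getsize(path) < offset:
        offset = 0  # file shrank: assume it was rotated or truncated
    with open(path, "rb") as log:
        log.seek(offset)
        data = log.read()
    # Only consume up to the last newline so a half-written line is left
    # for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", "replace").splitlines()
    return lines, offset + end
```

A daemon would call this in a loop with a short sleep and store the offset somewhere persistent. Because the log on disk acts as the buffer, the reader can be stopped and restarted without losing lines, which is the soft coupling mentioned above.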

In my log_format declaration I have added $host in order to track the
bandwidth usage on a per-vhost basis; however, it doesn't get logged.
What could be going wrong?

On Thu, Oct 2, 2008 at 9:12 AM, Ed W [email protected] wrote:

over though… Writing to disk gives you some soft coupling of systems)
nah, i already thought of that, it’d be the same approach using
something like ulog-acctd and counting the actual bytes per ip…

a simple access log with only the information i need would be the best
option.

basically, unless the daemon itself is summarizing the info or
something, it makes no sense to me to create another pipe/daemon/etc.
that is being fed the info constantly; i'd rather just batch it once
per day at that point :P

thanks.

Sorry for the previous email, I was simply not naming the log file
correctly.

In the meantime I have spotted some very strange behaviour: there are
some Polish servers that are constantly trying to spam my server, and
one is especially annoying: grapx.pl

When I look into my log files, I see that the $host is set to
grapx.pl! How is that possible? The domain name is not pointing to my
server IP, because when I try to look at the grapx.pl webpage in Firefox
I get redirected to another crappy domain name, sedoparking.com. What
in hell is going on?

Also sometimes I see that my own server is trying to connect to its
nginx server, so I see $host set to 91.my.ip.address. Is that normal
behaviour?

On Wed, Oct 1, 2008 at 1:43 PM, Maxim D. [email protected] wrote:

Hello!

lines per day.
Hmm, well, I’ve tried to get away from having to parse logfiles… but
this is something I have to implement so I can account for my
customer’s bandwidth usage properly.

I don't just want $bytes_sent, though. Don't I want both sent and
received? If someone uploads a 50 MB file, the only way to count that is
to measure both sides of the request, right?
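
A sketch of how both directions could be counted from the log, in Python. As far as I know nginx has a `$request_length` variable covering the request line, headers and request body, so an upload would show up there; whether it is available in a given nginx version should be checked. The format name `vhost_io` and the function name `traffic_by_host` are made up for this example.

```python
from collections import Counter

# Assumed (hypothetical) format, not taken from the thread:
#   log_format vhost_io '$host $request_length $bytes_sent';

def traffic_by_host(lines):
    """Return (received, sent) Counters of bytes keyed by host."""
    received, sent = Counter(), Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # skip malformed lines
        host = parts[0].lower()
        received[host] += int(parts[1])  # bytes the client sent to us
        sent[host] += int(parts[2])      # bytes we sent to the client
    return received, sent

# A 50 MB upload appears on the received side, a download on the sent side.
sample = ["uploads.example.com 52428800 350", "uploads.example.com 420 10240"]
received, sent = traffic_by_host(sample)
for host in sorted(received):
    print(host, received[host], sent[host])
```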

I was just used to Zeus having a stats daemon that would periodically
(hourly, say) write out to a file each Host: header with its number
of requests and number of bytes, and I had a nightly job run by and
parse it. That was very quick since it was only a few hundred lines,
not a few million.

Anyway, I looked into using iptables and traffic accounting but it
looks like that will generate even more work and I can’t seem to get
the iptables rules to work with my LVS-NAT setup properly anyway.

Thanks for replying though :)

this really isn’t related to nginx… but if you have someone’s
website being pointed to you, put up some pay-per-click or google ads
on there and make a few bucks. if the site is actually used, they’ll
notice pretty quickly. otherwise, it’s free money for someone being
lazy.

This is getting bad! There is another Polish website that appears in
my $host list, and this bloody website is actually pointing to my
server!!! I can trace myself clicking around on their website, and I
see it appear in my nginx log files!!!

Who the f**k would want to do that? Are they trying to intercept some
of my customers? What’s happening!?

Yeah but I’d like to know why on Earth a lazy guy would point his
domain name to my IP address. I don’t want to be the victim of a hack
or exploit.

I have a returning customer system based on login+password, I guess
this lazy bastard is trying to intercept the credentials as the submit
of a form would redirect to the .pl website where his server could try
to figure out whether or not it is intercepting credentials or should
redirect to my website.

Sorry for being off topic, but this is a problem that any Nginx user
could encounter, so I was looking for someone's experience.

On Sat, Oct 04, 2008 at 09:04:31PM +0200, Thomas wrote:

in hell is going on?
This is probably an error in some scripts.

Also sometimes I see that my own server is trying to connect to its
nginx server, so I see $host set to 91.my.ip.address. Is that normal
behaviour?

If a request has no Host header, nginx sets $host to the server IP
address. It's better to log $http_host.
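
If the stats are built from $http_host, stray names like the ones above will end up in the per-host totals unless the post-processing filters them. A small Python sketch of one way to handle that, assuming a known list of vhost names and assuming (as far as I know) that a missing header is logged as "-". The function name `billing_key` and the "(other)" bucket are invented for illustration.

```python
def billing_key(raw_host, known_hosts):
    """Map a logged $http_host value to a known vhost, or to '(other)'."""
    host = raw_host.strip().lower()
    # Drop an explicit port such as example.com:8080.
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        host = name
    host = host.rstrip(".")  # treat "example.com." like "example.com"
    return host if host in known_hosts else "(other)"

known = {"www.site.com", "site.com"}
for raw in ["WWW.Site.com:80", "site.com.", "grapx.pl", "-"]:
    print(raw, "->", billing_key(raw, known))
```

Requests landing in "(other)" can then be counted separately instead of being billed to a customer.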

On Sat, Oct 04, 2008 at 09:15:08PM +0200, Thomas wrote:

This is getting bad! There is another polish website that appears in
my $host list, and this bloody website is actually pointing to my
server!!! I can trace myself clicking around in their website, and I
see it appear in my nginx log files!!!

Who the f**k would want to do that? Are they trying to intercept some
of my customers? What’s happening!?

I have no idea what they want (maybe some XSS).
You can prevent unwanted names using:

    server {
         listen 80 default;
         server_name  _;      # invalid name, catch all
         return 404;
    }

    server {
         listen 80;
         server_name  www.site.com
                      site.com
                      alias.site.com
                      your.ip.address
                      ""      # request without host header (0.7.12+)
                      ;
    }

    server {
         listen 80;
         server_name  other.site.com;
         ...
    }

Thank you Igor, the exploit has been fixed.