Masking IP-Addresses on logging?

Hey there,

some of you might know that storing full IP addresses in Germany is
getting trickier every day and is considered a legal problem.
Is there any way to store masked ip addresses in access logs with nginx
while retaining the possibility to use web analytics software?

thanks,

thomas

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,55369,55369#msg-55369

It’s humorous when politicians try to regulate the openness of the
Internet. Well, at least until it starts affecting you in ways like
this.

You can change the log format with something like this:

http {
    # log_format is only valid at the http level, not inside server {}
    log_format no_ip '0.0.0.0 - no-user [$time_local] '
                     '"$request" $status $body_bytes_sent '
                     '"$http_referer" "$http_user_agent"';

    server {
        access_log logs/access.log no_ip;
    }
}

See also http://wiki.nginx.org/NginxHttpLogModule

Note that in my example, I still write out a bogus IP (normally
$remote_addr) and “no-user” (normally $remote_user, which I’m guessing
might be problematic for you as well) so that existing log parsers
won’t trip on your new format.

As for analytics software, you should still be able to use it without
any problem – you’ll just lose the location info that’s normally
determined based on the IP. If you want to keep that, you should be
able to use the geo module and write the country code (or lat/long
maybe?) to your log file as long as your analytics package can parse
and properly handle the extra data.

Nick P. wrote:

It’s humorous when politicians try to regulate the openness of the Internet.

Humorous wouldn’t be my choice of words.

You can change the log format with something like this:

log_format no_ip '0.0.0.0 - no-user [$time_local] '

No. This will break most log-based analytics software, which uses the
source address to tell different visitors apart, in order to generate
‘visit’, ‘page’, and ‘hit’ statistics.

I guess the only feasible way to have some kind of plausible deniability
in court and still get your statistics is to use the result of a one-way
hash function as a fake IP address.

MD4 and MD5 would do a fine job, but they are probably way too heavy to
use on a web server. CRC-32 is lighter and outputs a 32-bit code, which
could be reformatted as a fake IP address.

I suggest writing a custom module that will compute the CRC-32 of
“$remote_addr$http_user_agent” (just to get some variance over the IP
address alone), then format it in the dotted-decimal form of IP
addresses and make it available as $fake_remote_address.
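
To make the idea concrete, here is a minimal sketch of that computation in Python rather than as an nginx module. The function name fake_remote_address is just borrowed from the variable name suggested above, and the plain string concatenation is an assumption about how the two values would be combined.

```python
import ipaddress
import zlib


def fake_remote_address(remote_addr: str, user_agent: str) -> str:
    """Map (client IP, User-Agent) to a stable, fake dotted-decimal address.

    Assumption: the two values are simply concatenated before the
    checksum, mirroring "$remote_addr$http_user_agent".
    """
    data = (remote_addr + user_agent).encode("utf-8")
    checksum = zlib.crc32(data) & 0xFFFFFFFF  # force an unsigned 32-bit value
    # A 32-bit integer maps directly onto the IPv4 address space.
    return str(ipaddress.IPv4Address(checksum))


if __name__ == "__main__":
    print(fake_remote_address("192.0.2.1", "ExampleBrowser/1.0"))
```

The same visitor always gets the same fake address, so log analyzers can still group requests into visits, while the real address never reaches the log.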

This won’t prevent anyone from reversing the code. No function will, in
this case, not even MD5, because you could always brute-force your way
through, as there are only 2^32 possible plaintexts. If you used the
User-Agent as an input to the hash function and did not store it in the
log, then it would get slightly better… But I guess CRC-32 will be
enough for any lawyer :)
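
To illustrate the brute-force point, here is a rough Python sketch. It assumes the fake address is the CRC-32 of the IP concatenated with the User-Agent, and that the User-Agent is stored in the log. The helper names crc32_of and candidate_addresses are made up for this example, and the search is limited to one network to keep it quick.

```python
import ipaddress
import zlib
from typing import List


def crc32_of(remote_addr: str, user_agent: str) -> int:
    """CRC-32 of the concatenated IP and User-Agent, as an unsigned int."""
    return zlib.crc32((remote_addr + user_agent).encode("utf-8")) & 0xFFFFFFFF


def candidate_addresses(fake_ip: str, user_agent: str, network: str) -> List[str]:
    """Return every address in `network` whose checksum equals `fake_ip`."""
    target = int(ipaddress.IPv4Address(fake_ip))
    return [
        str(ip)
        for ip in ipaddress.ip_network(network)
        if crc32_of(str(ip), user_agent) == target
    ]


if __name__ == "__main__":
    ua = "ExampleBrowser/1.0"
    logged = str(ipaddress.IPv4Address(crc32_of("198.51.100.77", ua)))
    # Anyone holding the log line (fake address + User-Agent) can do this:
    print(candidate_addresses(logged, ua, "198.51.100.0/24"))
```

Searching the whole IPv4 space is the same loop over "0.0.0.0/0"; it just takes longer.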

Tobia

On Thu, 2010-02-18 at 07:39 -0500, ts77 wrote:

Hey there,

some of you might know that storing full IP addresses in Germany is getting trickier every day and is considered a legal problem.
Is there any way to store masked ip addresses in access logs with nginx while retaining the possibility to use web analytics software?

Disclaimer: I’m not a lawyer and I’m not from Germany.

That being said, from my reading of the situation, I don’t think there’s
an issue with storing the IP address so long as it’s only kept as long
as actually needed (i.e. for billing purposes or in your case, log
analysis). Why not just do your analysis within some reasonable
time-frame and then delete or scrub the raw data?
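
As a sketch of the "scrub" step, something like the following could be run from cron once the analysis is done. It assumes the client address is the first whitespace-delimited field on each line, as in the default combined format. The names scrub_line and scrub_file are hypothetical.

```python
import re

# Assumption: every log line starts with the client address, followed by
# whitespace, as in nginx's default "combined" format.
_LEADING_ADDR = re.compile(r"^\S+")


def scrub_line(line: str, placeholder: str = "0.0.0.0") -> str:
    """Replace the leading address field with a placeholder."""
    return _LEADING_ADDR.sub(placeholder, line, count=1)


def scrub_file(src_path: str, dst_path: str) -> None:
    """Write a copy of the log at src_path with all client addresses removed."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(scrub_line(line))


if __name__ == "__main__":
    sample = '203.0.113.9 - - [18/Feb/2010:07:39:00 -0500] "GET / HTTP/1.1" 200 612 "-" "ExampleBrowser/1.0"'
    print(scrub_line(sample))
```

Writing to a new file and then replacing the original keeps a half-scrubbed log from being left behind if the script is interrupted.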

As to those who think that these laws are silly, I have three words for
you: Senator Joseph McCarthy. It really ought not to be necessary to
have such laws, but of course we in the information fields can’t be
trusted to not hoard data that puts people in peril. I applaud Germany
and the E.U. for putting their citizens’ privacy above corporate
interests.

Regards,
Cliff
