Ruby Forum NGINX > GeoIP: anyway to pull out lat/lon?

Posted by Phillip B Oldham (Guest)
on 06.05.2008 17:18
Attachment: phill.vcf (262 Bytes)
(Received via mailing list)
Hi guys

I've just got the geo module working, and I've geo2nginx.pl'd the
maxmind geolite country data which is working great. However, for our
application I need to get more information (specifically a lat/lon) back
from nginx to pass to PHP.

Any way I can do this? Or is the geo module too simple for this task?
--

*Phillip B Oldham*
The Activity People
phill@theactivitypeople.co.uk <mailto:phill@theactivitypeople.co.uk>

------------------------------------------------------------------------

*Policies*

This e-mail and its attachments are intended for the above named
recipient(s) only and may be confidential. If they have come to you in
error, please reply to this e-mail and highlight the error. No action
should be taken regarding content, nor must you copy or show them to 
anyone.

This e-mail has been created in the knowledge that Internet e-mail is
not a 100% secure communications medium, and we have taken steps to
ensure that this e-mail and attachments are free from any virus. We must
advise that in keeping with good computing practice the recipient should
ensure they are completely virus free, and that you understand and
observe the lack of security when e-mailing us.
Posted by Igor Sysoev (Guest)
on 06.05.2008 17:45
(Received via mailing list)
On Tue, May 06, 2008 at 04:06:06PM +0100, Phillip B Oldham wrote:

> I've just got the geo module working, and I've geo2nginx.pl'd the 
> maxmind geolite country data which is working great. However, for our 
> application I need to get more information (specifically a lat/lon) back 
> from nginx to pass to PHP.
> 
> Any way I can do this? Or is the geo module too simple for this task?

The geo module simply maps an IP address to a string. You may use any strings.
You would probably need to modify geo2nginx.pl to emit lat/lon as the value.
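
As a rough illustration of that suggestion, here is a minimal Python sketch (not geo2nginx.pl itself) that turns IP ranges with coordinates into geo entries of the form "network value;". The input layout (start_ip,end_ip,lat,lon), the sample rows, and the function name ranges_to_geo_lines are assumptions made for the example; real city-level data would need its own parsing.

```python
import csv
import io
import ipaddress


def ranges_to_geo_lines(rows):
    """Yield geo entries for (start_ip, end_ip, lat, lon) rows.

    The row layout is hypothetical; adapt it to the actual data file.
    """
    for start_ip, end_ip, lat, lon in rows:
        first = ipaddress.ip_address(start_ip)
        last = ipaddress.ip_address(end_ip)
        # A range may not align to one network, so split it into CIDR blocks.
        for net in ipaddress.summarize_address_range(first, last):
            # The value is quoted so that it is read as a single string.
            yield f'{net} "{lat},{lon}";'


# Made-up sample rows using documentation address ranges.
SAMPLE = """192.0.2.0,192.0.2.255,51.5000,-0.1300
198.51.100.0,198.51.100.127,40.7100,-74.0100
"""

geo_lines = list(ranges_to_geo_lines(csv.reader(io.StringIO(SAMPLE))))
for line in geo_lines:
    print(line)
```

The generated lines would go inside a geo block, and the resulting variable could then be handed to PHP, for instance through a fastcgi_param line; that wiring is not shown here.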
Posted by Phillip B Oldham (Guest)
on 07.05.2008 17:12
(Received via mailing list)
Looking into that further, the data I've got for lat/lon at city level
adds up to 120 MB. I could convert this to a geo.conf file using
geo2nginx.pl, but what sort of impact would this have on nginx running?
Would each child process take up 120 MB of RAM? Would nginx slow down for
each request, having to work through such a large data set?

Posted by Igor Sysoev (Guest)
on 07.05.2008 21:28
(Received via mailing list)
On Wed, May 07, 2008 at 03:59:06PM +0100, Phillip B Oldham wrote:

> Looking into that further, the data I've got for lat/lon at city level
> adds up to 120 MB. I could convert this to a geo.conf file using
> geo2nginx.pl, but what sort of impact would this have on nginx running?
> Would each child process take up 120 MB of RAM? Would nginx slow down for
> each request, having to work through such a large data set?

We are using a geo file with 141240 lines:

wc geo.conf
  141240  282480 2979471 geo.conf

Could you show yours? Also, could you show a couple of lines from the file?

Performance should not depend on file size if you have enough memory.
For the geo map, workers use the same memory inherited from the parent
process on a copy-on-write basis. As there are no writes to this memory,
it is never copied and stays shared. Also, duplicate values are stored
only once.

What does

awk '{print $2}' geo.conf | sort | uniq | wc -l

show ?
Posted by Phillip B Oldham (Guest)
on 08.05.2008 09:00
(Received via mailing list)
I've not created the geo.conf file yet from the city data, but will do
so today and do some checks.

My main worry is that nginx is running on a virtual machine with only
300 MB of RAM. So if I've got a file around the 100 MB mark and I've one
child, wouldn't that mean 200 MB of memory is being taken up? Even 100 MB
of memory is a lot just to get a lat/lon of the visitor.

Posted by Phillip B Oldham (Guest)
on 08.05.2008 10:33
(Received via mailing list)
Ok, I've done part of the conversion, and here's what I get:

# wc -l geo.conf
3937100 geo.conf

Which is a little larger than the 141240 you're using. Will such a large
file slow nginx down?

Posted by Phillip B Oldham (Guest)
on 08.05.2008 10:39
(Received via mailing list)
# awk '{print $2}' geo.conf | sort | uniq | wc -l
119313

So am I right in thinking that nginx only stores each unique value with
a mapping to the ip range(s)? If so, 119313 isn't too bad, and comes in
a shade under your list.

Posted by Igor Sysoev (Guest)
on 08.05.2008 11:18
(Received via mailing list)
On Thu, May 08, 2008 at 09:22:58AM +0100, Phillip B Oldham wrote:

> Ok, I've done part of the conversion, and here's what I get:
> 
> # wc -l geo.conf
> 3937100 geo.conf
> 
> Which is a little larger than the 141240 you're using. Will such a large 
> file slow nginx down?

As I understand it, it should not. nginx uses a radix tree for geo; the
first 6 or 7 bits are available in a single TLB miss and up to 6/7 data
cache misses. If most entries are /24 networks, then you will get 18/17
TLB and data cache misses in the worst case.
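
To illustrate why lookup cost follows the prefix length rather than the number of entries, here is a minimal Python sketch of a binary prefix tree with longest-prefix matching for IPv4. It is only an illustration of the general technique; nginx's own radix tree is written in C and laid out differently in memory, and the class and method names below are made up for the example.

```python
import ipaddress


class Node:
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = [None, None]  # child for bit 0 and for bit 1
        self.value = None             # set only where a network ends


class PrefixTree:
    """Binary trie over IPv4 address bits (illustrative, not nginx's code)."""

    def __init__(self):
        self.root = Node()

    def insert(self, cidr, value):
        net = ipaddress.ip_network(cidr)
        addr = int(net.network_address)
        node = self.root
        # Walk one level per prefix bit, creating nodes as needed.
        for i in range(net.prefixlen):
            bit = (addr >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.value = value

    def lookup(self, ip, default=None):
        addr = int(ipaddress.ip_address(ip))
        node, best, depth = self.root, default, 0
        # At most 32 steps, no matter how many networks are stored.
        while node is not None:
            if node.value is not None:
                best = node.value  # remember the longest match so far
            if depth == 32:
                break
            node = node.children[(addr >> (31 - depth)) & 1]
            depth += 1
        return best


tree = PrefixTree()
tree.insert("192.0.2.0/24", "51.50,-0.13")
tree.insert("192.0.2.128/25", "40.71,-74.01")
print(tree.lookup("192.0.2.5"))
print(tree.lookup("192.0.2.200"))
print(tree.lookup("203.0.113.1", "unknown"))
```

With entries that are mostly /24 networks, a lookup in this sketch walks about 24 levels whether the tree holds a thousand networks or several million.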

Posted by Igor Sysoev (Guest)
on 08.05.2008 11:23
(Received via mailing list)
On Thu, May 08, 2008 at 09:31:27AM +0100, Phillip B Oldham wrote:

> # awk '{print $2}' geo.conf | sort | uniq | wc -l
> 119313
> 
> So am I right in thinking that nginx only stores each unique value with 
> a mapping to the ip range(s)? If so, 119313 isn't too bad, and comes in 
> a shade under your list.

Yes, nginx stores unique values only.

But memory is also required for the radix tree itself.
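
The idea of storing each distinct value once can be sketched as follows. This is a simplified Python illustration under assumed names (intern_values and the sample entries are hypothetical), not a description of nginx's internal data structures: each network keeps only a small reference to a shared value, so value storage grows with the number of distinct values while the per-network structure still grows with the number of lines.

```python
def intern_values(entries):
    """Return (unique_values, refs) for an iterable of (network, value) pairs.

    refs pairs each network with an index into unique_values, so a value
    that appears on many lines is stored only once.
    """
    index_of = {}
    unique_values = []
    refs = []
    for network, value in entries:
        if value not in index_of:
            index_of[value] = len(unique_values)
            unique_values.append(value)
        refs.append((network, index_of[value]))
    return unique_values, refs


# Made-up sample: two networks share the same coordinates.
entries = [
    ("192.0.2.0/25", "51.50,-0.13"),
    ("192.0.2.128/25", "51.50,-0.13"),
    ("198.51.100.0/24", "40.71,-74.01"),
]
unique_values, refs = intern_values(entries)
print(len(entries), "entries,", len(unique_values), "unique values")
```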
