Forum: NGINX Rewrite lookup via MySQL table

Posted by btafoya (Guest)
on 2010-08-24 02:24
(Received via mailing list)
Any chance I can use a MySQL database table to do a lookup to return the
url to rewrite to?

Let me explain the situation first...

Magento utilizes rewrite rules heavily but does the majority of the
rewrite within the Front controller. The problem is that it is VERY
in-efficient and a store with more than 10,000 products is severely
delay. But, all information is referenced in a table 'core_url_rewrite'
so my though it so do the lookup first in Nginx and hand off to Magento
with the non-seo url.

Any idea/though would be GREATLY appreciated.

Brian Tafoya
Prema Solutions
www.premasolutions.com


Current environment:

Two dedicated servers, one web, one db, both with ridiculous amounts of
memory and high speed drives.

nginx
php-fpm
php 5.3.3
apc

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,122977#msg-122977
Posted by Michael Shadle (Guest)
on 2010-08-24 02:26
(Received via mailing list)
you could cron a dump of the database into nginx rewrite rules, and
then reload nginx...

otherwise, mysql lookups will block, and most nginx modules are
designed with non-blocking in mind
Posted by Piotr Sikora (Guest)
on 2010-08-24 02:44
(Received via mailing list)
Hi,

> Any chance I can use a MySQL database table to do a lookup to return the
> url to rewrite to?

Yes, you can do this with ngx_drizzle combined with ngx_lua.

agentzh will probably reply with working example soon enough ;)

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
Posted by agentzh (Guest)
on 2010-08-24 04:14
(Received via mailing list)
On Tue, Aug 24, 2010 at 8:25 AM, Michael Shadle <mike503@gmail.com> 
wrote:
>
> otherwise, mysql lookups will block, and most nginx modules are
> designed with non-blocking in mind
>

mysql lookups do not necessarily block if you do not use libmysql.

Our ngx_drizzle module can talk with mysql 5.1+ without blocking the
nginx process and it's been heavily used in production. See
http://github.com/chaoslawful/drizzle-nginx-module for more details.

Cheers,
-agentzh
Posted by agentzh (Guest)
on 2010-08-24 05:33
(Received via mailing list)
On Tue, Aug 24, 2010 at 8:22 AM, btafoya <nginx-forum@nginx.us> wrote:
> with the non-seo url.
>

Here's a sample configuration that has been tested on my laptop.

Consider you seo uri is /baz, the true uri is /foo/bar. And I have the
following table in my local mysql "test" database:

   create table my_url_map(id serial, url text, seo_url);
   insert into my_url_map(url, seo_url)values('/foo/bar', '/baz');

And I build my nginx 0.8.41 this way:

    ./configure \
                --add-module=/path/to/ngx_devel_kit \
                --add-module=/path/to/set-misc-nginx-module \
                --add-module=/path/to/ngx_http_auth_request_module-0.2 \
                --add-module=/path/to/echo-nginx-module \
                --add-module=/path/to/lua-nginx-module \
                --add-module=/path/to/drizzle-nginx-module \
                --add-module=/path/to/rds-json-nginx-module

Also, I have lua 5.1.4 and the lua-yajl library installed to my system.

And here's the central part in my nginx.conf:

    upstream backend {
        drizzle_server 127.0.0.1:3306 dbname=test
             password=some_pass user=monty protocol=mysql;
        drizzle_keepalive max=300 mode=single overflow=ignore;
    }

    lua_package_cpath '/path/to/your/lua/yajl/library/?.so';

    server {
          ...

        location /conv-mysql {
            internal;
            set_quote_sql_str $seo_uri $query_string; # to prevent sql 
injection
            drizzle_query "select url from my_url_map where 
seo_url=$seo_uri";
            drizzle_pass backend;
            rds_json on;
        }

        location /conv-uid {
            internal;
            content_by_lua_file 'html/foo.lua';
        }

        location /jump {
            internal;
            rewrite ^ $query_string? redirect;
        }

        # your SEO uri
        location /baz {
            set $my_uri $uri;
            auth_request /conv-uid;

            echo_exec /jump $my_uri;
        }

Then let's access /baz from the client side:

    $ curl -i localhost:1984/baz
    HTTP/1.1 302 Moved Temporarily
    Server: nginx/0.8.41 (without pool)
    Date: Tue, 24 Aug 2010 03:28:42 GMT
    Content-Type: text/html
    Content-Length: 176
    Location: http://localhost:1984/foo/bar
    Connection: keep-alive

    <html>
    <head><title>302 Found</title></head>
    <body bgcolor="white">
    <center><h1>302 Found</h1></center>
    <hr><center>nginx/0.8.41 (without pool)</center>
    </body>
    </html>

We can optimize this further by introducing ngx_srcache, ngx_memc and
memcached clusters with connection pooling to cache our mysql result
sets. But I'd keep this sample config minimal for now :) If anybody is
interested, just follow up and ask here :)

Note that 0.8.42+ won't work for this config, and the request will
hang for reasons that I don't know yet. We do not have the resource to
track all the aggressive changes in recent nginx core and that's no
fun at all.

Cheers,
-agentzh
Posted by Michael Shadle (Guest)
on 2010-08-24 05:41
(Received via mailing list)
You are a crazy, crazy guy agentzh! :)

On Mon, Aug 23, 2010 at 8:32 PM, agentzh <agentzh@gmail.com> wrote:

>> Magento utilizes rewrite rules heavily but does the majority of the
>> rewrite within the Front controller. The problem is that it is VERY
>> in-efficient and a store with more than 10,000 products is severely
>> delay. But, all information is referenced in a table 'core_url_rewrite'
>> so my though it so do the lookup first in Nginx and hand off to Magento
>> with the non-seo url.

You could also whip up a quick little PHP script to do the
pre-processing for you, not bootstrapping Magento, but simply db
querying, and add memcache or apc or another caching mechanism,
without having to go this deep into development modules and having to
compile a bunch of external modules to get this to work :)
Posted by agentzh (Guest)
on 2010-08-24 05:42
(Received via mailing list)
On Tue, Aug 24, 2010 at 11:32 AM, agentzh <agentzh@gmail.com> wrote:
> Here's a sample configuration that has been tested on my laptop.
>
[snip]
>    Connection: keep-alive
>

Just a quick note regarding performance: on my ThinkPad T400 (Core2Duo
T9600) laptop, with only one nginx worker, and both ab and mysqld
running on the same machine, and no memcached and no result set
caching we get:

    Server Software:        nginx/0.8.41
    Server Hostname:        localhost
    Server Port:            1984

    Document Path:          /baz
    Document Length:        176 bytes

    Concurrency Level:      10
    Time taken for tests:   1.812 seconds
    Complete requests:      10000
    Failed requests:        0
    Write errors:           0
    Non-2xx responses:      10000
    Total transferred:      3900000 bytes
    HTML transferred:       1760000 bytes
    Requests per second:    5518.47 [#/sec] (mean)
    Time per request:       1.812 [ms] (mean)
    Time per request:       0.181 [ms] (mean, across all concurrent 
requests)
    Transfer rate:          2101.76 [Kbytes/sec] received

And with ngx_memc + ngx_srcache enabled, we can achieve 6200+ r/s with
the same settings.

Cheers,
-agentzh
Posted by agentzh (Guest)
on 2010-08-24 05:56
(Received via mailing list)
On Tue, Aug 24, 2010 at 11:40 AM, Michael Shadle <mike503@gmail.com> 
wrote:
> You are a crazy, crazy guy agentzh! :)
>

:)

>
> You could also whip up a quick little PHP script to do the
> pre-processing for you, not bootstrapping Magento, but simply db
> querying, and add memcache or apc or another caching mechanism,
> without having to go this deep into development modules and having to
> compile a bunch of external modules to get this to work :)
>

Yup, sure we can do that. Just depend on how fast you want to achieve
and how much resource you want to spend on those services ;)

On my laptop, nginx + php-fpm can only get about 4000 r/s for a hello
world php script and once you start to connect to a database and do
some simple calculations, oh well...

Cheers,
-agentzh
Posted by agentzh (Guest)
on 2010-08-24 06:04
(Received via mailing list)
On Tue, Aug 24, 2010 at 11:32 AM, agentzh <agentzh@gmail.com> wrote:
>
>        location /conv-uid {
>            internal;
>            content_by_lua_file 'html/foo.lua';
>        }
>

Sorry, forgot to paste the contents of my html/foo.lua here, and it's
the essential glue:

    -- foo.lua
    local yajl = require('yajl')
    local seo_uri = ngx.var.my_uri
    local res = ngx.location.capture('/conv-mysql?' .. seo_uri)
    if (res.status ~= ngx.HTTP_OK) then
        ngx.throw_error(res.status)
    end
    res = yajl.to_value(res.body)
    if (not res or not res[1] or not res[1].url) then
        ngx.throw_error(ngx.HTTP_INTERNAL_SERVER_ERROR)
    end
    ngx.var.my_uri = res[1].url;

This piece of Lua code is mostly self-explanatory if you do know a 
little Lua ;)

Tests have shown that enabling LuaJIT 2.0 here won't make things
noticeably faster because we do not have enough Lua for JIT'ing ;)

Have fun!
-agentzh
Posted by btafoya (Guest)
on 2010-08-24 06:04
(Received via mailing list)
That is more along the lines of what I was thinking Mike, but have no
experience doing so with Magento. There is always a first for everything
though!

Thanks!

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,123018#msg-123018
Posted by btafoya (Guest)
on 2010-08-24 06:06
(Received via mailing list)
Very cool! I am going to try some pre-processing in PHP first and see
how that goes. If that does not work, I will look into your suggestion.

Thanks VERY much!

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,123019#msg-123019
Posted by Michael Shadle (Guest)
on 2010-08-24 06:12
(Received via mailing list)
sounds like it's just some db queries. so really you'll intercept it
in nginx, pass it off to php, and then redirect those requests (is it
a 301 redirect?) to the proper URL.

it's a very basic PHP/mysql setup. always start simple, so add in
caching once things work...
Posted by btafoya (Guest)
on 2010-08-24 20:29
(Received via mailing list)
I have the php code working to redirect correctly, but have run into a
obstacle I don't know how to resolve....

The is the incoming URL, or one of them...

http://fcdev.premasolutions.com/alice-in-wonderland-movie-deluxe-alice-adult-costume.html

I have tried to intercept it but just created a loop. The handler I
wrote is index_loader.php.

    location ~ \.html/ { ## Forward
        rewrite ^(.*)/ index_loader.php?source_url=$1 last;
    }

    location @handler { ## Magento uses a common front handler
        rewrite / /index.php;
    }

    location ~ \.php/ { ## Forward paths like /js/index.php/x.js to
relevant handler
        rewrite ^(.*\.php)/ $1 last;
    }

How do I go about redirecting all .html traffic to the new handler
without interfering with the Magento front controller?

Thanks again for everyone's help!

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,123227#msg-123227
Posted by btafoya (Guest)
on 2010-08-24 20:42
(Received via mailing list)
Forgot to show what the target path is now:

http://fcdev.premasolutions.com/catalog/product/view/id/9746/category/743

btafoya Wrote:
-------------------------------------------------------
> loop. The handler I wrote is index_loader.php.
> 
>     location ~ \.php/ { ## Forward paths like
> /js/index.php/x.js to relevant handler
>         rewrite ^(.*\.php)/ $1 last;
>     }
> 
> How do I go about redirecting all .html traffic to
> the new handler without interfering with the
> Magento front controller?
> 
> Thanks again for everyone's help!

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,123234#msg-123234
Posted by Michael Shadle (Guest)
on 2010-08-25 07:09
(Received via mailing list)
email me off list i can try to help - mike503@gmail
Posted by agentzh (Guest)
on 2010-08-25 07:50
(Received via mailing list)
On Tue, Aug 24, 2010 at 11:32 AM, agentzh <agentzh@gmail.com> wrote:
>
> Note that 0.8.42+ won't work for this config, and the request will
> hang for reasons that I don't know yet. We do not have the resource to
> track all the aggressive changes in recent nginx core and that's no
> fun at all.
>

This is a false alarm. My config pasted above should work for nginx
0.8.42+ (as far as nginx 0.8.49 in fact) :)

I was fooled by my local use of ngx_srcache (in subrequests) which was
known to hang due to a "regression" in nginx 0.8.42+ explained below:

    http://forum.nginx.org/read.php?29,103078,103078

Cheers,
-agentzh
Posted by relix (Guest)
on 2010-08-28 14:55
(Received via mailing list)
First of all, this thread contains a lot of information! I'm in the
process of implementing something similar and it has helped me a ton.

I was just wondering, as long as you're using Lua anyway, why not do
away with Drizzle completely, and just make Lua connect to the database
to retrieve the correct URL? Is it because Drizzle will keep one
connection to the database per nginx-instance, so there is less
overhead, compared to Lua which would have to connect and disconnect to
the database for every request?

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,124586#msg-124586
Posted by agentzh (Guest)
on 2010-08-30 05:21
(Received via mailing list)
On Sat, Aug 28, 2010 at 8:54 PM, relix <nginx-forum@nginx.us> wrote:
> First of all, this thread contains a lot of information! I'm in the
> process of implementing something similar and it has helped me a ton.
>
> I was just wondering, as long as you're using Lua anyway, why not do
> away with Drizzle completely, and just make Lua connect to the database
> to retrieve the correct URL?

Lua's database library will block the nginx process which is
unacceptable for real world applications. ngx_drizzle is served as a
non-blocking mysql driver component for Lua here (of cause, it can
also work with ngx_http_js_module and others).

> Is it because Drizzle will keep one
> connection to the database per nginx-instance, so there is less
> overhead,

Actually ngx_drizzle can keep connection pools to mysql backends per
nginx instance. One db connection per process is the suboptimal
PHP/Perl way of doing things and it can waste a lot of database
resources.

> compared to Lua which would have to connect and disconnect to
> the database for every request?
>

MUCH more than just that ;)

Cheers,
-agentzh
Posted by bvidinli (Guest)
on 2010-08-30 12:33
(Received via mailing list)
I am new to nginx,
bug, as far as I know,
nginx is known to be a fast light webserver.

So, having each rewrite to be looked up in a mysql table seems
something very slow idea.
rewrite is something to be done on the fly, as fast as possible. and
putting a mysql layer in this process seems not a good idea.
Posted by bvidinli (Guest)
on 2010-08-30 12:33
(Received via mailing list)
I am new to nginx,
 but, as far as I know,
 nginx is known to be a fast light webserver.

 So, having each rewrite to be looked up in a mysql table seems
 something very slow idea.
 rewrite is something to be done on the fly, as fast as possible. and
 putting a mysql layer in this process seems not a good idea.
Posted by relix (Guest)
on 2010-08-30 15:05
(Received via mailing list)
Ah, thanks, that does make sense!

agentzh Wrote:
-------------------------------------------------------
> > to retrieve the correct URL?
> > connection to the database per nginx-instance,
> 
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://nginx.org/mailman/listinfo/nginx

Posted at Nginx Forum: 
http://forum.nginx.org/read.php?2,122977,125201#msg-125201
Please log in before posting. Registration is free and takes only a minute.
Existing account (Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
No account? Register here.