Rewrite lookup via MySQL table

Any chance I can use a MySQL database table to do a lookup to return the
url to rewrite to?

Let me explain the situation first…

Magento utilizes rewrite rules heavily but does the majority of the
rewriting within the front controller. The problem is that it is VERY
inefficient, and a store with more than 10,000 products is severely
delayed. But all the information is referenced in a table, ‘core_url_rewrite’,
so my thought is to do the lookup first in Nginx and hand off to Magento
with the non-SEO url.

Any idea/thought would be GREATLY appreciated.

Brian Tafoya
Prema Solutions
www.premasolutions.com

Current environment:

Two dedicated servers, one web, one db, both with ridiculous amounts of
memory and high speed drives.

nginx
php-fpm
php 5.3.3
apc


you could cron a dump of the database into nginx rewrite rules, and
then reload nginx…

otherwise, mysql lookups will block, and most nginx modules are
designed with non-blocking in mind
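
something like this, as a sketch (the include path and the two example
rules are made up; cron would regenerate the file from core_url_rewrite
and then reload nginx):

# /etc/nginx/seo_rewrites.conf -- regenerated by cron from core_url_rewrite
rewrite ^/some-seo-page\.html$     /catalog/product/view/id/123   last;
rewrite ^/another-seo-page\.html$  /catalog/product/view/id/456   last;

and in the server block:

server {
    ...
    include /etc/nginx/seo_rewrites.conf;
    ...
}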

Hi,

Any chance I can use a MySQL database table to do a lookup to return the
url to rewrite to?

Yes, you can do this with ngx_drizzle combined with ngx_lua.

agentzh will probably reply with a working example soon enough :wink:

Best regards,
Piotr S. < [email protected] >

On Tue, Aug 24, 2010 at 8:25 AM, Michael S. [email protected]
wrote:

otherwise, mysql lookups will block, and most nginx modules are
designed with non-blocking in mind

mysql lookups do not necessarily block if you do not use libmysql.

Our ngx_drizzle module can talk with mysql 5.1+ without blocking the
nginx process and it’s been heavily used in production. See
https://github.com/openresty/drizzle-nginx-module for more details.

Cheers,
-agentzh

You are a crazy, crazy guy agentzh! :slight_smile:

On Mon, Aug 23, 2010 at 8:32 PM, agentzh [email protected] wrote:

Magento utilizes rewrite rules heavily but does the majority of the
rewriting within the front controller. The problem is that it is VERY
inefficient, and a store with more than 10,000 products is severely
delayed. But all the information is referenced in a table, ‘core_url_rewrite’,
so my thought is to do the lookup first in Nginx and hand off to Magento
with the non-SEO url.

You could also whip up a quick little PHP script to do the
pre-processing for you, not bootstrapping Magento, but simply db
querying, and add memcache or apc or another caching mechanism,
without having to go this deep into development modules and having to
compile a bunch of external modules to get this to work :slight_smile:
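
on the nginx side that could be as simple as pointing the SEO urls at the
lookup script -- just a sketch, url_lookup.php is a made-up name and the
php-fpm address is whatever you already use:

location ~ \.html$ {
    # hand every SEO url to a small standalone PHP script that queries
    # core_url_rewrite (with APC or memcached in front) and redirects
    # to the real Magento path
    include        fastcgi_params;
    fastcgi_param  SCRIPT_FILENAME $document_root/url_lookup.php;
    fastcgi_pass   127.0.0.1:9000;
}

the script can read the original path from REQUEST_URI (which
fastcgi_params passes through), do the lookup, and issue the redirect
itself.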

On Tue, Aug 24, 2010 at 8:22 AM, btafoya [email protected] wrote:

with the non-SEO url.

Here’s a sample configuration that has been tested on my laptop.

Say your SEO URI is /baz and the true URI is /foo/bar. And I have the
following table in my local mysql “test” database:

create table my_url_map(id serial, url text, seo_url text);
insert into my_url_map(url, seo_url) values ('/foo/bar', '/baz');

And I build my nginx 0.8.41 this way:

./configure \
            --add-module=/path/to/ngx_devel_kit \
            --add-module=/path/to/set-misc-nginx-module \
            --add-module=/path/to/ngx_http_auth_request_module-0.2 \
            --add-module=/path/to/echo-nginx-module \
            --add-module=/path/to/lua-nginx-module \
            --add-module=/path/to/drizzle-nginx-module \
            --add-module=/path/to/rds-json-nginx-module

Also, I have lua 5.1.4 and the lua-yajl library installed on my system.

And here’s the central part in my nginx.conf:

upstream backend {
    drizzle_server 127.0.0.1:3306 dbname=test
         password=some_pass user=monty protocol=mysql;
    drizzle_keepalive max=300 mode=single overflow=ignore;
}

lua_package_cpath '/path/to/your/lua/yajl/library/?.so';

server {
      ...

    location /conv-mysql {
        internal;
        set_quote_sql_str $seo_uri $query_string; # to prevent sql injection
        drizzle_query "select url from my_url_map where seo_url=$seo_uri";
        drizzle_pass backend;
        rds_json on;
    }

    location /conv-uid {
        internal;
        content_by_lua_file 'html/foo.lua';
    }

    location /jump {
        internal;
        rewrite ^ $query_string? redirect;
    }

    # your SEO uri
    location /baz {
        set $my_uri $uri;
        auth_request /conv-uid;

        echo_exec /jump $my_uri;
    }
}

Then let’s access /baz from the client side:

$ curl -i localhost:1984/baz
HTTP/1.1 302 Moved Temporarily
Server: nginx/0.8.41 (without pool)
Date: Tue, 24 Aug 2010 03:28:42 GMT
Content-Type: text/html
Content-Length: 176
Location: http://localhost:1984/foo/bar
Connection: keep-alive

<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/0.8.41 (without pool)</center>
</body>
</html>

We can optimize this further by introducing ngx_srcache, ngx_memc and
memcached clusters with connection pooling to cache our mysql result
sets. But I’d keep this sample config minimal for now :slight_smile: If anybody is
interested, just follow up and ask here :slight_smile:
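
Roughly, that caching layer just wraps the /conv-mysql location with the
ngx_srcache and ngx_memc directives, something like this (untested in
this exact config; the memcached address and the expiration time are
placeholders):

    location /memc {
        internal;
        set $memc_key $query_string;
        set $memc_exptime 300;        # cache mysql result sets for 5 min
        memc_pass 127.0.0.1:11211;    # or an upstream of memcached nodes
    }

    location /conv-mysql {
        internal;
        srcache_fetch GET /memc $query_string;   # try memcached first
        srcache_store PUT /memc $query_string;   # store misses on the way out

        set_quote_sql_str $seo_uri $query_string; # to prevent sql injection
        drizzle_query "select url from my_url_map where seo_url=$seo_uri";
        drizzle_pass backend;
        rds_json on;
    }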

Note that 0.8.42+ won’t work for this config, and the request will
hang for reasons that I don’t know yet. We do not have the resources to
track all the aggressive changes in recent nginx core and that’s no
fun at all.

Cheers,
-agentzh

On Tue, Aug 24, 2010 at 11:40 AM, Michael S. [email protected]
wrote:

You are a crazy, crazy guy agentzh! :slight_smile:

:slight_smile:

You could also whip up a quick little PHP script to do the
pre-processing for you, not bootstrapping Magento, but simply db
querying, and add memcache or apc or another caching mechanism,
without having to go this deep into development modules and having to
compile a bunch of external modules to get this to work :slight_smile:

Yup, sure we can do that. It just depends on how fast you want it to go
and how many resources you want to spend on those services :wink:

On my laptop, nginx + php-fpm can only get about 4000 r/s for a hello
world php script and once you start to connect to a database and do
some simple calculations, oh well…

Cheers,
-agentzh

On Tue, Aug 24, 2010 at 11:32 AM, agentzh [email protected] wrote:

Here’s a sample configuration that has been tested on my laptop.

[snip]

Connection: keep-alive

Just a quick note regarding performance: on my ThinkPad T400 (Core2Duo
T9600) laptop, with only one nginx worker, and both ab and mysqld
running on the same machine, and no memcached and no result set
caching we get:

Server Software:        nginx/0.8.41
Server Hostname:        localhost
Server Port:            1984

Document Path:          /baz
Document Length:        176 bytes

Concurrency Level:      10
Time taken for tests:   1.812 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Non-2xx responses:      10000
Total transferred:      3900000 bytes
HTML transferred:       1760000 bytes
Requests per second:    5518.47 [#/sec] (mean)
Time per request:       1.812 [ms] (mean)
Time per request:       0.181 [ms] (mean, across all concurrent requests)
Transfer rate:          2101.76 [Kbytes/sec] received

And with ngx_memc + ngx_srcache enabled, we can achieve 6200+ r/s with
the same settings.

Cheers,
-agentzh

On Tue, Aug 24, 2010 at 11:32 AM, agentzh [email protected] wrote:

   location /conv-uid {
       internal;
       content_by_lua_file 'html/foo.lua';
   }

Sorry, forgot to paste the contents of my html/foo.lua here, and it’s
the essential glue:

-- foo.lua
local yajl = require('yajl')
local seo_uri = ngx.var.my_uri
local res = ngx.location.capture('/conv-mysql?' .. seo_uri)
if (res.status ~= ngx.HTTP_OK) then
    ngx.throw_error(res.status)
end
res = yajl.to_value(res.body)
if (not res or not res[1] or not res[1].url) then
    ngx.throw_error(ngx.HTTP_INTERNAL_SERVER_ERROR)
end
ngx.var.my_uri = res[1].url;
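
(For reference, the JSON that rds_json produces for that query looks
roughly like [{"url":"/foo/bar"}], i.e. an array of row objects keyed by
column name, which is why res[1].url is the value we pull out.)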

This piece of Lua code is mostly self-explanatory if you do know a
little Lua :wink:

Tests have shown that enabling LuaJIT 2.0 here won’t make things
noticeably faster because we do not have enough Lua for JIT’ing :wink:

Have fun!
-agentzh

That is more along the lines of what I was thinking Mike, but I have no
experience doing so with Magento. There is always a first for everything
though!

Thanks!


sounds like it’s just some db queries. so really you’ll intercept it
in nginx, pass it off to php, and then redirect those requests (is it
a 301 redirect?) to the proper URL.

it’s a very basic PHP/mysql setup. always start simple, so add in
caching once things work…

Very cool! I am going to try some pre-processing in PHP first and see
how that goes. If that does not work, I will look into your suggestion.

Thanks VERY much!


Forgot to show what the target path is now:

http://fcdev.premasolutions.com/catalog/product/view/id/9746/category/743

btafoya Wrote:

loop. The handler I wrote is index_loader.php.

location ~ \.php/ { ## Forward paths like /js/index.php/x.js to relevant handler
    rewrite ^(.*.php)/ $1 last;
}

How do I go about redirecting all .html traffic to
the new handler without interfering with the
Magento front controller?

Thanks again for everyone’s help!


I have the php code working to redirect correctly, but have run into an
obstacle I don’t know how to resolve…

This is the incoming URL, or one of them…

http://fcdev.premasolutions.com/alice-in-wonderland-movie-deluxe-alice-adult-costume.html

I have tried to intercept it but just created a loop. The handler I
wrote is index_loader.php.

location ~ \.html/ { ## Forward
    rewrite ^(.*)/ index_loader.php?source_url=$1 last;
}

location @handler { ## Magento uses a common front handler
    rewrite / /index.php;
}

location ~ \.php/ { ## Forward paths like /js/index.php/x.js to relevant handler
    rewrite ^(.*.php)/ $1 last;
}

How do I go about redirecting all .html traffic to the new handler
without interfering with the Magento front controller?

Thanks again for everyone’s help!
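
One possible shape for this, as a sketch only (it assumes there is
already a normal location ~ \.php$ block passing to php-fpm, which is
not shown in the snippet above):

location ~ \.html$ {
    # send *.html requests straight to the lookup handler; the rewritten
    # URI no longer ends in .html, so it cannot match this block again
    rewrite ^ /index_loader.php?source_url=$uri last;
}

The handler can then look the path up in core_url_rewrite and either 301
to the non-SEO URL or hand the request off to the Magento front
controller.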


email me off list i can try to help - mike503@gmail

On Tue, Aug 24, 2010 at 11:32 AM, agentzh [email protected] wrote:

Note that 0.8.42+ won’t work for this config, and the request will
hang for reasons that I don’t know yet. We do not have the resources to
track all the aggressive changes in recent nginx core and that’s no
fun at all.

This is a false alarm. My config pasted above should work for nginx
0.8.42+ (up to nginx 0.8.49, in fact) :slight_smile:

I was fooled by my local use of ngx_srcache (in subrequests) which was
known to hang due to a “regression” in nginx 0.8.42+ explained below:

http://forum.nginx.org/read.php?29,103078,103078

Cheers,
-agentzh

First of all, this thread contains a lot of information! I’m in the
process of implementing something similar and it has helped me a ton.

I was just wondering, as long as you’re using Lua anyway, why not do
away with Drizzle completely, and just make Lua connect to the database
to retrieve the correct URL? Is it because Drizzle will keep one
connection to the database per nginx-instance, so there is less
overhead, compared to Lua which would have to connect and disconnect to
the database for every request?


On Sat, Aug 28, 2010 at 8:54 PM, relix [email protected] wrote:

First of all, this thread contains a lot of information! I’m in the
process of implementing something similar and it has helped me a ton.

I was just wondering, as long as you’re using Lua anyway, why not do
away with Drizzle completely, and just make Lua connect to the database
to retrieve the correct URL?

Lua’s database library will block the nginx process, which is
unacceptable for real-world applications. ngx_drizzle serves as a
non-blocking mysql driver component for Lua here (of course, it can
also work with ngx_http_js_module and others).

Is it because Drizzle will keep one
connection to the database per nginx-instance, so there is less
overhead,

Actually ngx_drizzle can keep connection pools to mysql backends per
nginx instance. One db connection per process is the suboptimal
PHP/Perl way of doing things and it can waste a lot of database
resources.
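
That pooling is what the drizzle_keepalive line in the earlier sample
config is about, roughly:

upstream backend {
    drizzle_server 127.0.0.1:3306 dbname=test
         password=some_pass user=monty protocol=mysql;

    # keep up to 300 cached connections in a single shared pool; once the
    # pool is full, extra connections simply fall back to short-lived ones
    drizzle_keepalive max=300 mode=single overflow=ignore;
}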

compared to Lua which would have to connect and disconnect to
the database for every request?

MUCH more than just that :wink:

Cheers,
-agentzh

I am new to nginx, but as far as I know, nginx is known to be a fast,
light webserver.

So having every rewrite looked up in a mysql table sounds like a very
slow idea. A rewrite is something to be done on the fly, as fast as
possible, and putting a mysql layer into that process does not seem
like a good idea.

Ah, thanks, that does make sense!
