Slow down, but don't stop, serving pages to bots that don't respect the robots.txt crawl delay

Hi all.

I want to apply nginx’s awesome request rate limiting differently to
different clients, in order to force bots to respect my directives (but
not to block them).

So I created these limit_req_zone entries:

limit_req_zone $binary_remote_addr zone=antiddosspider:1m rate=1r/m;
limit_req_zone $binary_remote_addr zone=antiddosphp:1m rate=1r/s;
limit_req_zone $binary_remote_addr zone=antiddosstatic:1m rate=10r/s;

Now I ask: is it possible to configure something like this?

if ( $http_user_agent ~* (?:bot|spider) ) {
    limit_req zone=antiddosspider burst=1;
}

location / {
    limit_req zone=antiddosphp burst=100;
    proxy_pass http://localhost:8181;
    include /etc/nginx/proxy.conf;
}

I don’t know many of the spiders that are crawling my site, but I don’t
want to lose the chance of being indexed if they are not malware! …But 1
page per minute, please :wink:

Best regards,
Stefano

Posted at Nginx Forum:

With this configuration I get:
Testing nginx configuration: [emerg]: “limit_req” directive is not
allowed here in /etc/nginx/sites-enabled/default:52

I don’t believe you can use limit_req within an if statement. You might
have to reroute the if to a different location and then apply the rule
there:

Example:

if ( $http_user_agent ~* (?:bot|spider) ) {
    error_page 403 = @bots;
    return 403;
}

location @bots {
    limit_req zone=antiddosspider burst=1;
}

GL.

Rami

Thank you, but I want to serve the real content to the bot, not an error
page.

Is this possible?

What does this @ thing mean?

"@location is a named location. Named locations preserve $uri as it was
before entering such location. They were introduced in 0.6.6 and can be
reached only via error_page
(http://wiki.nginx.org/NginxHttpCoreModule#error_page), post_action
(http://wiki.nginx.org/NginxHttpCoreModule#post_action, since 0.6.26)
and try_files (http://wiki.nginx.org/NginxHttpCoreModule#try_files,
since 0.7.27, backported to 0.6.36)."
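
To illustrate the quoted definition, here is a minimal sketch of reaching a named location through try_files. The name @backend is made up for this example, and the backend address is simply the one used earlier in the thread; treat it as an illustration rather than a tested setup.

```nginx
# Hypothetical fragment, meant to sit inside a server block.
location / {
    # Serve a matching file from disk if one exists; otherwise make an
    # internal jump to the named location @backend.
    try_files $uri @backend;
}

location @backend {
    # $uri is still the original request URI here, so the backend sees
    # the same path the client asked for.
    proxy_pass http://localhost:8181;
}
```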

Essentially, you can still serve any content you want using that
syntax; it is just a way to redirect nginx internally without rewriting
the $uri.
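
Putting the pieces of this thread together, a sketch of the whole idea might look like the following. The zone names, backend address and proxy.conf include are copied from the original question; the listen port, the placement of the if inside location /, and the proxy_pass inside @bots are assumptions added for illustration, and none of this has been tested against a particular nginx version.

```nginx
# Sketch only. The limit_req_zone lines belong in the http context.
limit_req_zone $binary_remote_addr zone=antiddosspider:1m rate=1r/m;
limit_req_zone $binary_remote_addr zone=antiddosphp:1m rate=1r/s;

server {
    listen 80;  # assumed; not given in the thread

    location / {
        # error_page is only accepted inside an "if" when that "if"
        # itself sits inside a location.
        if ($http_user_agent ~* (?:bot|spider)) {
            # "=" with no code means the client receives whatever status
            # the @bots handler produces, not the 403 used as a trigger.
            error_page 403 = @bots;
            return 403;
        }

        # Everyone else gets the normal limit.
        limit_req zone=antiddosphp burst=100;
        proxy_pass http://localhost:8181;
        include /etc/nginx/proxy.conf;
    }

    location @bots {
        # Bots get the strict limit but still reach the real backend.
        limit_req zone=antiddosspider burst=1;
        proxy_pass http://localhost:8181;
        include /etc/nginx/proxy.conf;
    }
}
```

Note that limit_req only delays requests up to the burst size; anything beyond that is rejected with an error (503 by default). With burst=1, a bot that keeps hammering the site will therefore see mostly errors rather than slow pages, so a larger burst may be closer to "slow down, but don't stop".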