How nginx's "location if" works (Was Re: Setting cache parameters via if directives)

On Tue, Feb 1, 2011 at 11:45 PM, Ryan M. wrote:

It does in fact work in production on nginx 0.7.6x. Below is my actual
configuration (trimmed to the essentials and with a few substitutions
of actual URIs).

Well, the ngx_proxy module’s directive inheritance is in action here,
which gives you the nice side effects that you want :)

I’ll analyze some examples here so that people may get some insight.

[Case 1]

location /proxy {
    set $a 32;
    if ($a = 32) {
        set $a 56;
    }
    set $a 76;
    proxy_pass http://127.0.0.1:$server_port/$a;
}

location ~ /(\d+) {
    echo $1;
}

Calling /proxy gives 76 because it works in the following steps:

  1. Nginx runs all the rewrite phase directives in the order that
    they’re in the config file, i.e.,

     set $a 32;
     if ($a = 32) {
         set $a 56;
     }
     set $a 76;
    

    and $a gets the final value of 76.

  2. Nginx traps into the “if” inner block because its condition $a = 32
    was met in step 1.

  3. The inner block does not have any content handler of its own, so it
    inherits the content handler (that of ngx_proxy) from the outer scope
    (see src/http/modules/ngx_http_proxy_module.c:2025).

  4. The config specified by proxy_pass also gets inherited by the
    inner “if” block (see src/http/modules/ngx_http_proxy_module.c:2015).

  5. Request terminates (and the control flow never goes outside of the
    “if” block).

That is, the proxy_pass directive in the outer scope will never run in
this example. It is the inner “if” block that actually serves you.

Let’s see what happens when we override the inner “if” block’s content
handler with our own:

[Case 2]

location /proxy {
    set $a 32;
    if ($a = 32) {
        set $a 56;
        echo "a = $a";
    }
    set $a 76;
    proxy_pass http://127.0.0.1:$server_port/$a;
}
location ~ /(\d+) {
    echo $1;
}

You will get this while accessing /proxy:

a = 76

Looks counter-intuitive? Oh, well, let’s see what’s happening this time:

  1. Nginx runs all the rewrite phase directives in the order that
    they’re in the config file, i.e.,

     set $a 32;
     if ($a = 32) {
         set $a 56;
     }
     set $a 76;
    

    and $a gets the final value of 76.

  2. Nginx traps into the “if” inner block because its condition $a = 32
    was met in step 1.

  3. The inner block does have a content handler, specified by “echo”,
    so the value of $a (76) gets emitted to the client side.

  4. Request terminates (and the control flow never goes outside of the
    “if” block), as in Case 1.

We do have a choice to make Case 2 work as we like:

[Case 3]

location /proxy {
    set $a 32;
    if ($a = 32) {
        set $a 56;
        break;

        echo "a = $a";
    }
    set $a 76;
    proxy_pass http://127.0.0.1:$server_port/$a;
}
location ~ /(\d+) {
    echo $1;
}

This time, we just add a “break” directive inside the if block. This
will stop nginx from running the rest of the ngx_rewrite directives. So
we get

a = 56

So this time, nginx works this way:

  1. Nginx runs all the rewrite phase directives in the order that
    they’re in the config file, i.e.,

     set $a 32;
     if ($a = 32) {
         set $a 56;
         break;
     }
    

    and $a gets the final value of 56.

  2. Nginx traps into the “if” inner block because its condition $a = 32
    was met in step 1.

  3. The inner block does have a content handler, specified by “echo”,
    so the value of $a (56) gets emitted to the client side.

  4. Request terminates (and the control flow never goes outside of the
    “if” block), just as in Case 1.

Okay, you see how the ngx_proxy module’s config inheritance among nested
locations plays the key role here, and makes you believe it works the
way that you want. But other modules (like “echo” mentioned in one of
my earlier emails) may not inherit content handlers in nested
locations (in fact, most content handler modules, including upstream
ones, don’t).

And one must be careful about bad side effects of config inheritance
in “if” blocks in other cases. Consider the following example:

[Case 5]

location /proxy {
    set $a 32;
    if ($a = 32) {
        return 404;
    }
    set $a 76;
    proxy_pass http://127.0.0.1:$server_port/$a;
    more_set_headers "X-Foo: $a";
}
location ~ /(\d+) {
    echo $1;
}

Here, ngx_header_more’s “more_set_headers” will also be inherited by
the implicit location created by the “if” block. So you will get:

curl localhost/proxy
HTTP/1.1 404 Not Found
Server: nginx/0.8.54 (without pool)
Date: Mon, 14 Feb 2011 05:24:00 GMT
Content-Type: text/html
Content-Length: 184
Connection: keep-alive
X-Foo: 32

which may or may not be what you want :)

BTW, the “add_header” directive will not emit a “X-Foo” header in this
case, and it does not mean no directive inheritance happens here, but
add_header’s header filter will skip 404 responses.
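
To see the difference, one could swap more_set_headers for the standard add_header directive in the Case 5 config. This is only a sketch of that variation (the /proxy-add-header location name is made up for illustration). As described above, the 404 response should come back without the X-Foo header, even though the directive is still inherited by the “if” block:

```nginx
location /proxy-add-header {
    set $a 32;
    if ($a = 32) {
        # "return" stops rewrite processing here, so $a stays 32
        return 404;
    }
    set $a 76;
    proxy_pass http://127.0.0.1:$server_port/$a;

    # inherited by the "if" block, but add_header's header filter
    # skips 404 responses, so no X-Foo header is expected
    add_header X-Foo $a;
}
```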

You see how tricky it is behind the scenes! No wonder people keep
saying “nginx’s if is evil”.

Cheers,
-agentzh

Disclaimer: There may be other corner cases that I’ve missed here, and
other more knowledgeable people can correct me wherever I’m wrong :)

On 14 Feb 2011 05h27 WET, agentzh wrote:

Disclaimer: There may be other corner cases that I’ve missed here,
and other more knowledgeable people can correct me wherever I’m
wrong :)

Thank you for elaborating on this issue further. Let’s see if I can
summarize my understanding, right now.

  1. Inheritance in if blocks (which are in fact implicit locations) can
    happen only for a few modules, like proxy, fastcgi and your
    headers_more. Inheritance is the exception and not the rule.
    Most modules don’t provide inheritance of content phase handlers.

  2. As soon as there’s a matching location, the rewrite phase starts
    and runs. As long as all conditions on the ifs are true the rewrite
    phase directives are always processed, no matter what they are:
    rewrites, variable assignments, breaks or returns.

  3. This differs only in the content phase. There it can happen that
    inner if blocks inherit content handlers from outer blocks.

  4. If there’s a content handler inside the if block, that’s the one
    that’s used. It may happen that the module that provides this
    handler can inherit content handlers from the outer blocks.

Analyzing your examples:

Case 1: There’s no content handler inside the if block, but the proxy
module provides inheritance hence we get the value 76.

Case 2: There’s a content handler inside the if block. The echo module
doesn’t provide inheritance of directives in outer
blocks. Hence the value is 76 not because of the proxy_pass
directive but because of the echo “a = $a” directive.

Case 3: The if block has a break that interrupts the control flow. So
the set $a 76 directive is never processed. We get 56 because
of the content handler provided by the echo module inside the
if block.

Case 4: (You named it 5, but it’s a typo)

    The return directive inside the if block provides a special
    response: 404. The output filter that the headers_more module
    provides is inherited by the if and gets sent in the reply.

Now I see why, on the “If is Evil… when used in location context” page,
the only safe directives are return and rewrite with a last flag. Either
one of them “breaks” the control flow in the rewrite phase. Therefore
they are not susceptible to inheritance issues.

Is this correct?

Thanks,
— appa

Hi,

On 15/02/2011 00:22, António P. P. Almeida wrote:

On 14 Feb 2011 05h27 WET, agentzh wrote:

Disclaimer: There may be other corner cases that I’ve missed here,
and other more knowledgeable people can correct me wherever I’m
wrong :)

agentzh - well done on bringing this topic up - I’m sure doing so will
help enlighten many who aren’t so familiar with Nginx internals.

Thank you for elaborating on this issue further. Let’s see if I can
summarize my understanding, right now.

  1. Inheritance in if blocks (which are in fact implicit locations) can
    happen only for a few modules, like proxy, fastcgi and your
    headers_more. Inheritance is the exception and not the rule.
    Most modules don’t provide inheritance of content phase handlers.

I would say that inheritance of configuration options is generally the
rule rather than the exception, but there are a lot of configuration
options that just aren’t permitted inside if blocks. However,
inheritance of the content handler (which is what you’re really talking
about) is a subset of directive inheritance in general.

Most, though not all, modules that generate content (‘upstream’ modules
like fastcgi, proxy etc can be considered as ‘generating’ content) use
the same content handler system. The zip module is another one. (For
anyone familiar with C but not with Nginx internals: each location has a
core location configuration on which a pointer to the content handler is
saved, and this pointer is copied to the inherited if blocks unless
another content handler directive is set inside the if block).

Modules that deal with checking authentication (e.g. Maxim’s
auth_request module) or filtering the response (e.g. the SSI module)
don’t inherit the content handler, but their configuration options may
be (and usually are) inherited into the if block. The only time they are
not merged is when there is no merging defined for the value of a
particular directive (but in most cases there is, for all types of
configuration).

  2. As soon as there’s a matching location, the rewrite phase starts
    and runs.

This depends on what you mean by ‘matching’. There are rules
governing which location will be matched - see
http://wiki.nginx.org/HttpCoreModule#location for details. Which
location is matched may not be the first location listed that matches
the URL. Matching for nested locations is handled before rewrite-phase
handling.

    As long as all conditions on the ifs are true the rewrite
    phase directives are always processed, no matter what they are:
    rewrites, variable assignments, breaks or returns.

As soon as a ‘break’ or ‘return’ directive is reached in the rewrite
phase, rewrite processing is stopped in the current block (returns will
normally stop all processing, but may not depending on settings for
error_page etc - which is probably a little beyond the scope of
discussion here). If ‘rewrite’ is reached with a ‘last’ or ‘break’ flag,
then processing also stops.
  3. This differs only in the content phase. There it can happen that
    inner if blocks inherit content handlers from outer blocks.

All the scripted rewrite directives (if, set, rewrite, break and return)
are processed during the rewrite phase, which happens before the content
phase. What happens is that if an ‘if’ block is reached, i.e. the
conditions in the if conditional evaluate to true, then the ‘location’
for the content is set to the contents of that if block (unless a
subsequent if block is also true - in which case that block is used
instead and the previously true if block is completely ignored - or the
request reaches a different location, e.g. by a rewrite).

The content is created from the directives that apply in one and only
one ‘location’ - which can be a standard location or an if block.
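
A minimal sketch of the “last true if block wins” behaviour described above, assuming the ngx_echo module is installed (the /two-ifs location is hypothetical). Both conditions are true, so according to the explanation above only the second block’s content handler should run:

```nginx
location /two-ifs {
    set $a 1;

    if ($a = 1) {
        # true, but superseded by the later block
        echo "first";
    }

    if ($a = 1) {
        # also true, and evaluated last: this block becomes the
        # 'location' that generates the content
        echo "second";
    }
}
```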

  4. If there’s a content handler inside the if block, that’s the one
    that’s used. It may happen that the module that provides this
    handler can inherit content handlers from the outer blocks.

All content handlers that use the standard mechanism for setting the
content handler (there may be some that don’t, but I don’t know of any)
are inherited in the same way.

Analyzing your examples:

Case 1: There’s no content handler inside the if block, but the proxy
module provides inheritance hence we get the value 76.

Yes - the value of 76 is echoed by the subrequest.

Case 2: There’s a content handler inside the if block.

Yes.

The echo module
doesn’t provide inheritance of directives in outer
blocks.

This is irrelevant. The ‘echo’ content handler overrides the content
handler from the proxy module in the ‘outer’ location. Inheritance of
content handlers is not handled on a per-module basis, it’s handled in
the core http module.

Hence the value is 76 not because of the proxy_pass
directive but because of the echo “a = $a” directive.

Yes.

Case 3: The if block has a break that interrupts the control flow. So
the set $a 76 directive is never processed.

Yes.

We get 56 because
of the content handler provided by the echo module inside the
if block.

Yes.

Case 4: (You named it 5, but it’s a typo)

     The return directive inside the if block provides a special
     response: 404.

Yes.

The output filter that the headers_more module
provides is inherited by the if and gets sent in the reply.

Yes, and the value of 32 is used and not 76 because the set $a 76 is not
processed because the ‘return’ is reached first.

Now I see why, on the “If is Evil… when used in location context” page,
the only safe directives are return and rewrite with a last flag.

The ‘set’ directive, as well as any set_xxx directives that use the
Nginx Development Kit (NDK) (e.g. those in the set_misc module) are also
fine, since they plug into the same rewrite mechanism.

The main problem is that most of the directives in Nginx are
declarative, and the order in which you place them in the config file
(within the same block) isn’t important. Most of the time these
directives are inherited into sub-locations (real locations or if
blocks), and it’s probably an unintentional mistake by the developer if
they aren’t inherited. The rewrite directives, however, are procedural,
and the order can be (and often is) important. Within a location (that
is any ‘normal’ locations and any ‘if’ blocks inside), all the scripted
rewrite directives (if, set, break, return, rewrite and any set_xxx
directives that use the NDK - including set_by_lua) are processed in the
order they are written - regardless of whether they appear inside an if
block or not (ifs can’t appear inside other ifs). With respect to
parsing rewrite directives, locations and the if blocks located inside
those locations are considered as one (but locations within locations
are not).

Also, it’s important to realise that the ‘if’ conditions and block
branching only happens at the rewrite stage. It is safe to mix ‘set’,
‘set_by_lua’, ‘set_md5’ etc. directives, and they will probably do what
you expect, because they are processed in order - all at the rewrite
stage. If you start mixing ‘set’ with content handlers (e.g.
content_by_lua), then it’s probably a good idea to put all your
rewrite-phase directives before your content-phase directives in your
config files.
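
As a small illustration of rewrite-phase directives running before content-phase ones regardless of how they are interleaved in the file, consider this sketch (it assumes the ngx_echo module; the /test location is made up). Both ‘set’ directives run during the rewrite phase, before either ‘echo’ runs in the content phase, so both lines of output would be expected to show the final value, 56:

```nginx
location /test {
    set $a 32;      # rewrite phase
    echo $a;        # content phase

    set $a 56;      # rewrite phase, runs before any echo
    echo $a;        # content phase
}
```

Writing the two ‘set’ lines first and the two ‘echo’ lines afterwards describes the same behaviour and is easier to read.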

Either one of them “breaks” the control flow in the rewrite phase.

Correct.

Therefore they are not susceptible to inheritance issues.

See above.

Hope that helps,

Marcus.

On Sun, Feb 13, 2011 at 11:27 PM, agentzh wrote:

You see, how tricky it is behind the scene! No wonder people keep
saying “nginx’s if is evil”.

I very much see why “if is evil”, but unfortunately there seems to be
almost no alternative if you want to take some action based on things
in headers or other variables.

So what is a generally safe way to handle control flow where the
conditions are based on something that is not in the URI (making
rewrite useless)?

Note that changing the back-end application to handle things is often
not possible when using commercial web applications behind nginx.

I guess this sort of thing could always be a job for a general-purpose
web server like Apache. Nginx is meant to be simple and lightweight;
perhaps the general design is at odds with complex flow control.


RPM

Ryan M. wrote:

I guess this sort of thing could always be a job for a general-purpose
web server like Apache. Nginx is meant to be simple and lightweight;
perhaps the general design is at odds with complex flow control.

On the contrary, it seems to me that one cannot even begin to
contemplate the possibilities for complex flow control within Nginx when
using Apache.

Posted at Nginx Forum:

Ryan M. wrote:

So what is a generally safe way to handle control flow where the
conditions are based on something that is not in the URI (making
rewrite useless)?

Have a look at this:

May be useful

On Wed, Feb 16, 2011 at 3:00 AM, Ryan M. wrote:

I very much see why “if is evil”, but unfortunately there seems to be
almost no alternative if you want to take some action based on things
in headers or other variables.

So what is a generally safe way to handle control flow where the
conditions are based on something that is not in the URI (making
rewrite useless)?

We’ve been using the ngx_lua module to do such complicated nginx.conf
branching (and also the whole application’s business logic) in Lua.
Lua’s “if” is not evil anyway.
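
As a rough sketch of that approach, the branching can be moved into a single set_by_lua call so that no nginx “if” block is needed. The X-Use-Alt request header and the 56/76 paths are placeholders chosen for illustration, and ngx_lua must be built with NDK for set_by_lua to be available:

```nginx
location /proxy {
    # decide the upstream path in Lua instead of an nginx "if" block
    set_by_lua $a '
        if ngx.var.http_x_use_alt == "1" then
            return "56"
        end
        return "76"
    ';
    proxy_pass http://127.0.0.1:$server_port/$a;
}
```

Because there is no “if” block here, there is no implicit nested location and hence no inheritance surprise.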

Note that changing the back-end application to handle things is often
not possible when using commercial web applications behind nginx.

Indeed.

For our own business, the only “back-end application” is the mysql,
tokyotyrant, and memcached clusters (as well as many other such “true
backends”). We do use nginx as the “web application server” (in
contrast with the “web server”), and use Lua as the “application
language” :)

I guess this sort of thing could always be a job for a general-purpose
web server like Apache. Nginx is meant to be simple and lightweight;
perhaps the general design is at odds with complex flow control.

Lua is a lightweight language and even backend daemons like MySQL
Proxy and TokyoTyrant embed it. LuaJIT 2.0 has made it even lighter ;)

Besides, for ngx_lua’s set_by_lua directive, there’s even no Lua
coroutine overhead (though the overhead itself is very small).

BTW, I did not say that you should never use nginx’s if. Don’t get
me wrong. My motivation for writing this explanation of the underlying
mechanism is to help you use it correctly and wisely ;)

I think Igor S. will redesign the whole rewrite module in his
nginx 2.0 devel branch. Then everything will be changed.

Cheers,
-agentzh

On 16.02.2011 05:25, agentzh wrote:

For our own business, the only “back-end application” is the mysql,
tokyotyrant, and memcached clusters (as well as many other such “true
backends”). We do use nginx as the “web application server” (in
contrast with the “web server”), and use Lua as the “application
language” :)

Hello,

That sounds really interesting. Could you please give me a hint on how
you use Lua as an application language? How do you merge, for example, a
headline from mysql/redis into your HTML code? Is it possible to use
something like templates? Or replace a token like ##Headline## in static
HTML pages?

With content_by_lua I can output the data wonderfully, but what about
the HTML code?

Kind regards

Alexander

On Fri, Feb 18, 2011 at 7:54 PM, agentzh wrote:

local data = get_data_from_remote()
local html = generate_html_from_template("/path/to/my/template")

Sorry, the last line should have been

local html = generate_html_from_template(data, "/path/to/my/template")

Cheers,
-agentzh

On Fri, Feb 18, 2011 at 7:19 PM, Alexander K. wrote:

Hello,

That sounds really interesting. Could you please give me a hint on how you
use Lua as an application language? How do you merge, for example, a
headline from mysql/redis into your HTML code?

Our own applications are typical RIAs, that is, from the point of view
of MVC, the View and Controller run completely on the client side,
i.e., in our users’ web browsers, kind of like Gmail :) We’ve been
using a pure-client-side templating system and compiler called
Jemplate [1].

The basic model looks like this:

  1. nginx just serves static files like html, js, css, jpg for the web
    app backbone that loads into the user’s web browser and runs.

  2. The client side JavaScript code issues (cross-site) AJAX requests
    to our web service servers, which run another set of nginx
    instances that rely on ngx_drizzle, ngx_lua, etc. to efficiently
    emit JSON formatted data.

  3. The client side JS code combines the data with the (compiled)
    templates and gets the final HTML fragments that are ready to be put
    into a node in the HTML DOM (or XML for the flash components). Then
    the users see page regions updated.

Maybe we can call it Service-Oriented Applications? Oh well…

And yeah, RIAs hate most search engine crawlers. Some of our module
users are combining ngx_lua with ngx_ctpp2 [2] for efficient server
side templating where ctpp2 [3] is a templating engine written in C++.
But I’ve never tried ngx_ctpp2 myself (yet) and I may roll out my own
implementation of a server-side template engine for Perl’s TT2
templating language in pure C some time in the future ;)

Is it possible to use something like
templates? Or replace a token like ##Headline## in static html pages?

Sure, see above. Also, any templating engines for Lua can be used
directly. The basic steps are:

local data = get_data_from_remote()
local html = generate_html_from_template("/path/to/my/template")
ngx.print(html)  -- emit it!
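
For the ##Headline## token case specifically, here is a minimal sketch using only Lua’s built-in string.gsub. The /page location, the inline template string, and the data table are all made up for illustration; in practice the data would come from a backend and the template from a file:

```nginx
location /page {
    content_by_lua '
        -- hypothetical data; normally fetched from mysql/redis
        local data = { Headline = "Hello from Lua" }

        -- hypothetical template; normally read from a file
        local template = "<h1>##Headline##</h1>"

        -- replace each ##Name## token with data[Name];
        -- unknown tokens are left untouched
        local html = string.gsub(template, "##(%w+)##", function (key)
            return data[key]
        end)

        ngx.print(html)  -- emit it!
    ';
}
```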

With content_by_lua I can output the data wonderfully, but what about the
HTML code?

Why can’t you just regard HTML as a special form of “data”? ;)

Cheers,
-agentzh

[1] Jemplate - JavaScript Templating with Template Toolkit - metacpan.org
[2] svn://vbart.ru/ngx_ctpp2
[3] CTPP - Wikipedia

You could do that, or you could just write the session handling code in
Lua and output the page with the login box conditionally, or the non
logged in version, you don’t have to redirect. More options!

Justin

Thanks so much for the detailed answer.

So I think I am on the right track using HTML + jQuery. The HTML page
loads all my CSS + JS + static data and jQuery loads the content with
AJAX calls. The browser builds the page. Nginx + Lua + Redis and the
whole application flies… Great :)

Time to think a bit differently :) The “old” way is still stuck in my
head. I see there is a function access_by_lua but I still like sessions.
Could this be the way to handle access to all files in the app? Decrypt
the session and encrypt it again to refresh the session lifetime? Or
should I use access_by_lua in each location?

 location /app {

     set_decode_base32 $session $cookie_SID;
     set_decrypt_session $raw $session;
     set_encrypt_session $session $raw;
     set_encode_base32 $session;
     add_header Set-Cookie "SID=$session; path=/";

     if ($raw = '') {
         rewrite (.*) /relogin.htm?url=$1 redirect;
     }

     try_files $uri $uri/;
 }

Have a nice day, agentzh.

Alexander

On Fri, Feb 18, 2011 at 8:29 PM, Alexander K. wrote:

Time to think a bit differently :) The “old” way is still stuck in my head.
I see there is a function access_by_lua but I still like sessions. Could
this be the way to handle access to all files in the app?

Sure :)

Decrypt the session and encrypt it again to refresh the session lifetime?

Yeah :)

Or should I use access_by_lua in each location?

Well, this is a valid option too: you can combine
ngx_encrypted_session and access_by_lua by calling
ngx_encrypted_session’s config directives directly from within Lua,
like this:

access_by_lua '
    local encrypted_text =
        ndk.set_var.set_decode_base32(ngx.var.arg_session)
    if not encrypted_text or encrypted_text == "" then
        return ngx.redirect("/relogin.htm?url=" ..
                            ngx.escape_uri(ngx.var.uri))
    end

    local raw_text = ndk.set_var.set_decrypt_session(encrypted_text)
    if not raw_text or raw_text == "" then
        return ngx.redirect("/relogin.htm?url=" ..
                            ngx.escape_uri(ngx.var.uri))
    end

    -- validate raw_text is indeed valid...

    -- then refresh the sessions:
    local encrypted_text = ndk.set_var.set_encrypt_session(raw_text)
    local value = ndk.set_var.set_encode_base32(encrypted_text)
    ngx.header["Set-Cookie"] = { "SID=" .. value .. "; path=/" }
';

A very useful feature in ngx_lua is the “ndk.set_var.xxx” magic that
allows you to call some other nginx C modules’ config directives
on-the-fly! There’s a restriction though: the 3rd-party directives
must be implemented using NDK (Nginx Devel Kit)'s set_var submodule’s
ndk_set_var_value mechanism ;)

Have a nice day, agentzh.

You too! ;)

Cheers,
-agentzh

Good Morning Agentzh,

thanks so much, again :) I will play around with ndk.set_var. I think
I now have all the parts together for building
a whole webapp with only nginx + Lua and a data store :) Cool.

If it runs, I will give you some feedback.

Have a nice weekend.

Alexander