Proxy pass location inheritance

Hello, we are using NGINX to serve a combination of local and proxied
content coming from both an Apache server (mostly PHP content) and IIS
7.5 (a handful of third party .Net applications). The proxy is working
properly for the pages themselves, but we wanted to set up a separate
location block for the “static” files (js, images, etc.) to use different
caching rules. In theory, each of the static file location blocks
should be serving from the location specified in its parent location
block, but instead ALL image requests are being routed to the root
block.

Server A: Contains the root site and all sorts of images.
Server B: Contains applications in specific folders, and each folder has
local images.

A simplified version of our server block:

upstream server_a {server 10.64.1.10:80;}
upstream server_b {server 10.64.1.20:80;}

server {
    listen 80;
    server_name www.site.edu;
    #some irrelevant proxy, cache, and header code goes here

    #root location
    location / {
        proxy_cache_valid 200 301 302 304 10m; #content changes regularly
        proxy_cache_use_stale error timeout updating;
        expires 60m;
        proxy_pass http://server_a;

        #this is the location for "static" content in the root. It is being
        #called for ALL static files of these types
        location ~* \.(css|js|png|jpe?g|gif)$ {
            proxy_cache_valid 200 301 302 304 30d;
            expires 30d;
            proxy_pass http://server_a;
        }
    }

    #.net locations on second server
    location ~* /(app1|app2|app3|app4) {
        proxy_cache_valid 0s; #no caching in these folders
        proxy_pass http://server_b;

        #location for static content in these folders. This is not working.
        location ~* \.(css|js|png|jpe?g|gif)$ {
            proxy_cache_valid 200 301 302 304 30d;
            expires 30d;
            proxy_pass http://server_b;
        }
    }
}

Three of the four conditions are working properly.
A request for www.site.edu/index.php gets sent to
10.64.1.10:80/index.php
A request for www.site.edu/image1.gif gets sent to
10.64.1.10:80/image1.gif
A request for www.site.edu/app1/default.aspx gets sent to
10.64.1.20:80/app1/default.aspx

But the last condition is not working properly.
A request for www.site.edu/app1/image2.gif should be sent to
10.64.1.20:80/app1/image2.gif.
Instead, it’s being routed to 10.64.1.10:80/app1/image2.gif, which is an
invalid location.

So it appears that the first server location block is catching ALL of
the requests for the static files. Anyone have any idea what I’m doing
wrong?

BH

Hello!

On Thu, Feb 13, 2014 at 06:43:08PM +0000, Brian Hill wrote:

> Hello, we are using NGINX to serve a combination of local and
> proxied content coming from both an Apache server (mostly PHP
> content) and IIS 7.5 (a handful of third party .Net
> applications). The proxy is working properly for the pages
> themselves, but we wanted to set up a separate location block for
> the “static” files (js, images, etc) to use different caching
> rules. In theory, each of the static file location blocks should
> be serving from the location specified in its parent location
> block, but instead ALL image requests are being routed to the
> root block.
>
> […]
>
> ALL of the requests for the static files. Anyone have any idea
> what I’m doing wrong?

Simplified:

location / {
    location ~ regex1 {
        # regex inside /
    }
}

location ~ regex2 {
    # regex
}

The question is: where will a request matching both regex1 and
regex2 be handled?

The answer is: in “location ~ regex1”. Locations given by
regular expressions within a matching prefix location are tested
before other locations given by regular expressions.


Maxim D.
http://nginx.org/

Close, it’s more akin to:

location / {
    location ~ regex1 {
        # regex inside /
    }
}

location ~ regex2 {
    location ~ regex3 {
        # regex inside regex2
    }
}

And the question is: where will a request matching both regex1 and
regex3 be handled?

Regex 1 & 3 look for the same file types and are identical, but contain
different configurations based on the parent location. Currently, regex1
is catching all matches, irrespective of the parent location.

If I understand correctly, I could solve my problem by moving the regex2
location block before the / location block, and then rewriting regex3 so
that it included the elements of both the current regex2 and regex3.
That way, regex3 would ONLY hit for items that matched both the current
regex2 and regex3, and it would appear before regex1 in the order of
execution.

Is this correct, or will NGINX always give priority to the / location?


From: [email protected] [[email protected]] on behalf of
Maxim D. [[email protected]]
Sent: Friday, February 14, 2014 4:19 AM
To: [email protected]
Subject: Re: Proxy pass location inheritance

[…]


nginx mailing list
[email protected]
http://mailman.nginx.org/mailman/listinfo/nginx

Hello!

On Mon, Feb 17, 2014 at 08:55:02AM +0000, Brian Hill wrote:

>         # regex inside regex2
>     }
> }
>
> And the question is: where will a request matching both regex1
> and regex3 be handled?

Much like in the previous case, regex1 is checked first because
it’s inside the matched prefix location. Matching stops as soon as
a regex matches the request.

> Regex 1 & 3 look for the same file types and are identical, but
> contain different configurations based on the parent location.
> Currently, regex1 is catching all matches, irrespective of the
> parent location.

That’s expected behaviour.

> If I understand correctly, I could solve my problem by moving
> the regex2 location block before the / location block, and then
> rewriting regex3 so that it included the elements of both the
> current regex2 and regex3. That way, regex3 would ONLY hit for
> items that matched both the current regex2 and regex3, and it
> would appear before regex1 in the order of execution.
>
> Is this correct, or will NGINX always give priority to the /
> location?

No. There is no difference between

location / {  location ~ regex1 { ... } }
location ~ regex2 { ... }

and

location ~ regex2 { ... }
location / {  location ~ regex1 { ... } }

Locations given by regular expressions within a matching prefix
location (not necessarily “/”) are always checked first.


Maxim D.
http://nginx.org/

On Mon, Feb 17, 2014 at 08:55:02AM +0000, Brian Hill wrote:

Hi there,

> Regex 1 & 3 look for the same file types and are identical, but contain
> different configurations based on the parent location. Currently, regex1 is
> catching all matches, irrespective of the parent location.
>
> If I understand correctly, I could solve my problem by moving the regex2
> location block before the / location block, and then rewriting regex3 so that it
> included the elements of both the current regex2 and regex3. That way, regex3
> would ONLY hit for items that matched both the current regex2 and regex3, and it
> would appear before regex1 in the order of execution.
>
> Is this correct, or will NGINX always give priority to the / location?

Replace the directives inside the regex1 and regex3 locations with
things like

    return 200 "Inside regex1\n";

and you should be able to test it straightforwardly enough.
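
As a rough illustration of that kind of test, here is a minimal sketch that mirrors the structure under discussion. The port 8080 is an assumption, and the literal regexes are borrowed from the first message purely for illustration; this is not a tested configuration:

```nginx
# Hypothetical throwaway test server; port and regexes are illustrative only.
server {
    listen 8080;

    location / {
        return 200 "Inside /\n";

        # regex1: nested inside the prefix location "/"
        location ~* \.(css|js|png|jpe?g|gif)$ {
            return 200 "Inside regex1\n";
        }
    }

    # regex2: regex location at server level
    location ~* /(app1|app2|app3|app4) {
        return 200 "Inside regex2\n";

        # regex3: nested inside regex2
        location ~* \.(css|js|png|jpe?g|gif)$ {
            return 200 "Inside regex3\n";
        }
    }
}
```

Requesting a URL such as http://localhost:8080/app1/image2.gif with curl shows which location handled it. If the rule quoted below holds, the response body should be "Inside regex1" rather than "Inside regex3".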

Alternatively, the mail you are replying to includes the words

“”"
Locations given by
regular expressions within a matching prefix location are tested
before other locations given by regular expressions.
“”"

If that’s not clear, or if you want to test whether it matches what you
observe, a similar “return” configuration should work too.

(I’d say that your suggestion won’t work as you want it to, because “/”
is still the best-match prefix location, and therefore regex matches
within “/” will be tested before regex matches outside of that location.
You’ll be happier if you limit yourself to prefix matches at server
level.)

Good luck with it,

f

Francis D. [email protected]

So it sounds like my only solution is to restructure the locations to
avoid the original match in /. I don’t have access to the servers again
until tomorrow, but I’m wondering if something like this would work:

location / {
    #base content
}

location ~ regex2 {
    #alternate folders to proxy_pass from .Net servers
}

location ~ regex3 {
    #catch all css, js, images, and other static files

    location ~ regex4 {
        #same as regex2. Alternate static location for .Net apps
    }
    location / {
        #match all “static files” not caught by regex4
    }
}

If I’m understanding location precedence correctly, the regex3 location
should always hit first, because its regex will contain an exact match
for the file types. The nested regex4 (identical to regex2) will then
match the folder name in that request, so the custom configuration can
be applied only to the regex3 file types contained within the regex4
folders. Requests for the regex3 file types at locations not matching
regex4 will be handled by the nested /.

Will this work, or will the second nested / location break things?


From: [email protected] [[email protected]] on behalf of
Maxim D. [[email protected]]
Sent: Monday, February 17, 2014 5:13 AM
To: [email protected]
Subject: Re: Proxy pass location inheritance

[…]

Hello!

On Mon, Feb 17, 2014 at 05:15:45PM +0000, Brian Hill wrote:

> location ~ regex3 {
> #catch all css, js, images, and other static files
>
>       location ~ regex4 {
>                 #same as regex2. Alternate static location for .Net apps
>       }
>       location / {
>                  #match all "static files" not caught by regex4
>       }
>
> }

This is certainly not how configs should be written, and this
won’t work, as regex4 will never match (and the nested / will
trigger a complaint during configuration parsing, but it doesn’t
make sense at all anyway).

> If I’m understanding location precedence correctly, the regex3
> location should always hit first, because its regex will contain
> an exact match for the file types. The nested regex4 (identical
> to regex2) will then match the folder name in that request, so
> the custom configuration can be applied only to the regex3 file
> types contained within the regex4 folders. Requests for the
> regex3 file types at locations not matching regex4 will be
> handled by the nested /.
>
> Will this work, or will the second nested / location break things?

Try reading the ngx_http_core_module documentation again, and
experimenting with trivial configs to see how it works.

Try to avoid using regular expressions by all means, at least until
you understand how they work. It’s very easy to do things wrong
using regular expressions.


Maxim D.
http://nginx.org/

So there is no precedence given to nested regex locations at all? What
value does nesting provide then?

This seems like it should be a fairly simple thing to do. Image/CSS
requests to some folders get handled one way, and image/CSS requests to
all other folders get handled another way. This is an experimental pilot
project for a datacenter conversion, and the use of regex to specify
both the file types and folder names is mandatory. The project this
pilot is for will eventually require more than 50 server blocks with
hundreds of locations in each block if regex cannot be used. It would be
an unmaintainable mess without regex.

Am I missing something here? Is NGINX the wrong solution for what I’m
trying to accomplish? Is there another way to pull this off entirely
within NGINX, or should I be using NGINX in conjunction with something
like HAProxy to route those particular folders where they need to go
(i.e., catch and proxy the .Net folder requests in HAProxy, and pass
everything else along to NGINX?) I was hoping to avoid the use of
HAProxy and handle everything directly within NGINX for the sake of
simplicity, but it’s sounding like that may not be an option.


From: [email protected] [[email protected]] on behalf of
Maxim D. [[email protected]]
Sent: Monday, February 17, 2014 9:30 AM
To: [email protected]
Subject: Re: Proxy pass location inheritance

[…]

Are there any performance implications associated with having a large
number of static prefix locations? We really are looking at having
hundreds of location blocks per server if we use static prefixes, and my
primary concern up until now has been maintainability. If I eliminate
maintainability as a concern, the next question that comes up is
performance. How much of a performance hit (if any) will I take if my
config files have 150 or 250 locations per server block, instead of the
5 or 10 that I’ve limited myself to until now? Will the increased
parsing cause any major performance problems?

As I was looking over my config files, it occurred to me that it would
be fairly straightforward for me to write a frontend to generate the
server blocks and locations automatically, which would eliminate my
worries over maintainability. If having a large number of location
blocks isn’t going to harm performance, I may just go that route. If I’d
spent the last few days writing a tool to generate the static config
locations instead of wrestling with regex, I’d be done right now.

nginx stores static prefix locations in some kind of binary tree. This
means that lookup is fast enough AND the order of the locations does
not matter at all. The latter makes it possible to create a lot of
easily maintainable locations.

Regex locations are processed in the order of appearance. This is slow
and will become a maintenance nightmare as the configuration eventually
grows. If a configuration has to have regex locations, it is better to
isolate them inside static prefix locations.
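
A small sketch of that difference, using made-up paths and a hypothetical test port (none of these names come from the thread):

```nginx
# Hypothetical test server; paths and port are illustrative only.
server {
    listen 8080;

    # Prefix locations: the longest matching prefix is chosen,
    # so these three blocks can be written in any order.
    location /app1/       { return 200 "app1\n"; }
    location /app1/admin/ { return 200 "app1 admin\n"; }
    location /            { return 200 "root\n"; }

    # Regex locations: tried in the order they appear in the file,
    # and the first one that matches is used.
    location ~ \.gif$ { return 200 "gif\n"; }
    location ~ ^/img/ { return 200 "img\n"; }
}
```

A request for /app1/admin/x should be answered by the longer prefix however the three prefix blocks are ordered, while swapping the two regex blocks should change which one answers /img/a.gif.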


Igor S.

Hello!

On Mon, Feb 17, 2014 at 09:26:56PM +0000, Brian Hill wrote:

> So there is no precedence given to nested regex locations at
> all? What value does nesting provide then?

Nesting is to do things like this:

location / {
    # some generic stuff here

    location ~ \.jpg$ {
        expires 1w;
    }
}

location /app1/ {
    # something special for app1 here, e.g.
    # access control
    auth_basic …
    access …

    location = /app1/login {
        # something special for /app1/login,
        # everything from /app1/ is inherited

        proxy_pass ...
    }

    location ~ \.jpg$ {
        expires 1m;
    }
}

location /app2/ {
    # separate configuration for app2 here,
    # changes in /app1/ don’t affect it

    ...

    location ~ \.jpg$ {
        expires 1y;
    }
}

That is, it allows you to write scalable configurations using prefix
locations. With such an approach, you can edit anything under /app1/
without being concerned about how it will affect things for /app2/.

It also lets you use inheritance to write shorter configurations,
and to isolate regexp locations within prefix ones.
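
Applied to the setup from the first message, that approach might look roughly like the sketch below. It assumes the application folders are literally /app1/ through /app4/ and reuses the upstreams and cache settings from the original post. Only /app1/ is written out, and none of this has been tested against the real servers:

```nginx
upstream server_a { server 10.64.1.10:80; }
upstream server_b { server 10.64.1.20:80; }

server {
    listen 80;
    server_name www.site.edu;

    # Everything not claimed by a longer prefix goes to server A.
    location / {
        proxy_cache_valid 200 301 302 304 10m;
        expires 60m;
        proxy_pass http://server_a;

        # Static files outside the app folders.
        location ~* \.(css|js|png|jpe?g|gif)$ {
            proxy_cache_valid 200 301 302 304 30d;
            expires 30d;
            proxy_pass http://server_a;
        }
    }

    # Prefix location for one .Net application on server B.
    location /app1/ {
        proxy_cache_valid 0s;
        proxy_pass http://server_b;

        # Static files under /app1/ only; this regex is tested
        # only when /app1/ is the best-matching prefix.
        location ~* \.(css|js|png|jpe?g|gif)$ {
            proxy_cache_valid 200 301 302 304 30d;
            expires 30d;
            proxy_pass http://server_b;
        }
    }

    # /app2/, /app3/ and /app4/ would repeat the /app1/ block.
}
```

proxy_pass is repeated in each nested location because it is not inherited from the enclosing location. With this layout a request for /app1/image2.gif selects /app1/ as the longest prefix, so only the regex nested there is consulted.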

> This seems like it should be a fairly simple thing to do.
> Image/CSS requests to some folders get handled one way, and
> image/CSS requests to all other folders get handled another way.

See above for an example.

(I personally recommend using a separate folder for images/css to be
able to use prefix locations instead of regexp ones. But it
should be relatively safe this way as well, as long as they are
isolated in other locations. One of the big problems with regexp
locations is that often they are matched when people don’t expect
them to be matched, and isolating regexp locations within prefix
ones minimizes this negative impact.)

> This is an experimental pilot project for a datacenter
> conversion, and the use of regex to specify both the file types
> and folder names is mandatory. The project this pilot is for
> will eventually require more than 50 server blocks with hundreds
> of locations in each block if regex cannot be used. It would be
> an unmaintainable mess without regex.

Your problem is that you are trying to mix regex locations
and prefix locations without understanding how they work, and to
make things even harder you add nested locations to the mix.

Instead, just stop making things harder. Simplify things.

The most recommended simplification is to avoid regexp locations. Note
that having many location blocks isn’t necessarily a bad thing. Sometimes
it’s much easier to handle hundreds of prefix location blocks than
dozens of regexp locations. Configurations with prefix locations
are much easier to maintain.

If you can’t avoid regexp locations for some external reason, it
would be trivial to write a configuration which does what you want
with regexp locations as well:

location / {
    ...
}

location ~ ^/app1/ {
    ...
    location ~ \.jpg$ {
        expires 1m;
    }
}

location ~ ^/app2/ {
    ...
    location ~ \.jpg$ {
        expires 1y;
    }
}

location ~ \.jpg$ {
    expires 1w;
}

Though such configurations are usually much harder to maintain
than ones based on prefix locations.


Maxim D.
http://nginx.org/