Proxy_store help requested

Hello. I have two nginx instances - let’s call them upstream and
downstream.
Both are running Ubuntu 13.10 64-bit and nginx 1.4.1. I want to use
proxy_store to mirror some rarely-changing files from upstream to
downstream.

On the downstream server, I have created a /var/www directory owned by
www-data (the user configured to run worker nginx processes). All files
are served out of this directory. The directory (and its sub-dirs) have
755 permissions.

In theory, when I ask for a file from the downstream server, my
understanding is that it should look under /var/www for it; upon not
finding it, get it from upstream and store it locally in downstream;
and then serve the file from downstream on an on-going basis. The
upstream server should only show one access in its access log.

This is not happening. The downstream server keeps complaining that the
file cannot be found locally, and continually fetches the file from
upstream instead. So each access attempt to downstream for that file
generates one “no such file or directory” error in the downstream error
log, and a regular GET in the upstream access log.

If I instead touch a file at the location (as the www-data user) where
nginx wants to find the file locally on the downstream server; do a GET
for that file; and then delete the file, nginx will do the right thing
(i.e., get the file from upstream, store it at that location, and then
serve it). If I skip the GET, nginx continues to not save the file
locally, and keeps getting it each time from upstream.

Any idea what’s going on?

Here’s my downstream server’s config:

upstream download_servers {
    server download.foobar.com;
}

server {
    listen 80;
    server_name www.foobar.com;

    location / {
        root /var/www;
        index index.html;
        proxy_redirect off;
    }

    location /download/ {
        root /var/www/download/fetch/;
        error_page 404 = /fetch$uri;
    }

    location /fetch/ {
        internal;
        proxy_store /var/www/download${uri};
        proxy_http_version 1.1;
        proxy_pass http://download_servers;
        proxy_store_access user:rw group:rw all:r;
    }
}

Posted at Nginx Forum:

OK, I just tracked this down to whether the proxy_pass value refers to a
load-balancing upstream collection (as I’m doing above) vs. a hard-coded
reference to one server. So, if I change the proxy_pass config value to
refer to http://download.foobar.com instead of http://download_servers,
everything works.

Is this a known limitation, or a bug that I should file?

Posted at Nginx Forum:

Hello!

On Sat, Mar 08, 2014 at 06:21:14PM -0500, nginx_newbie_too wrote:

> OK, I just tracked this down to whether the proxy_pass value refers to a
> load-balancing upstream collection (as I’m doing above) vs. a hard-coded
> reference to one server. So, if I change the proxy_pass config value to
> refer to http://download.foobar.com instead of http://download_servers,
> everything works.
>
> Is this a known limitation, or a bug that I should file?

The difference between “download.foobar.com” and “download_servers” is
the name that will be used in requests to the upstream (see the
ngx_http_proxy_module documentation). If your upstream server responds
with an error because an incorrect name was used, nginx will not store
the response.

That is, from the information you provided, what you describe looks
like the result of a misconfiguration.
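
To illustrate the point about names: when proxy_pass refers to an
upstream group, nginx by default sends the group name in the Host
header (the default is "proxy_set_header Host $proxy_host;"). A minimal
sketch that keeps the group but sends the real host name could look
like the following. It assumes download.foobar.com is the name the
upstream server expects, which nothing in this thread confirms.

```nginx
upstream download_servers {
    server download.foobar.com;
}

server {
    listen 80;
    server_name www.foobar.com;

    # Other locations omitted; only the proxying location is shown.
    location /fetch/ {
        internal;
        proxy_pass http://download_servers;

        # By default nginx would send "Host: download_servers"
        # (the value of $proxy_host) to the upstream.
        # Assumption: the upstream vhost answers to download.foobar.com.
        proxy_set_header Host download.foobar.com;

        proxy_http_version 1.1;
        proxy_store /var/www/download${uri};
        proxy_store_access user:rw group:rw all:r;
    }
}
```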


Maxim D.
http://nginx.org/

Maxim, thank you for the prompt response. I am entirely willing to
believe that this is a misconfiguration, but I cannot figure out what
I’ve misconfigured. The upstream server shows no errors in its error
log; its access log shows a 200 for the first GET, and 304’s for
subsequent GETs. The downstream server continues to log errors about no
such file or directory, and the file doesn’t exist in the expected
location. The behavior remains consistent even if I set the Host header
to $host with proxy_set_header on the downstream server.

What should I be looking at to track this down further? I’m happy to
look at (or post here) any logs or headers or configuration.

Posted at Nginx Forum:

More details:

nginx version: nginx/1.4.1 (Ubuntu)
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx
--conf-path=/etc/nginx/nginx.conf
--error-log-path=/var/log/nginx/error.log
--http-client-body-temp-path=/var/lib/nginx/body
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi
--http-log-path=/var/log/nginx/access.log
--http-proxy-temp-path=/var/lib/nginx/proxy
--http-scgi-temp-path=/var/lib/nginx/scgi
--http-uwsgi-temp-path=/var/lib/nginx/uwsgi
--lock-path=/var/lock/nginx.lock
--pid-path=/run/nginx.pid --with-pcre-jit --with-debug
--with-http_addition_module --with-http_dav_module
--with-http_geoip_module
--with-http_gzip_static_module --with-http_image_filter_module
--with-http_realip_module --with-http_stub_status_module
--with-http_ssl_module --with-http_sub_module --with-http_xslt_module
--with-ipv6 --with-mail --with-mail_ssl_module
--add-module=/build/buildd/nginx-1.4.1/debian/modules/nginx-auth-pam
--add-module=/build/buildd/nginx-1.4.1/debian/modules/nginx-dav-ext-module
--add-module=/build/buildd/nginx-1.4.1/debian/modules/nginx-echo
--add-module=/build/buildd/nginx-1.4.1/debian/modules/nginx-upstream-fair
--add-module=/build/buildd/nginx-1.4.1/debian/modules/ngx_http_substitutions_filter_module

Linux 3.11.0-18-generic #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Posted at Nginx Forum:

The 304 response from the upstream server ended up being the culprit.
Once I changed the upstream server to use “if_modified_since off;” so
that it always responds with a 200 and the content, the problem was
resolved. To freshen the mirror, I can then simply remove the mirrored
content from the downstream server; no nginx processes even need to be
restarted.

Maxim, this may be obvious to you, but it wasn’t to me, and no
documentation pointed me in this direction. As a suggestion, a small
note in the proxy_store documentation about the significance of turning
If-Modified-Since (i-m-s) handling off on upstream servers in such
mirroring situations would be helpful.

As always, I’m extremely grateful for your work, and for that of the
others who provide this awesome software.

Posted at Nginx Forum:

Maxim, one last piece of advice requested. Would it be more proper to
turn off i-m-s in the request headers (by setting
proxy_pass_request_headers to off in the downstream server
configuration) instead of turning it off on the upstream server? I
think that’s more correct behavior, but I’m not sure.

Yes, proxy_cache simply works out of the box, and it’s awesome. But I
couldn’t understand how to use it so that the downstream server doesn’t
naively GET content again from the upstream after the expiration time
period has passed. I would have wanted instead to only have the cache
refreshed if i-m-s suggested that the upstream content had changed.

Posted at Nginx Forum:

Hello!

On Sun, Mar 09, 2014 at 10:34:08PM -0400, nginx_newbie_too wrote:

> The 304 response from the upstream server ended up being the culprit. If I
> changed the upstream server to have “if_modified_since off;” and thus always
> respond with a 200 and the content, the problem is resolved. To freshen the
> mirror, I can then simply remove the mirrored content from the downstream
> server; no nginx processes even need to be restarted.

So the actual problem was incorrect testing, not a
misconfiguration. And yes, proxy_store only stores 200 responses
and nothing more, so anything else won’t be stored, including 304.
This is generally good enough, as a 304 doesn’t contain a response
body and hence doesn’t imply much traffic.

> Maxim, this may be obvious to you, but it wasn’t to me, and no documentation
> pointed me in this direction. As a suggestion, a small note about the
> significance of setting i-m-s off on upstream servers in such mirroring
> situations in the documentation about proxy_store would be helpful.

  1. This is not something I would recommend doing, at least not
    in general.

  2. Note that proxy_store is something very basic. It can be used
    to do powerful things, but if you are looking for something “ready
    to use out of the box”, consider using proxy_cache instead.


Maxim D.
http://nginx.org/

Hello!

On Tue, Mar 11, 2014 at 12:46:18PM -0400, nginx_newbie_too wrote:

> Maxim, one last piece of advice requested. Would it be more proper to turn
> off i-m-s in the request headers (by setting proxy_pass_request_headers to off
> in the downstream server configuration) instead of turning it off on the
> upstream server? I think that’s more correct behavior, but I’m not sure.

Something like

proxy_set_header If-Modified-Since "";
proxy_set_header If-None-Match "";

on a frontend should be a good way to disable If-* headers in requests
to an upstream server. I would recommend not touching them at all
though, and just ignoring the 304 responses, which are not stored. The
number of 304 responses returned by the upstream should be small enough.
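
Put together, the /fetch/ location with the conditional headers cleared
might look like the sketch below. Only the two proxy_set_header lines
are new relative to the configuration posted earlier in the thread, and
the recommendation above still stands: leaving the headers alone is the
suggested default.

```nginx
# Fragment of the downstream server block posted earlier in the thread.
location /fetch/ {
    internal;
    proxy_pass http://download_servers;
    proxy_http_version 1.1;

    # An empty value means the header is not passed to the upstream,
    # so the upstream has no reason to answer with 304.
    proxy_set_header If-Modified-Since "";
    proxy_set_header If-None-Match "";

    # Only 200 responses are written to disk by proxy_store.
    proxy_store /var/www/download${uri};
    proxy_store_access user:rw group:rw all:r;
}
```

The trade-off is that every request reaching this location transfers
the full body from the upstream, even when the client already holds a
current copy.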

> Yes, proxy_cache simply works out of the box, and it’s awesome. But I
> couldn’t understand how to use it so that the downstream server doesn’t
> naively GET content again from the upstream after the expiration time period
> had passed. I would have wanted instead to only have the cache refreshed if
> i-m-s suggested that the upstream content had changed.

To make proxy_cache behave more like proxy_store and ignore
response expiration, use proxy_ignore_headers (and
proxy_cache_valid to set cache time):

proxy_ignore_headers Cache-Control Expires;
proxy_cache_valid 200 365d;

Use of If-Modified-Since to revalidate cached data can be
activated with the proxy_cache_revalidate directive.

See here for documentation:

http://nginx.org/r/proxy_set_header
http://nginx.org/r/proxy_ignore_headers
http://nginx.org/r/proxy_cache_valid
http://nginx.org/r/proxy_cache_revalidate
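
The directives above could be combined roughly as follows. This is a
sketch, not a tested configuration: the cache path
/var/cache/nginx/mirror, the zone name "mirror", and the inactive value
are assumptions chosen for illustration, and it assumes the upstream
serves the files under the same /download/ path. Note also that
proxy_cache_revalidate requires nginx 1.5.7 or later, so it is not
available in the 1.4.1 build discussed in this thread.

```nginx
# http context. Path, zone name and sizes are hypothetical.
# inactive is raised because entries not requested within that period
# are removed regardless of proxy_cache_valid (default: 10 minutes).
proxy_cache_path /var/cache/nginx/mirror keys_zone=mirror:10m inactive=365d;

server {
    listen 80;
    server_name www.foobar.com;

    location /download/ {
        proxy_pass http://download.foobar.com;
        proxy_cache mirror;

        # Ignore upstream expiration headers and keep 200s for a year.
        proxy_ignore_headers Cache-Control Expires;
        proxy_cache_valid 200 365d;

        # Revalidate expired entries with If-Modified-Since
        # instead of re-fetching them (nginx 1.5.7+).
        proxy_cache_revalidate on;
    }
}
```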


Maxim D.
http://nginx.org/