How to set up Nginx as a truly static-cache reverse proxy

Hi,

I love Nginx, but I have a specific problem: the Nginx cache also depends
on some browser-specific factors.

In one project, we need Nginx to work as a “static webpage mirror”
during occasional outages or scheduled downtime of the primary server.

99% of visitors just browse the website and only 1% work with it actively
(filling in forms, etc.), so a static mirror is an important feature for us.
Cookies can be ignored entirely.

Setup:

  • Example domain: "domain.com"
  • Production public IP: 1.2.3.4
  • Primary production server LAN IP (behind NAT): 10.0.0.1
    (HTTP… only Apache, without Nginx)
  • Secondary server with Nginx, LAN IP (behind NAT): 10.0.0.2
    (set up as a reverse proxy for 10.0.0.1 with the Nginx cache configured)

Normal situation:

  • The public IP is NAT-ed to 10.0.0.1.
  • The secondary server has the hosts record
    "10.0.0.2 domain.com www.domain.com".
  • A crawler job on the secondary server crawls the whole of domain.com
    every day, including images, styles, etc.
  • Nginx on the secondary server is configured to cache all requests for
    48 hours and to ignore all cache-control headers from the primary
    server.

Primary server outage (expected state):

  • On the router, NAT for 1.2.3.4 is changed from the primary server
    (10.0.0.1) to the secondary server (10.0.0.2).
  • The secondary server handles all GET/HEAD requests from its static
    cache (and in this situation is, for GET/HEAD, fully independent of
    whether the primary server is reachable).

What is the problem?

The Nginx cache depends on some other factors, and “proxy_cache_key” is
not as unique an ID as I expected :)

After I crawl the website with Google Chrome, the cache on the secondary
server works great for Google Chrome (all requests are HITs). But when I
access the same domain and the same URL from another browser or client
(iOS, Safari, Firefox, IE, Opera, Wget, Curl, etc.), the Nginx log shows
“MISS” for these requests and Nginx tries to load the URL content from
the primary server (which is down, so it doesn’t work).

So the static website works only partially, and only for browsers that
are similar to the browser/crawler that crawled the website to populate
the Nginx cache.

I found that one of these factors is the “Vary” header, and after
ignoring this header it works better. But there are still some other
factors/headers.

Could you help me with it? :)

I need to set up Nginx to be independent of browser headers and to
write/load the cache based purely on the unique URL and request method.

I know there are factors such as the browser’s ability to handle content
encoding, etc., and Nginx needs to handle them properly.

I just need to make this solution as effective as possible for our
client. For example, it’s OK for this static cache to work without
gzipping.

Thank you for your help!

Jan


Below is my nginx configuration:

proxy_cache_key "$scheme$request_method$host$request_uri ";
proxy_cache_min_uses 1;
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
proxy_cache_revalidate off;
proxy_http_version 1.1;
proxy_next_upstream off;
proxy_cache_lock on;

proxy_cache_path /var/lib/nginx/tmp/cache/domain.com levels=1:2
                 keys_zone=domain_com:32m max_size=15G inactive=2880m
                 loader_files=500 loader_threshold=500;

server {
    listen 10.0.0.2:80;
    access_log /var/log/nginx/domain.com.access.log main buffer=64k;
    error_log /var/log/nginx/domain.com.error.log warn;
    root /usr/share/nginx/html;
    server_name www.domain.com domain.com;

    if ($request_method !~ ^(GET|HEAD)$ ) {
        return 503;
    }

    location / {
        proxy_cache domain_com;
        proxy_pass http://10.0.0.1:80;

        proxy_connect_timeout 3s;
        proxy_read_timeout 3s;
        proxy_send_timeout 3s;

        proxy_cache_valid any 2880m;

        proxy_ignore_headers Set-Cookie X-Accel-Expires Expires Cache-Control Vary;

        proxy_hide_header "Cache-Control";
        proxy_hide_header "Set-Cookie";
        proxy_hide_header "Vary";

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header HTTP_REMOTE_ADDR $remote_addr;
        proxy_set_header REMOTE_ADDR $remote_addr;

        proxy_set_header Accept-Encoding ""; # Deny compression in Apache

        add_header X-Proxy-Cache $upstream_cache_status;
    }
}
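
One way to narrow down which request property still differs between browsers is to log the cache status next to each component of the cache key. The following is only a sketch, not a verified fix: the format name cache_debug and the log file name are made up for illustration, while the variables are standard nginx variables.

```nginx
# Goes in the http {} context, next to proxy_cache_path.
# "cache_debug" is a hypothetical format name.
log_format cache_debug '$remote_addr "$request" status=$status '
                       'cache=$upstream_cache_status '
                       'key_parts="$scheme|$request_method|$host|$request_uri" '
                       'accept_encoding="$http_accept_encoding" '
                       'upstream_vary="$upstream_http_vary" '
                       'ua="$http_user_agent"';

# Inside the existing server {} block (hypothetical file name);
# it can sit alongside the access_log that is already there.
access_log /var/log/nginx/domain.com.cache_debug.log cache_debug;
```

Comparing a HIT line from Chrome with a MISS line from another client for the same URL should show whether, for example, $host (www.domain.com versus domain.com) or $scheme differs between them. It is also worth checking the nginx version: as far as I know, "Vary" is only accepted by proxy_ignore_headers from nginx 1.7.7 onwards.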

I’m having the same issue with cache being browser dependent. I’ve tried
setting up a crawl job using wget --recursive with Firefox and Chrome
headers, but that doesn’t seem to trigger server-side caching either.

If I browse the site using Firefox, then caching works for Firefox, and
Firefox only.

If I browse the site using Chrome, then caching works for Chrome, and
Chrome only.

I’d like cache to be browser agnostic.

Any ideas?

Posted at Nginx Forum:

Can you show us your config, debug logs, or any other information that
would help troubleshoot the issue? The nginx documentation describes how
to set up debug logging.

It is hard to identify problems without a look at the configuration, but
for the sake of general ideas:

For one particular case of static file caching we use nginx’s
proxy_store feature (see the ngx_http_proxy_module documentation): the
files are literally stored as is (in the same tree structure/naming) on
the cache server.

The configuration is very simple:

proxy_ignore_client_abort on; # optional

upstream backend {
    server 10.10.10.10;
}

server {
    root /path;
    error_page 404 = @store;

    location @store {
        internal;
        # must match the upstream name defined above
        proxy_pass  http://backend;
        proxy_store on;
    }
}

Obviously there are some drawbacks (or advantages in some cases) to
caching this way: the expire headers (or any other headers, for that
matter) from the backend (or the client) have no effect on the cache
server; the cache management (purging items / used space) is fully your
responsibility; and 404 (or any other non-200) responses are not cached
(which might or might not be a problem).

rr

Thanks for the responses guys.

I’ve tried proxy_store on one config, but now I’m just receiving
time-outs when I block the origin server. No stale cache on error at all.

Here are two separate configs I’m using. The first one is as described
earlier, with caching and stale-on-error working, although the cache is
browser dependent.

proxy_cache_path /etc/nginx/cache/abc123.org levels=1:2
                 keys_zone=abc123:64m inactive=10d max_size=1000m;

server {
    listen 443 ssl;
    server_name abc123.org;
    access_log /var/log/nginx/abc123.org.access.log;
    error_log /var/log/nginx/abc123.org.error.log;
    include ssl.conf;
    #moved from location
    proxy_cache_key abc123$request_uri;

    location / {
        proxy_cache abc123;
        #proxy_cache_key abc123$request_uri;
        add_header X-Proxy-Cache $upstream_cache_status;
        proxy_pass https://abc123.org;
        proxy_cache_valid 200 720m;
        proxy_cache_valid 301 304 302 720m;
        proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504 http_404;
        expires max;
        add_header Cache-control "public";
        proxy_connect_timeout 3s;
        proxy_read_timeout 3s;
        proxy_send_timeout 3s;
        proxy_cache_revalidate on;
        proxy_cache_min_uses 1;
    }
}

server {
    listen 80;
    server_name abc123.org;
    return 301 https://$server_name$request_uri;
}
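
Compared with the first config in this thread, this one neither ignores the upstream Vary header nor normalizes Accept-Encoding, so those are two things worth trying. Below is a sketch of the location block with those directives added. It is untested against this setup and assumes the origin's cookies and its own cache headers do not need to be honoured.

```nginx
location / {
    proxy_cache     abc123;
    proxy_cache_key abc123$request_uri;
    proxy_pass      https://abc123.org;

    # Cache regardless of what the origin says about cacheability.
    # "Vary" is accepted here only by nginx 1.7.7 and later.
    proxy_ignore_headers Set-Cookie Expires Cache-Control Vary;

    # Always ask the origin for the uncompressed variant, so every
    # client is served from the same cached copy.
    proxy_set_header Accept-Encoding "";

    # Do not pass the origin's Vary or cookies on to clients.
    proxy_hide_header Vary;
    proxy_hide_header Set-Cookie;

    proxy_cache_valid 200 720m;
    proxy_cache_use_stale error timeout invalid_header updating
                          http_500 http_502 http_503 http_504;
    add_header X-Proxy-Cache $upstream_cache_status;
}
```

If the X-Proxy-Cache response header still shows MISS for a second browser after this, the remaining difference is probably somewhere other than Vary and Accept-Encoding.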

Config with proxy_store

proxy_cache_path /etc/nginx/cache/zxf123.org levels=1:2
                 keys_zone=zxf123:64m inactive=10d max_size=1000m;

server {
    listen 443 ssl;
    server_name zxf123.org;
    access_log /var/log/nginx/zxf123.org.access.log;
    error_log /var/log/nginx/zxf123.org.error.log;
    include ssl.conf;
    #moved from location
    proxy_cache_key $host$request_uri;

    location / {
        #proxy_cache zxf123;
        proxy_store on;
        #proxy_cache_key $host$request_uri;
        add_header X-Proxy-Cache $upstream_cache_status;
        proxy_pass https://zxf123.org;
        proxy_cache_valid 200 720m;
        proxy_cache_valid 301 304 302 720m;
        proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504 http_404;
        expires max;
        add_header Cache-control "public";
        proxy_connect_timeout 3s;
        proxy_read_timeout 3s;
        proxy_send_timeout 3s;
        proxy_cache_revalidate on;
        proxy_cache_min_uses 1;
    }
}

server {
    listen 80;
    server_name zxf123.org;
    return 301 https://zxf123.org$request_uri;
}
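
A note on the proxy_store config above: proxy_store only writes upstream responses to disk, and the proxy_cache_* directives (including proxy_cache_use_stale) have no effect while proxy_cache is commented out, which would be consistent with the time-outs described. For the stored copies to be served when the origin is down, nginx has to look on disk before proxying. A minimal sketch of that idea, where the storage root /var/www/zxf123-mirror and the location name @fetch are made up for illustration:

```nginx
server {
    listen 443 ssl;
    server_name zxf123.org;
    include ssl.conf;

    # Hypothetical directory where proxy_store keeps the mirrored files.
    root /var/www/zxf123-mirror;

    location / {
        # Serve the stored copy if present, otherwise go to the origin.
        try_files $uri @fetch;
    }

    location @fetch {
        proxy_pass https://zxf123.org;
        # Save each fetched response under root, using the request path.
        proxy_store on;
        proxy_store_access user:rw group:rw all:r;
        proxy_connect_timeout 3s;
        proxy_read_timeout 3s;
    }
}
```

Requests that do not map to a plain file, such as / or URLs that differ only in their query string, are not covered by this sketch and would need extra handling.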
