Serving stale items from cache

I have a web application where the objects are expensive to generate,
yet each individual object is accessed infrequently (a few times a
week). The dataset in total is about 24 GB.

Can Nginx be configured to cache these items for a short amount of
time (say 20 minutes), but also have a relatively long (e.g. 7 days)
stale setting, so that clients always get a fast response while the
request also triggers an update of the cached item?

proxy_cache_path /var/lib/nginx/cache levels=2:2:2
keys_zone=staticfilecache:512m inactive=7d max_size=25000m;

# serve stale cache when the cache is updating, or there is a timeout or error.
proxy_cache_use_stale timeout updating error invalid_header http_500
http_502 http_503 http_504;
proxy_cache_valid 200 20m;

My understanding is that by setting inactive=7d in proxy_cache_path,
cached items will be kept for 7 days and available to be
served as stale objects. Basically any request that comes in after 20
minutes will get a stale object from the cache for up to 7 days (and
at the same time refresh the cache with the latest version from the
backend).

However, in testing, it appears that stale objects are not returned,
but instead a fresh request is made to my appservers. Here is some
sample data from my logs (note the $upstream_cache_status and
$request_time at the end of each log line):

71.111.3.11 - - [31/Aug/2011:19:44:26 +0000] “GET /?p=3093 HTTP/1.1”
200 3849 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215
Safari/535.1” - EXPIRED 15.789
71.111.3.11 - - [31/Aug/2011:19:44:42 +0000] “GET /?p=3093 HTTP/1.1”
200 3849 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215
Safari/535.1” - HIT 0.000
71.111.3.11 - - [31/Aug/2011:19:51:02 +0000] “GET /?p=3093 HTTP/1.1”
200 3849 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215
Safari/535.1” - HIT 0.000
71.111.3.11 - - [31/Aug/2011:20:01:50 +0000] “GET /?p=3093 HTTP/1.1”
200 3849 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215
Safari/535.1” - HIT 0.000
71.111.3.11 - - [31/Aug/2011:21:03:56 +0000] “GET /?p=3093 HTTP/1.1”
200 3851 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218
Safari/535.1” - EXPIRED 0.725
71.111.3.11 - - [31/Aug/2011:21:14:55 +0000] “GET /?p=3093 HTTP/1.1”
200 3851 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218
Safari/535.1” - HIT 0.000
71.111.3.11 - - [31/Aug/2011:21:54:18 +0000] “GET /?p=3093 HTTP/1.1”
200 3833 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218
Safari/535.1” - EXPIRED 0.856
71.111.3.11 - - [31/Aug/2011:22:01:25 +0000] “GET /?p=3093 HTTP/1.1”
200 3833 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218
Safari/535.1” - HIT 0.000
71.111.3.11 - - [01/Sep/2011:00:02:01 +0000] “GET /?p=3093 HTTP/1.1”
200 3847 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1)
AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.218
Safari/535.1” - MISS 11.019
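
For anyone who wants to produce log lines like the ones above, a
log_format roughly like this would append the two variables. This is
only a sketch: the format name "cache_log" and the log file path are
made up, and it leaves out the extra "-" field that appears before the
cache status in my logs.

```nginx
# Goes in the http context. "cache_log" is a hypothetical format name.
log_format cache_log '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" '
                     # cache result (HIT, MISS, EXPIRED, ...) and total
                     # request time in seconds
                     '$upstream_cache_status $request_time';

# Hypothetical path; point this at wherever the access log should live.
access_log /var/log/nginx/access.log cache_log;
```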

Basically I’m trying to keep my cache warm, without overwhelming my
app servers. Is this doable?

Thanks!
Erik

Erik,

That’s not currently doable with nginx. The proxy_cache_use_stale
directive works when your backend is down.

What you want is more like a triggered update of the content in the
background, and that’s not possible at this time.

Hope this helps

I would love to see this too in Nginx.

The pattern is described here: Two HTTP Caching Extensions

The only software I’ve seen this implemented in is Squid 2.7
(stale-while-revalidate).

From time to time I see backends taking 800ms+ to generate a page at
low traffic. It would be nice to hide most of these slow requests from
the end user.

Posted at Nginx Forum:

proxy_cache_use_stale updating does something similar.

You might be able to do something like this (from the docs):

proxy_cache_bypass $http_my_secret_header;

without a corresponding proxy_no_cache and that should make nginx go to
the origin server but cache the results.
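
In context that might look roughly like the following. Treat it as an
untested sketch: "my_backend" is a placeholder upstream name, the
location is a guess at the setup, and My-Secret-Header is just the
header that the variable from the docs example maps to.

```nginx
location / {
    # "my_backend" is a placeholder for the real app server upstream.
    proxy_pass http://my_backend;

    proxy_cache staticfilecache;
    proxy_cache_valid 200 20m;
    proxy_cache_use_stale timeout updating error invalid_header
                          http_500 http_502 http_503 http_504;

    # A request whose My-Secret-Header is non-empty and not "0" is not
    # answered from the cache; it goes to the origin server instead.
    # With no matching proxy_no_cache, the fresh response can still be
    # saved, replacing the old cached copy.
    proxy_cache_bypass $http_my_secret_header;
}
```

A cron job could then request each URL with that header set (for
example with curl -H 'My-Secret-Header: 1') to refresh items before
they expire, while normal visitors keep getting cache hits.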

–Brian

I ended up using Varnish for this. I set a long TTL for the cache,
and then another long “grace” period for those expired objects. Then
I use a script to loop through all of my URLs and rewarm the cache
every few days.

https://www.varnish-cache.org/trac/wiki/VCLExampleHashAlwaysMiss

Seems to be working.