Old thread: Cache for non-cookie users and fresh for cookie users

Picking up an old thread for caching

http://nginx.2469901.n2.nabble.com/Help-cache-or-not-by-cookie-td3124462.html

Igor describes the current approach:

“No, currently the single way is:

  1. add the cookie in proxy_cache_key

    proxy_cache_key "http://cacheserver$request_uri $cookie_name";

  2. add “X-Accel-Expires: 0” in response with the cookie.”

But from my understanding of "X-Accel-Expires", it expires the entry in
the cache repository, as documented:

“Sets when to expire the file in the internal Nginx cache, if one is
used.”

Does this not mean that when I set the cookie and pass "X-Accel-Expires:
0", it expires the cache for the non-logged-in user too, for that cache
key? A new cache entry will then have to be created, right?

Should I go with “Cache-Control: max-age=0” approach?

Hello!

On Thu, Feb 09, 2012 at 12:34:33PM +0530, Quintin P. wrote:

proxy_cache_key "http://cacheserver$request_uri $cookie_name";
cache entry will then have to be created, right?
No. X-Accel-Expires will prevent the particular response from
being cached, but won’t delete existing cache entry.

Should I go with “Cache-Control: max-age=0” approach?

The only difference between "X-Accel-Expires: 0" and
"Cache-Control: max-age=0" is that the former won't be passed to
the client.

As for the use-case in general (i.e. only use cache for users
without cookie), in recent versions it is enough to do

proxy_cache_bypass $cookie_name;
proxy_no_cache $cookie_name;

I.e.: don’t respond from cache to users with cookie
(proxy_cache_bypass), don’t store to cache responses for users
with cookie (proxy_no_cache).

Moreover, responses with Set-Cookie won't be cached by default,
either. So basically just placing the above into the config is enough;
no further changes to the backend code are required.
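In context, a minimal sketch of that setup might look like the following (the zone name, cache path, cookie name, and upstream are illustrative, not taken from the thread):

```nginx
# Sketch only: zone name, path, cookie name and upstream are assumed.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=app_cache:10m;

upstream backend {
    server 127.0.0.1:8080;
}

server {
    listen 80;

    location / {
        proxy_cache app_cache;
        proxy_cache_key $scheme$host$request_uri;

        # Users carrying the cookie are never served from the cache...
        proxy_cache_bypass $cookie_name;
        # ...and their responses are never stored in it.
        proxy_no_cache $cookie_name;

        proxy_pass http://backend;
    }
}
```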

Maxim D.

On Thu, Feb 9, 2012 at 2:19 PM, Maxim D. [email protected] wrote:

being cached, but won’t delete existing cache entry.

So what should I do to delete a particular cache entry?

Hi,

So what should I do to delete a particular cache entry?

Valentin already pointed you to the solution 2 days ago:
http://labs.frickle.com/nginx_ngx_cache_purge/

Best regards,
Piotr S. <[email protected]>

On 10 Feb 2012 14h16 WET, [email protected] wrote:

Replying to myself: I just remembered that having an embedded variable
holding the value of fastcgi_cache_key/proxy_cache_key would be
awesome. That way, using for example the embedded Lua module, we
could easily manage the cache from within Nginx without having to
issue additional HTTP requests.

The Nginx cache is simple: just a bunch of files. So IMHO cache
purging should be as simple as possible, without having to issue
additional requests.

I guess this is a feature request :-)

Thx,
— appa

On 10 Feb 2012 08h34 WET, [email protected] wrote:

http://nginx.2469901.n2.nabble.com/Help-cache-or-not-by-cookie-td3124462.html

expires the cache for the non-logged-in user too, for that cache
key? A new cache entry will then have to be created, right?

No. X-Accel-Expires will prevent the particular response from
being cached, but won’t delete existing cache entry.

So what should I do to delete a particular cache entry?

The easiest and fastest way is to delete the file. You need to build a
function in a programming language (PHP, Lua, Ruby) that computes the
key:

In PHP:

$filename = md5('example.com/foobar');

This is for a key like:

 $host$request_uri

Then, depending on the structure of your cache directory, each directory
is named by taking a number of characters from the end of the string.

For the above:

$filename is 536d75ab92a8e8778916a971cb1fb4e0

If the cache dir structure is something like:

proxy_cache_path /var/cache/nginx/microcache levels=1:2
keys_zone=microcache:5M max_size=1G;
This means that the path to the file containing the cached page will
be:

/var/cache/nginx/microcache/0/4e/536d75ab92a8e8778916a971cb1fb4e0

where:

0  is substr($filename, -1, 1)  (the "1" in levels=1:2)
4e is substr($filename, -3, 2)  (the "2" in levels=1:2)
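The layout described above can be computed in a few lines; here is an illustrative Python equivalent (the function name is made up for this sketch):

```python
import hashlib
import os.path

def cache_file_path(base, key, levels=(1, 2)):
    """Return the on-disk path nginx uses for a cached key,
    given a levels=1:2-style cache directory layout."""
    name = hashlib.md5(key.encode()).hexdigest()
    parts = []
    end = len(name)
    # Each level takes its characters from the end of the hash,
    # moving leftwards: levels=1:2 -> last 1 char, then previous 2.
    for width in levels:
        parts.append(name[end - width:end])
        end -= width
    return os.path.join(base, *parts, name)

print(cache_file_path("/var/cache/nginx/microcache", "example.com/foobar"))
```

This reproduces the /&lt;last char&gt;/&lt;previous two chars&gt;/&lt;hash&gt; path structure shown above.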

HTH,
— appa

On 10 Feb 2012 14h45 WET, [email protected] wrote:

Turn caching on for POST method request responses as well


Why all this? The default behavior is for POSTed requests never to be
cached.

http://wiki.nginx.org/HttpProxyModule#proxy_cache_methods

http://wiki.nginx.org/HttpFcgiModule#fastcgi_cache_methods

BTW: Valentin, Maxim, Igor, Andrei, et al

The proxy_cache_methods and fastcgi_cache_methods directives are
missing from the official docs at
http://nginx.org/en/docs/http/ngx_http_proxy_module.html
and http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html
respectively.

— appa

On Feb 10, 2012, at 7:05 PM, António P. P. Almeida wrote:

Why all this? The default behavior is for POSTed requests never to be
cached.

http://wiki.nginx.org/HttpProxyModule#proxy_cache_methods

http://wiki.nginx.org/HttpFcgiModule#fastcgi_cache_methods

BTW: Valentin, Maxim, Igor, Andrei, et al

Right, thanks for spotting this one. We're currently working on syncing
it all up and making the entire thing less confusing.

10 February 2012, 19:06, from António P. P. Almeida [email protected]:


Why all this? The default behavior is for POSTed requests never to be
cached.

That's exactly why all that is necessary: to force each POST method
request response to be cached, and to refresh the cache immediately.

Let's say a blog entry is updated (through a POST method request to
"/blog?action=edit") - then all those clients who are reading the blog
(through GET method requests) will be getting the old, stale version
of “/blog” from the cache until the cache validity expires. With my
approach, on the other hand, the cache is refreshed immediately
on every POST method request (because GET method requests
for “/blog” and POST method requests for “/blog?action=edit” are
cached under the same key), so all clients get the latest version
without the cache validity having to be reduced.

Max

10 February 2012, 12:35, from Quintin P. [email protected]:


So what should I do to delete a particular cache entry?

If you really want to avoid ngx_cache_purge at all costs,
you'll have to use something like this to force every POST
method request to refresh the cache:

proxy_cache zone;

# Responses for "/blog" and "/blog?action=edit" requests
# are cached under the SAME key
proxy_cache_key $scheme$host$uri;

# Turn caching on for POST method request responses as well
proxy_cache_methods GET HEAD POST;

location / {
    recursive_error_pages on;
    error_page 409 = @post_and_refresh_cache;

    # Redirect POST method requests to @post_and_refresh_cache
    if ($request_method = POST) { return 409; }

    # Process GET and HEAD method requests by first checking
    # for a match in the cache, and if that fails, by passing
    # the request to the backend
    proxy_pass $scheme://backend;
}

location @post_and_refresh_cache {
    proxy_cache_bypass "Never check the cache!";

    # Pass the POST method request directly to the backend
    # and store the response in the cache
    proxy_pass $scheme://backend;
}

This generic approach is based on the assumptions that the
content on the backend is posted/modified through the same
frontend that is proxying and caching it, and that you know
how to prevent session-specific information from being leaked
through the cache.

You’ve been going on about this for two days now, but you still
haven’t managed to explain HOW you check whether the
content has changed. It’s obvious you do know the content
has changed, but without sharing the details on how and
where the content on the backend is updated, you can’t
expect a more specific solution.

Max

On 10 Feb 2012 15h41 WET, [email protected] wrote:

# Responses for "/blog" and "/blog?action=edit" requests
# are cached under the SAME key
proxy_cache_key $scheme$host$uri;

# Turn caching on for POST method request responses as well
proxy_cache_methods GET HEAD POST;

You’re saying the POST is a valid method for caching, i.e., POSTed
requests get cached.

proxy_pass $scheme://backend;
}

Here you do an internal redirect to @post_and_refresh_cache via
error_page when the request method is a POST.

location @post_and_refresh_cache {
    proxy_cache_bypass "Never check the cache!";

    # Pass the POST method request directly to the backend
    # and store the response in the cache
    proxy_pass $scheme://backend;
}

Here you bypass the cache and proxy_pass the request to a backend.

AFAICT you’re replicating the default behaviour which is to not
cache in the case of POST requests. Is it not?

— appa

10 February 2012, 20:43, from António P. P. Almeida [email protected]:

On 10 Feb 2012 15h41 WET, [email protected] wrote:

10 February 2012, 19:06, from António P. P. Almeida [email protected]:

On 10 Feb 2012 14h45 WET, [email protected] wrote:

# Turn caching on for POST method request responses as well
proxy_cache_methods GET HEAD POST;

You’re saying the POST is a valid method for caching, i.e., POSTed
requests get cached.

The proxy_cache_methods directive is used to select the request methods
that will have their responses cached. By default, nginx caches only
GET and HEAD request method responses, which is why I added the
POST request method to the list - so that POST method request responses
would be cached as well. This is necessary if you want to make sure the
cache is instantly refreshed whenever you update something through a
POST method request. Without this directive, you would update the
content on the backend, but the frontend would keep serving the stale
content from the cache until its validity expired, instead of the new,
updated content that's on the backend.

proxy_pass $scheme://backend;
}

Here you do an internal redirect to @post_and_refresh_cache via
error_page when the request method is a POST.

Exactly.

location @post_and_refresh_cache {
    proxy_cache_bypass "Never check the cache!";

    # Pass the POST method request directly to the backend
    # and store the response in the cache
    proxy_pass $scheme://backend;
}

Here you bypass the cache and proxy_pass the request to a backend.

Exactly.

AFAICT you’re replicating the default behaviour which is to not
cache in the case of POST requests. Is it not?

The default behaviour is not to cache POST method request responses,
but I turned caching of POST method request responses ON, so I had
to make sure the cache is bypassed for POST method requests (but
not for GET or HEAD method requests!). All POST method requests
are passed on to the backend without checking for a match in the
cache, but - CONTRARY to the default behavior - all POST method
request responses are cached.

Without the @post_and_refresh_cache location block and without
the proxy_cache_bypass directive, nginx would check the cache
and return the content from the cache (put there by a previous
GET request response, for example) and would not pass the POST
method request on to the backend, which is definitely not what
you want in this case.

Max

On 10 February 2012 20:47, Max [email protected] wrote:

and return the content from the cache (put there by a previous
GET request response, for example) and would not pass the POST
method request on to the backend, which is definitely not what
you want in this case.

Your config would do what the OP wanted, but it would be nicer, I
think, if the POST request simply invalidated the existing cached
content, with the content then being cached only if and when there is
a GET request for that item, i.e. for the cache validity to start when
there is a request to view the item.
This also avoids using $uri as the key, which can lead to cache
pollution with frontend controllers etc.

An internal call to a proxy_cache_purge location could do this, maybe
as a post_action. There would then be no need for proxy_cache_bypass.

On 10 Feb 2012 17h47 WET, [email protected] wrote:

the proxy_cache_bypass directive, nginx would check the cache
and return the content from the cache (put there by a previous
GET request response, for example) and would not pass the POST
method request on to the backend, which is definitely not what
you want in this case.

If what the OP wanted was to distinguish between cached POST and GET
request responses then just add $request_method to the cache key.
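That suggestion amounts to a one-line change, sketched here against the key used earlier in the thread (the other cache directives are unchanged):

```nginx
# GET and POST responses for the same URI now occupy distinct entries.
proxy_cache_key $scheme$host$uri$request_method;
```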

— appa

11 February 2012, 18:49, from Quintin P. [email protected]:

Can that be done with your approach? Just to invalidate?

No. AFAIK, there is no way to cause forced cache invalidation that
would remove specific cache entries without using 3rd party modules,
such as Piotr S.'s excellent ngx_cache_purge. You should definitely
include that module in your next scheduled nginx upgrade.
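For reference, a typical ngx_cache_purge setup looks roughly like this (a sketch based on the module's documentation; the zone name and access rules here are assumptions):

```nginx
# A request for /purge/<uri> removes the cache entry whose key
# matches $scheme$host/<uri> from the "zone" cache.
location ~ /purge(/.*) {
    allow 127.0.0.1;
    deny  all;
    proxy_cache_purge zone "$scheme$host$1";
}
```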

In the meantime, you could use something like this to force the
cache contents for "/furniture/desks" to be refreshed by
sending a request for "/refresh_cache/furniture/desks":

# Responses for "/blog" and "/blog?action=edit" requests
# are cached under the SAME key
proxy_cache_key $scheme$host$uri;

location ~ ^/refresh_cache/(.*)$ {

    # Change the key to match the existing key
    # of the cache entry you want to refresh
    proxy_cache_key $scheme$host/$1;

    proxy_cache_bypass "Never check the cache!";

    # Pass the request directly to the backend
    # and store the response in the cache
    proxy_pass $scheme://backend/$1;
}

This is just meant to demonstrate the general approach.

Max

Sorry for being late to respond.

There is so much being discussed that is not reflected in the wiki;
people like me think of the wiki as the canonical document.

I like Max's approach, but I need the content to be cached only on the
next GET. Mostly because some XMLHTTP POST request under the same
location directive would invalidate and re-cache in this context, but
might not be a candidate for re-caching, e.g. storing page performance
counters after a page has been loaded.

Can that be done with your approach? Just to invalidate?

I might sound a bit naïve here, but all the different proxy_cache
mechanisms seem to get a bit confusing.

As for the reason why I don't want ngx_cache_purge: recompiling and
delivering through the yum repo of a large organization is a cumbersome
process and raises many flags.

-Q

11 February 2012, 02:25, from Nginx U. [email protected]: