Sending 404 responses for empty objects

Hello,

Due to issues with a backend beyond my influence, I need to fix this
with Nginx.

Root cause: a CMS generates empty files on a filesystem, which are
filled with content later. Until then, those files exist with 0 bytes
and are served with a 200 through a chain of a caching Nginx and a
caching CDN.

User → CDN → Caching Nginx (SlowFS) → Serving Nginx → Filesystem.

The Backend-Filesystem is served by an Nginx. A solution could be to
serve a 404 for all empty files. I tried with $sent_http_content_length,
but it seems to be empty in the location where I would need it. Is it
possible to use some kind of ‘test -s’ on the file to decide when to
send a 404?

I also tried to tackle it in the caching SlowFS-Nginx, by using
$upstream_http_content_length in if or map statements [1]. I can see
$upstream_http_content_length set in an X-Debug header I added, but I
can’t get the configuration to act on that value.

if ($upstream_http_content_length = 0) {
    return 404;
}

I found the discussion about ‘using $upstream* variables inside map
directive’ [2], but even if I got this to work it wouldn’t be enough, as
I need to signal the CDN in front (with some headers) that it also
should not cache the empty response. Best would be to generate a 404 if
the upstream object is empty.

So maybe someone here has an idea how to tackle this.

Cord

[1] like in http://syshero.org/post/49594172838/avoid-caching-0-byte-files-on-nginx
[2] Re: using $upstream* variables inside map directive

Posted at Nginx Forum:

On the ‘Serving Nginx’ I’d use Lua to test for a zero-byte file and
return the 404 there.


Hello!

On Thu, Dec 11, 2014 at 03:54:25AM -0500, Cord Beermann wrote:

> User → CDN → Caching Nginx (SlowFS) → Serving Nginx → Filesystem.
>
> The Backend-Filesystem is served by a Nginx. Solution could be to
> serve a 404 for all empty files.

Please note that this is not a solution, as at some point files
will be partially filled with content, and testing that the size
isn’t 0 won’t help. Rather, it’s a workaround which hides the
problem in some cases.

The only solution I see is to fix the backend to update files
atomically, e.g., write to a temporary file, and then rename()
it to the real name.
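
For illustration, the write-then-rename approach can be sketched in Python. This is a minimal sketch and not part of the original exchange: the function name write_atomically, the ".tmp-" prefix and the 0o644 mode are assumptions, and a real CMS would do the equivalent in its own language. The temporary file is created in the target directory because rename() is only atomic within a single filesystem.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes, mode: int = 0o644) -> None:
    """Write data to path so readers never see an empty or partial file.

    Hypothetical helper for illustration; name and defaults are assumptions.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # content reaches disk before the rename
        # mkstemp creates the file with mode 0600; widen it so that a web
        # server running as another user can read the result.
        os.chmod(tmp_path, mode)
        # os.replace() uses rename() on POSIX: the name switches in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern the real name either does not exist yet or points to a complete file, so neither the zero-byte nor the partially filled state is ever visible under that name. The temporary file itself is briefly visible in the same directory, which is why the sketch uses a dot-prefixed name.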

> I tried with $sent_http_content_length, but it seems to
> be empty in the location where i would need it. Is it possible to use some
> kind of ‘test -s’ on the file to decide when to send a 404?

This is possible with embedded Perl (and Lua, as already
suggested).

> I also tried to tackle it in the caching SlowFS-Nginx, by using
> $upstream_http_content_length in if or map-statements [1]. I can see
> $upstream_http_content_length set in an X-Debug-Header i added, but can’t
> get it to work to use it to act on it.
>
> if ($upstream_http_content_length = 0) {
>     return 404;
> }

This is not going to work as “if” will be executed before the
request is sent to the upstream server.


Maxim D.
http://nginx.org/

Hi Maxim,
should this solution work?
http://syshero.org/post/49594172838/avoid-caching-0-byte-files-on-nginx

I have created a simple test setup like:

map $upstream_http_content_length $flag_cache_empty {
    default 0;
    0 1;
}

server {
    listen 127.0.0.1:80;

    server_name local;

    location /empty {
        return 200 "";
    }
    location /full {
        return 200 "full";
    }
}

server {
    listen 127.0.0.1:80;

    server_name cache;

    location / {
        proxy_pass http://127.0.0.1;
        proxy_cache_valid 200 404 1h;
        proxy_no_cache $flag_cache_empty;
        proxy_cache_bypass $flag_cache_empty;
        proxy_set_header Host local;
        add_header X-Cache-Status $upstream_cache_status;
        add_header X-Cache-Empty $flag_cache_empty;
        add_header X-Upstream-Content-Length $upstream_http_content_length;
    }
}

But the flag is always 0:
[vagrant@nginx-16-centos-64 bin]$ curl -v -H "Host: cache" http://localhost/empty
* About to connect() to localhost port 80 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /empty HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.15.3 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Accept: */*
> Host: cache
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Mon, 04 May 2015 11:37:51 GMT
< Content-Type: application/octet-stream
< Content-Length: 0
< Connection: keep-alive
< X-Cache-Status: MISS
< X-Cache-Empty: 0
< X-Upstream-Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection #0


Hello!

On Mon, May 04, 2015 at 07:52:46AM -0400, philipp wrote:

> }
>     proxy_no_cache $flag_cache_empty;
>     proxy_cache_bypass $flag_cache_empty;

Removing proxy_cache_bypass should fix things for you.

The problem is that proxy_cache_bypass will be evaluated before a
request is sent to upstream and therefore before
$upstream_http_content_length is available. As a result
$flag_cache_empty will always be 0. And, because map results are
cached for the entire request lifetime, proxy_no_cache will see the
same value, 0.
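
The evaluation order described above can be illustrated with a toy Python model. This is only a sketch of the explanation, not of how nginx is implemented: the Request class, the handle function and the use_bypass parameter are hypothetical names, and the only behaviour modelled is "the map result is computed on first use and then cached for the rest of the request".

```python
class Request:
    """Toy model of one request with a lazily evaluated, cached map variable."""

    def __init__(self):
        self.upstream_content_length = None  # unknown until upstream replies
        self._map_cache = {}

    def flag_cache_empty(self) -> str:
        # Computed on first use, then reused for the rest of the request.
        if "flag" not in self._map_cache:
            is_empty = self.upstream_content_length == "0"
            self._map_cache["flag"] = "1" if is_empty else "0"
        return self._map_cache["flag"]


def handle(request: Request, use_bypass: bool) -> str:
    """Return the value that proxy_no_cache would see in this toy model."""
    if use_bypass:
        # proxy_cache_bypass is checked before the upstream request is sent.
        request.flag_cache_empty()
    # The upstream answers with an empty body.
    request.upstream_content_length = "0"
    # proxy_no_cache is checked after the upstream response has arrived.
    return request.flag_cache_empty()


print(handle(Request(), use_bypass=True))   # prints 0: value was cached too early
print(handle(Request(), use_bypass=False))  # prints 1: evaluated after the response
```

In the model, the early lookup fixes the flag at 0 before the content length is known, which matches the X-Cache-Empty: 0 header seen in the test output; without the early lookup the flag reflects the upstream response.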


Maxim D.
http://nginx.org/

Thanks for your help. Removing the bypass solved this issue for me. This
feature request would simplify such configurations:
