Forum: NGINX Trying to configure an origin pull CDN-like reverse proxy

cachito (Guest)
on 2012-11-03 05:17
(Received via mailing list)
Hello, I'm hosting a group of WordPress blogs with about 200k visits and
millions of hits per day. MySQL and PHP live on one server (a beefy VPS), and I
placed a reverse proxy in front of it to cache most of the requests.

Now I want to offload all the static files to a third server, taking
advantage of a feature of common WordPress cache plugins that rewrites
static file URLs for origin-pull CDN services. This way, an original URL
http://blog.com/wp-content/uploads/photo.jpg is rewritten as
http://cdn.url.com/wp-content/uploads/photo.jpg, and this server requests
the file from the original server, caches it and then serves it directly
for the duration of the first server's Expires header/directive.

I thought it would be easy to use the proxy_* features, but I'm hitting a
wall and I can't find an applicable tutorial/article anywhere. Would
somebody have any advice on how to do this? This is the basic behavior I'm
after:
- Client requests static file cdn.blog.com/dir/photo.jpg
- cdn.blog.com looks for the file in its cache
- If the cache has it, check the original or revalidate according to the
  original headers (this is internal, I know).
- If the cache doesn't have it, request it from www.blog.com/dir/photo.jpg,
  cache it and serve it.
- Preferably, allow for this to be done for many sites/domains, acting as a
  CDN server for many sites.

This is my conf:
The cache zones are defined in an otherwise default nginx.conf, before the
conf.d/*.conf include (I'm on CentOS 6.3 with nginx 1.0.15 from EPEL):

proxy_cache_path /var/www/cache/images levels=1:2
                 keys_zone=images:200m
                 max_size=10g inactive=3d;

proxy_cache_path /var/www/cache/scripts levels=1:2
                 keys_zone=scripts:50m
                 max_size=10g inactive=3d;

proxy_cache_path /var/www/cache/pages levels=1:2
                 keys_zone=pages:200m
                 max_size=10g inactive=3d;

And this is the individual server config in conf.d/server1.conf:

upstream backend_cdn.blog.com {
    ip_hash;
    server 333.333.333.333;
}

server {
    listen 80;
    server_name cdn.blog.com;
    access_log off;

    # Set proxy headers for the passthrough
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Let the Set-Cookie and Cache-Control headers through.
    proxy_pass_header Set-Cookie;
    proxy_pass_header Cache-Control;
    proxy_pass_header Expires;

    # Fallback to stale cache on certain errors.
    # 503 is deliberately missing, if we're down for maintenance
    # we want the page to display.
    proxy_cache_use_stale error
                          timeout
                          invalid_header
                          updating
                          http_500
                          http_502
                          http_504
                          http_404;

    # Set the proxy cache key
    set $cache_key $scheme$host$uri$is_args$args;

    location / {
        proxy_pass http://backend_$host;
        proxy_cache pages;
        proxy_cache_key $cache_key;
        proxy_cache_valid 15m; # 200, 301 and 302 will be cached.
        # 2 rules to dedicate the no caching rule for logged in users.
        # proxy_cache_bypass $wordpress_auth; # Do not cache the response.
        # proxy_no_cache $wordpress_auth; # Do not serve response from cache.
        add_header X-Cache $upstream_cache_status;
    }

    location ~* \.(png|jpg|jpeg|gif|ico|swf|flv|mov|mpg|mp3)$ {
        expires max;
        log_not_found off;
        proxy_pass http://backend_$host;
        proxy_cache images;
        proxy_cache_key $cache_key;
    }

    location ~* \.(css|js|html|htm)$ {
        expires 7d;
        log_not_found off;
        proxy_pass http://backend_$host;
        proxy_cache scripts;
        proxy_cache_key $cache_key;
    }
}

With this configuration, whenever I request a static file such as
http://cdn.blog.com/wp-includes/js/prototype.js I end up being redirected to
http://www.blog.com/wp-includes/js/prototype.js. I've tried many things,
like setting the Host header to various values or adding $uri to the end of
the proxy_pass directives, to no avail. One thing to note is that the
333.333.333.333 server only responds to www.blog.com, not cdn.blog.com.

Do I need a root directive in server1.conf?

I'm running in circles; any help will be much appreciated.

Thanks in advance,
Cachito Espinoza

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,232468,232468#msg-232468

Francis Daly (Guest)
on 2012-11-03 10:44
(Received via mailing list)
On Sat, Nov 03, 2012 at 12:16:46AM -0400, cachito wrote:

Hi there,

All untested by me, but...

> - Preferably, allow for this to be done for many sites/domains, acting as a
> CDN server for many sites.

So far, it looks like a straightforward caching reverse proxy setup. I'm
not quite sure what the last point means -- but one server{} block per
site should work.
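
As an untested sketch of that one-block-per-site layout: the second site (cdn.example.org, www.example.org, 192.0.2.20) is hypothetical and only there for illustration, the Host values assume each origin answers only to its www name, and the "images" zone is assumed to be the one already defined by proxy_cache_path in nginx.conf.

```nginx
# Site 1: names taken from the original post.
server {
    listen 80;
    server_name cdn.blog.com;

    location / {
        # Assumption: the origin only answers to www.blog.com.
        proxy_set_header Host www.blog.com;
        # Placeholder address from the post; substitute the real origin.
        proxy_pass http://333.333.333.333;
        proxy_cache images;
        # $host is part of the key, so the two sites do not share entries.
        proxy_cache_key $scheme$host$uri$is_args$args;
        # No proxy_cache_valid here: cache lifetime then comes from the
        # origin's Expires / Cache-Control headers.
    }
}

# Site 2: hypothetical names and address, for illustration only.
server {
    listen 80;
    server_name cdn.example.org;

    location / {
        proxy_set_header Host www.example.org;
        proxy_pass http://192.0.2.20;
        proxy_cache images;
        proxy_cache_key $scheme$host$uri$is_args$args;
    }
}
```

Each extra site then costs one more server{} block, and the shared cache zone stays safe as long as the key keeps $host in it.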

> proxy_set_header Host $host;

$host here is probably "cdn.blog.com".

What happens if you change this to "proxy_set_header Host www.blog.com;"?
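
In context, that change might look like the following untested sketch. It trims the posted config down to the parts that matter here, and it assumes the origin answers to www.blog.com and that the "pages" zone is defined in nginx.conf as posted.

```nginx
upstream backend_cdn.blog.com {
    server 333.333.333.333;   # placeholder address from the original post
}

server {
    listen 80;
    server_name cdn.blog.com;

    # Send the name the origin answers to, rather than $host,
    # which would be "cdn.blog.com" for requests to this server.
    proxy_set_header Host www.blog.com;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    location / {
        # Still resolves to the upstream named backend_cdn.blog.com.
        proxy_pass http://backend_$host;
        proxy_cache pages;
        # The key keeps $host, so cached entries stay per-CDN-hostname.
        proxy_cache_key $scheme$host$uri$is_args$args;
        add_header X-Cache $upstream_cache_status;
    }
}
```

The proxy_set_header lines at server level are inherited by a location only if that location defines no proxy_set_header of its own, which is the case in the config as posted.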

> location ~* \.(css|js|html|htm)$ {
> expires 7d;
> log_not_found off;
> proxy_pass http://backend_$host;
> proxy_cache scripts;
> proxy_cache_key $cache_key;
> }

> With this configuration, whenever I call a static file such as
> http://cdn.blog.com/wp-includes/js/prototype.js I end up being redirected to
> http://www.blog.com/wp-includes/js/prototype.js. I've tried many things,
> like setting the Host header to various values or adding $uri to the end of
> the proxy_pass directives, to no avail. One thing to notice is that the
> 333.333.333.333 server only responds to www.blog.com, not cdn.blog.com.

What is the output of

  curl -i -0 -H 'Host: cdn.blog.com' http://333.333.333.333/wp-includes/js/prototype.js

? That is approximately what nginx will do. (You can add the extra
proxy_set_header headers there, if you think it will make a difference.)

My guess is that the 333.333.333.333 server returns the http redirect,
and nginx is correct in passing that on to the client.

The nginx log files should show more details.

> Do I need a root directive in server1.conf?

If you read from the filesystem, or otherwise access $document_root,
then the root directive is used.

I don't see that needed for this request.

Good luck with it,

  f
--
Francis Daly        francis@daoine.org