Custom response codes and content-types for memcache module


#1

Hi,

I have recently written a patch which allows storage of custom HTTP
response codes and Content-Type headers within memcache alongside the
actual response body. This allows us to pass the not_found’s of the
memcache module to a PHP script which can serve and store different
kinds of responses in the cache.

The chosen approach is quite rough for this first version. The memcache
module is able to interpret two special commands prepended to the
response body.

This example serves a custom 404 page from memcached:

^SET-STATUS:404 Not Found|…

This example serves /sitemap.xml from memcached:

^SET-CONTENT-TYPE:application/xml; charset=UTF8|<?xml version="1.0"
encoding="UTF-8" ?><urlset…
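For illustration only, the prefix format above could be split roughly as follows. This is a minimal sketch and not code from the patch: it assumes the leading `^` is a literal marker byte, that a value carries a single command, and that the first `|` ends the command. The names `cache_cmd` and `split_command` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a cached value of the form "^NAME:VALUE|BODY". */
struct cache_cmd {
    const char *name;   size_t name_len;   /* e.g. "SET-STATUS"      */
    const char *value;  size_t value_len;  /* e.g. "404 Not Found"   */
    const char *body;   size_t body_len;   /* everything after '|'   */
};

/*
 * Returns 1 and fills *out if buf starts with a "^NAME:VALUE|" prefix,
 * or 0 if the value should be served as a plain body.
 * Assumption: '^' is a literal marker and only one command is present.
 */
static int split_command(const char *buf, size_t len, struct cache_cmd *out)
{
    const char *colon, *bar;

    if (len == 0 || buf[0] != '^') {
        return 0;
    }

    /* the command name runs up to the first ':' */
    colon = memchr(buf + 1, ':', len - 1);
    if (colon == NULL) {
        return 0;
    }

    /* the command value runs up to the first '|' after the colon */
    bar = memchr(colon + 1, '|', len - (size_t) (colon + 1 - buf));
    if (bar == NULL) {
        return 0;
    }

    out->name = buf + 1;
    out->name_len = (size_t) (colon - (buf + 1));
    out->value = colon + 1;
    out->value_len = (size_t) (bar - (colon + 1));
    out->body = bar + 1;
    out->body_len = len - (size_t) (bar + 1 - buf);

    return 1;
}
```

A value without the marker is left untouched, so existing cache entries would keep working under these assumptions.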

Please let me know if there is any interest in this patch. If so, I
could make it available, without any warranties of course. :stuck_out_tongue:


#2

Very interesting !

2008-07-22

Chancey

From: M. Van der klip
Sent: 2008-07-22 17:57:37
To: removed_email_address@domain.invalid
Cc:
Subject: Custom response codes and content-types for memcache module

Hi,

I have recently written a patch which allows storage of custom HTTP
response codes and Content-type headers within memcache alongside the
actual response body. This allows us to pass the not_found’s of the
memcache module to a PHP script which can serve and store different
kinds of responses in the cache.

The chosen approach is quite rough for this first version. The memcache
module is able to interpret two special commands prepended to the
response body.

This example serves a custom 404 page from memcached:

^SET-STATUS:404 Not Found| …

This example serves /sitemap.xml from memcached:

^SET-CONTENT-TYPE:application/xml; charset=UTF8|<?xml version="1.0"
encoding="UTF-8" ?><urlset…

Please let me know if there is any interest in this patch. If so, I
could make it available, without any warranties of course. :stuck_out_tongue:


#3

Hello!

On Tue, Jul 22, 2008 at 11:50:51AM +0200, M. Van der klip wrote:

response body.

This example serves a custom 404 page from memcached:

^SET-STATUS:404 Not Found|…

I don’t think it’s a good idea to introduce a brand-new format just
for this. Why not use plain HTTP headers as the proxy module does (or
HTTP headers with CGI/1.1 directives as fastcgi does)? This
particular reply can be simplified to:

[cut here]
Status: 404 Not Found

... [cut here]

Maxim D.


#4

Maxim D. wrote:

I don’t think it’s a good idea to introduce a brand-new format just
for this. Why not use plain HTTP headers as the proxy module does (or
HTTP headers with CGI/1.1 directives as fastcgi does)? This
particular reply can be simplified to:

[cut here]
Status: 404 Not Found

... [cut here]

I agree with you absolutely. This first version was mainly a hack to
allow an already-over-deadline site to go live. I have been thinking
about either storing the headers in a separate memcache key or
prepending them as you suggested above, but wanted the patch to
be as simple as possible for now.


#5

On Tue, Jul 22, 2008 at 02:18:50PM +0400, Maxim D. wrote:

The chosen approach is quite rough for this first version. The memcache
particular reply can be simplified to:

[cut here]
Status: 404 Not Found

... [cut here]

I planned to add the usual headers in memcached and mark them using a flag.


HTTP/1.0 200 OK
Content-Type: application/xml; charset=UTF8

The flag can probably be omitted; nginx will then look for an “HTTP/” line
in the response and treat the response as HTTP/0.9 (simple body) if
it has no “HTTP/”.
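The detection rule described above can be sketched outside nginx as a small standalone function. This is only an illustration under assumptions: `cached_status` is a hypothetical helper, not nginx code, and treating a malformed status line as a plain body is a choice made here, not something stated in the thread.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the three-digit status code if the cached
 * value starts with an "HTTP/x.y NNN" status line, or 0 if the value
 * should be treated as a plain body (HTTP/0.9 style).
 */
static int cached_status(const char *buf, size_t len)
{
    size_t i = 5;
    size_t k;

    if (len < 5 || memcmp(buf, "HTTP/", 5) != 0) {
        return 0;                       /* no status line: plain body */
    }

    /* skip the version token up to the first space */
    while (i < len && buf[i] != ' ') {
        i++;
    }

    /* the space must be followed by three digits inside the buffer */
    if (i + 3 >= len) {
        return 0;
    }

    for (k = 1; k <= 3; k++) {
        if (buf[i + k] < '0' || buf[i + k] > '9') {
            return 0;
        }
    }

    return (buf[i + 1] - '0') * 100
           + (buf[i + 2] - '0') * 10
           + (buf[i + 3] - '0');
}
```

A return value of 0 means "serve the bytes as they are", which keeps existing cache entries without headers working.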


#6

Igor S. wrote:

I planned to add the usual headers in memcached and mark them using a flag.


HTTP/1.0 200 OK
Content-Type: application/xml; charset=UTF8

The flag can probably be omitted; nginx will then look for an “HTTP/” line
in the response and treat the response as HTTP/0.9 (simple body) if
it has no “HTTP/”.

That sounds very good! Any short-term plans for this?


#7

Igor, I am asking about the short-term plans because I want to avoid
multiple people working on the same thing. I guess we would like to
start working on an improved version of this patch within the next few
weeks. That would be a waste of time if you plan to work on this
as well soon.

An alternative would be for you to share your ideas with us and let us
work on the patch, after which we give the patch back to you to
integrate into Nginx. We’d need some do’s and don’ts from you to make
sure we deliver a nice and clean patch.

Please let me know what your plans are. I will not release the current
version of this patch if we can work on an improved version during the
coming weeks.


#8

Igor, I made a new version of my memcache patch which does the
following:

  1. Change the internal response code for a “not found” from 404 to 503.
    This is because we need to be able to send 404’s from the cache.

  2. Scan for ‘HTTP/’ at the start of the memcached response and set the
    status and status_line accordingly.

  3. Scan for the ‘Content-Type:’ header and set this header accordingly.

Support for more headers is planned, but I’d appreciate your feedback
first. Thanks.
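Step 3 in the list above can be illustrated with a standalone header scan. This is a sketch only, not the code from the patch: `find_header` is a hypothetical name, the header block is assumed to end at the first blank line, and header names are assumed to match case-insensitively.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: looks for header `name` in the lines that precede
 * the first blank line. Returns a pointer to the value and stores its
 * length in *vlen, or returns NULL if the header is not present.
 */
static const char *find_header(const char *buf, size_t len,
                               const char *name, size_t *vlen)
{
    size_t nlen = strlen(name);
    size_t pos = 0;

    while (pos < len) {
        const char *line = buf + pos;
        const char *eol = memchr(line, '\n', len - pos);
        size_t line_len = eol ? (size_t) (eol - line) : len - pos;
        size_t i;

        /* drop a trailing CR so both CRLF and LF line endings work */
        if (line_len > 0 && line[line_len - 1] == '\r') {
            line_len--;
        }

        if (line_len == 0) {
            return NULL;                /* blank line: end of headers */
        }

        if (line_len > nlen && line[nlen] == ':') {
            /* compare the header name case-insensitively */
            for (i = 0; i < nlen; i++) {
                if (tolower((unsigned char) line[i])
                    != tolower((unsigned char) name[i]))
                {
                    break;
                }
            }

            if (i == nlen) {
                size_t v = nlen + 1;

                /* skip optional whitespace after the colon */
                while (v < line_len && (line[v] == ' ' || line[v] == '\t')) {
                    v++;
                }

                *vlen = line_len - v;
                return line + v;
            }
        }

        if (eol == NULL) {
            break;                      /* last line had no newline */
        }

        pos = (size_t) (eol - buf) + 1;
    }

    return NULL;
}
```

Stopping at the blank line matters here: a body that happens to contain the text "Content-Type:" must not be mistaken for a header.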


#9

Hi Maxim, thanks for your feedback.

This part looks plain wrong to me. If there is no data in
memcached - it’s 404 Not Found, not 503 Service Unavailable.

I dare to challenge that. At least when it comes to returning such an
HTTP response code back to clients. Memcached is a cache and so
ultimately does not know if an object exists or not. It only knows
whether it’s currently available.

Consider a site which is almost fully served from memcached. Now what
happens if we flush the cache? Should we return 404’s? What about search
engines? They will remove the pages from the index because we say the
page does not exist. IMO in most situations the memcache module should
indicate a temporary unavailability instead of a permanent one.

I agree that hardcoding it to 503 is not a good option either. What if I
made it configurable and let it default to 404?

Something else should be used to distinguish between 404 stored in
memcached and 404 generated due to key not found in memcached.

That’s another option. A variable could be set for example, but that may
be even more of a hack than the above. Syntactically it would be a
challenge too. I don’t think this is allowed in the current syntax:

if ($memcached_not_found) {
    error_page 404 = /not_found_handler$request_uri;
}

Just making the memcached_not_found response code configurable would be
better IMO. What do you think?

Why not just reuse upstream functionality as fastcgi/proxy modules
do?

Because I’m not an experienced Nginx hacker. :wink: But I appreciate the
hint.

I don’t fully understand what’s going on with the
‘upstream.pass_request_headers’ in the fastcgi and proxy modules, but
of course it would be a better approach to reuse than to reinvent. I will
have a look and see if I can do that.


#10

Hello!

On Wed, Aug 13, 2008 at 04:33:21PM +0200, Spil G. wrote:

Igor, I made a new version of my memcache patch which does the
following:

  1. Change the internal response code for a “not found” from 404 to 503.
    This is because we need to be able to send 404’s from the cache.

This part looks plain wrong to me. If there is no data in
memcached - it’s 404 Not Found, not 503 Service Unavailable.

Something else should be used to distinguish between 404 stored in
memcached and 404 generated due to key not found in memcached.

  2. Scan for ‘HTTP/’ at the start of the memcached response and set the
    status and status_line accordingly.

  3. Scan for the ‘Content-Type:’ header and set this header accordingly.

Why not just reuse upstream functionality as fastcgi/proxy modules
do?

Maxim D.


#11

Spil G. wrote:

I don’t fully understand what’s going on with the
‘upstream.pass_request_headers’ in the fastcgi and proxy modules, but
of course it would be a better approach to reuse than to reinvent. I will
have a look and see if I can do that.

Hi Maxim,

I have done my best (see attached patch), but somehow Nginx doesn’t like
the ‘ngx_list_push(&u->headers_in.headers)’ code I copied from the
fastcgi module. The code in question is commented out in the attached
patch, so it just skips over the headers. Whenever enabled, debug output
shows that headers are parsed correctly, but Nginx crashes with a SIGFPE
after returning an NGX_OK from the process_header handler.

I have no idea what is going on and have spent hours looking over the
code of both the fastcgi and proxy modules. Could you have a look to see
what I’m doing wrong? I am about to pull my hair out. :wink:

Thanks.


#12

Spil G. wrote:

I have done my best (see attached patch), but somehow Nginx doesn’t like
the ‘ngx_list_push(&u->headers_in.headers)’ code I copied from the
fastcgi module. The code in question is commented out in the attached
patch, so it just skips over the headers. Whenever enabled, debug output
shows that headers are parsed correctly, but Nginx crashes with a SIGFPE
after returning an NGX_OK from the process_header handler.

I have no idea what is going on and have spent hours looking over the
code of both the fastcgi and proxy modules. Could you have a look to see
what I’m doing wrong? I am about to pull my hair out. :wink:

Hi Maxim, Igor, (Evan?),

Would one of you be so kind as to look at the patch I submitted last
Thursday:
http://www.ruby-forum.com/attachment/2541/nginx-0.6.32-memcache.broken.patch

I’d like to wrap this up and let the community benefit from these
additions to the memcached module, but I cannot get it to work without
crashing currently. I really need some guidance to become a more skilled
Nginx contributor. :wink:

Thanks,

Matthijs


#13

Hello!

On Tue, Aug 19, 2008 at 04:02:14PM +0200, Spil G. wrote:

what I’m doing wrong? I am about to pull my hair. :wink:
crashing currently. I really need some guidance to become a more skilled
Nginx contributor. :wink:

I have it in my TODO and will take a look as soon as time permits.

Maxim D.


#14

Hello!

On Thu, Aug 14, 2008 at 05:10:32PM +0200, Spil G. wrote:

the ‘ngx_list_push(&u->headers_in.headers)’ code I copied from the
fastcgi module. The code in question is commented out in the attached
patch, so it just skips over the headers. Whenever enabled, debug output
shows that headers are parsed correctly, but Nginx crashes with a SIGFPE
after returning an NGX_OK from the process_header handler.

I have no idea what is going on and have spent hours looking over the
code of both the fastcgi and proxy modules. Could you have a look to see
what I’m doing wrong? I am about to pull my hair. :wink:

The problem is that nginx isn’t prepared to work with headers
without properly initialized upstream.hide_headers hash. Since
memcached module doesn’t return headers - it doesn’t bother with
initializing the hash. With initialization added (see
fastcgi/proxy modules for examples) your patch works as expected.
I haven’t tested it much though.

Maxim D.


#15

Maxim D. wrote:

The problem is that nginx isn’t prepared to work with headers
without properly initialized upstream.hide_headers hash. Since
memcached module doesn’t return headers - it doesn’t bother with
initializing the hash. With initialization added (see
fastcgi/proxy modules for examples) your patch works as expected.
I haven’t tested it much though.

Hi Maxim,

Thanks for taking the time. I really appreciate it! The patch seems to
work like a charm now.

What about the 404 vs 503 thing? Do you have an opinion on what would be
the best way to solve this? Making the status code for a not_found
configurable should not be too hard, but I’d like to know if you think
that’s the way to go forward.

Matthijs


#16

Hi,

I’ve just completed the “final” version of this patch. It allows both
custom headers and custom response codes to be returned from memcached.
There is one restriction though: it is not possible to use an
‘error_page’ directive on the outcome of the memcached module. This was
necessary to be able to distinguish between a 404 response successfully
served from memcached and a 404 response generated because no cache item
could be found.

This behaviour is triggered on a successful hit from memcached:

/* disallow interception of error pages */
u->conf->intercept_404 = 0;
u->conf->intercept_errors = 0;
r->error_page = 1;

The code above disables both interception of errors and custom error pages.
It has been tested using the following Nginx config:

default_type text/html;
error_page 404 = /dyn$request_uri;
if ($request_method ~* "^(GET|HEAD)$") {
    set $memcached_key $server_name$request_uri;
    memcached_pass memcached_backend;
}

The ‘error_page 404’ will only be triggered when the memcached module
returns a 404 because of a missing cache item, NOT when a 404 response
is served from the cache.
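For the script that fills the cache, a stored value would need to carry the status line and headers in front of the body. The sketch below shows one way to build such a value. It rests on assumptions: `build_cache_value` is a hypothetical helper, the CRLF line endings and the "HTTP/1.0" prefix follow the examples earlier in the thread rather than the patch itself, and the body is assumed to be a NUL-terminated string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "status line + Content-Type header + blank
 * line + body" into out. Returns the number of bytes written (excluding
 * the terminating NUL), or -1 if the value does not fit in cap bytes.
 */
static int build_cache_value(char *out, size_t cap, int status,
                             const char *reason, const char *content_type,
                             const char *body)
{
    int n = snprintf(out, cap,
                     "HTTP/1.0 %d %s\r\n"
                     "Content-Type: %s\r\n"
                     "\r\n"
                     "%s",
                     status, reason, content_type, body);

    /* snprintf reports the length it wanted; reject truncated output */
    if (n < 0 || (size_t) n >= cap) {
        return -1;
    }

    return n;
}
```

With the configuration above, such a value would be stored under the key `$server_name$request_uri`; a binary body would need a length-based variant instead of `%s`.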

Feedback highly appreciated. This is my first ‘hack job’ on Nginx, but I
hope this will be good enough to be included into the official
distribution of Nginx. Igor?

Regards,

Matthijs


#17

Spil G. wrote:

What about the 404 vs 503 thing? Do you have an opinion on what would be
the best way to solve this? Making the status code for a not_found
configurable should not be too hard, but I’d like to know if you think
that’s the way to go forward.

I’ve spent some more time on this today and have found that making the
response code configurable is not as easy as it seems, because some 404
stuff is hardcoded in the upstream module.

Then I did some experiments with setting and checking a
$memcached_not_found variable, but that failed too because I could not
make Nginx execute the memcached stuff first and check the variable
afterwards.

Another thought is to disable error_pages completely when serving a
successful hit from the memcached module. As a quick hack I tried this:

/* disable error pages */
clcf = ngx_http_get_module_loc_conf(r, ngx_http_core_module);
clcf->error_pages = NULL;

That works, but disables error_pages permanently. I have no idea how to
do something like this only temporarily (for the current request).

Any ideas? Or should I just release the current memcache patch with the
503 stuff removed? Of course I’d still be using the 503 stuff internally
as an it’s-a-hack-but-it-works™ thing. :wink: