Using the Perl module to populate Memcached on the fly

This is purely speculative, so please don’t assume I know how to do
this. I’m throwing the idea out in the hope that, if it’s a good one,
the right people can create an example.

Memcached has a good assortment of Perl clients.

Nginx has a means of embedding Perl into the configuration.

Couldn’t there be a way of combining these to immediately populate the
cache when the cache returns a miss?

By the way, I’m looking at this as a means of improving I/O for static
pages (on SliceHost). Ideally nginx’s Memcached Module would have the
ability to do this when the requested filename exists on the hard drive:
(-f $request_filename/index.html)

Daniel Rhoden
Director of Information Technology
Interactive Internet Websites, Inc.
3854 - 2 Killearn Court
Tallahassee, Florida 32309
Voice: (256) 878-5554
E-Mail: [email protected]
Website: iiwinc.com

Rhoden,

Yes, it can be done. I also thought about doing this, but eventually
decided I did not want a Perl-enabled nginx as the main proxy server.

If you want an example, I can post it on the wiki.
It will be similar to this one.

Basically:

  1. Check if the file exists.
  2. If yes, return the file.
  3. If not, call a backend URL and return its response.
best regards

Daniel,

I see your other posts now.

I do not think that you will get any benefit from using memcached with
static files. Nginx is already very well optimized at serving static
files.

When I tested using memcached vs files on the filesystem, there wasn’t
much difference between them, and sometimes using memcached fared worse
than directly sending the files from the filesystem (this would probably
vary quite a bit, depending on the system).

The reason for this is that the overhead of creating a request to a
memcached server, sending the request, waiting for the response (which
is done asynchronously so as not to block the server), then handling the
response is ‘relatively’ high.

If the memory cache was instead integrated into Nginx, rather than using
an external process like memcached, then it would almost certainly be
quite a bit quicker. As far as I know, there isn’t an internal
memory-cache module for Nginx yet (I’m not sure if ncache has included
memory caching yet, or if they intend on doing so).

As Atif says, there is little benefit to using memcache for serving
static files, but there definitely is for dynamically-generated files.

As for Perl, I believe the current implementation is still blocking
(i.e. you can’t currently run Perl in the background whilst still
serving static files; you can only do one or the other). This would
kill performance terribly.

For populating the cache, have you looked at the error_file directive?
You can set the error file to a script which could be passed to an FCGI
process, so that on a cache-miss the script is called to generate the
page, and in the process it could put the file into the memcached cache.
This of course only really makes sense for dynamically-generated content.
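That cache-miss flow could look roughly like the sketch below. The key scheme, addresses, and the `@generate` location name are illustrative assumptions, not from the thread:

```nginx
location / {
    set $memcached_key $uri;
    memcached_pass 127.0.0.1:11211;
    default_type   text/html;

    # On a cache miss, memcached_pass returns 404;
    # hand the request off to the page generator.
    error_page 404 = @generate;
}

location @generate {
    # Hypothetical FastCGI script that builds the page and also
    # writes it into memcached before returning it.
    include fastcgi_params;
    fastcgi_pass 127.0.0.1:9000;
}
```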

Marcus.

Atif,

I understood it the same way you did, and was saying that serving files
from memcached as opposed to the filesystem has negligible benefits, and
sometimes actually performs worse.

The reference to error_page (not error_file as I said before - memory
slip) was just as a means to populate memcache when there’s no data in
the cache - you could easily use this method to put the file into
memcache from the filesystem. You wouldn’t need to use an Nginx module
to put the file into memcache (at least not with the current Perl module
- it’s blocking, so it would slow everything down too much, though with
a non-blocking version it could be OK) - though I understand that that’s
what he’s talking about and where he’s talking about hooking it in.

Another method would be to use the try_files directive.
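A try_files variant might look like this minimal sketch (root, key scheme, and the `@backend` name are assumptions for illustration):

```nginx
location / {
    root /var/www/html;

    # Serve the file if present; otherwise hand off to a backend
    # that could generate it and store it in memcached.
    try_files $uri $uri/index.html @backend;
}

location @backend {
    proxy_pass http://127.0.0.1:8080;
}
```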

Overall, though, I can only see populating memcache with a file to serve
through Nginx as being slower than serving the file from the
filesystem. In my tests, memcache vs filesystem is pretty similar, and
adding any kind of overhead to put a file in memcache will mean just
serving the file statically will be more efficient (and won’t waste
memory unnecessarily). I could see storing static files in an internal
memory cache being a bit quicker than serving files statically, but
actually not all that much, and would only possibly be of real benefit
for files that are under very high demand (e.g. a homepage or a logo on
a high-traffic site).

Cheers,

Marcus.

On Thu, Mar 5, 2009 at 12:34 PM, Marcus C. [email protected] wrote:

For populating the cache, have you looked at the error_file directive. You
can set the error file to script which could be passed to an FCGI process,
so that on a cache-miss the script is called to generate the page, which
could in the process put the file into the memcached cache.

Marcus,

I believe Daniel was talking about the opposite. He wants to populate
the cache when nginx finds the file, not when it does not find it.

So something like this (this is how nginx would do it):

request comes in for /files/1.txt

  1. Check if the memcache entry exists and, if so, serve from there.
  2. Check if the file exists and serve it from there. <---- This is
     where he wants to hook it :)
  3. If the file does not exist, handle the error.

What Daniel wants (if I understood correctly):

request comes in for /files/1.txt

  1. Check if the memcache entry exists and, if so, serve from there.
  2. If (-f file), call the perl/fcgi process that populates the
     memcache and returns the file (so this would only be done when
     there is no entry in memcache).
  3. If the file does not exist, handle the error.
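The flow Daniel wants could be sketched like this; the key scheme, addresses, and the `@populate` location name are illustrative assumptions:

```nginx
location /files/ {
    set $memcached_key $uri;
    memcached_pass 127.0.0.1:11211;
    default_type   text/plain;

    # Cache miss -> try to populate from the filesystem.
    error_page 404 = @populate;
}

location @populate {
    root /var/www;

    # Only call the populating backend when the file actually exists;
    # otherwise fall through to a normal 404.
    if (!-f $request_filename) {
        return 404;
    }

    # Hypothetical FastCGI script that stores the file in memcached
    # and returns it.
    include fastcgi_params;
    fastcgi_pass 127.0.0.1:9000;
}
```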

Still, I don’t see what the point of this is, but yes, it is doable.

best regards

Daniel,
I see this as a good compromise for your situation.

There has been a growing trend in using VPS like Xen to host sites.
What I’m seeing is that memory is guaranteed, but the I/O to the hard
drive is competing with the other customers on the same physical server.

I’ve given up on the idea of using Perl to populate memcached on the
fly for anything, static html or dynamic. I agree that the best place
for memory caching of static content is right with nginx.

Did some analysis and found that about 1% of our pages get 40% of the
traffic. Since these are static pages it is unlikely that they will
change in popularity from day to day. So I’ve decided that a cron job
can populate the cache for these few pages (and then some) with very
little overhead, very little memory, with the biggest payoff.

Thank you all for your feedback on this idea.

By the way: I am seeing a boost in performance by caching static
files. Small, but every ms counts.

Marcus,

I agree with you. It was a very strange scenario.

Daniel,

A cron job sounds like a good idea to me too.

As part of a project I’m working on, I’ll be developing an in-memory
cache for Nginx. I’ll let you know when it’s stable, in case you’d like
to try it out.

Wrt VPS’s, have you looked at Open VZ (i.e. Virtuozzo) VPS ISPs? I’m
currently using Tektonic, and have generally been happy with the speed
of their system. For your money, you will typically get more memory
too. The reason is to do with the platform, I believe. I think that
Xen doesn’t allow dynamic changing of memory size, but VZ does - so you
can ‘burst’ your memory to your needs on VZ, whilst still having a
minimum level. I know that Open VZ does have some problems that I don’t
think Xen does, though (e.g. with memory-mapped files, meaning that
Varnish Cache can’t - currently - work on Open VZ VPSs).

The difference in architecture between Xen and Open VZ might mean that
you’d get a better IO performance on that (I think I was getting more
than 5000 req/s serving static files on my $15/mo VPS).

Just something you might want to look at.

Marcus.

You might want to check out the following if you have not seen it
already.

http://www.igvita.com/2008/02/11/nginx-and-memcached-a-400-boost/