Is it possible to connect to memcached and check in access phase?

I’m trying to build an nginx module for the needs of my workflow.
In the access phase:

  1. Connect to memcached and get the value for key “abcxyz”
  2. If the value is 1, return NGX_HTTP_OK (200)
  3. If the value is 0, return NGX_HTTP_FORBIDDEN (403)

As far as I know, upstream is the only way to connect to memcached, but
it uses a callback mechanism (asynchronous requests).
Is it possible to connect to memcached and wait for the result
(synchronously), then return the corresponding code right inside the
handler function of the access phase?

Huy Phan wrote:

I’m trying to build an nginx module for the needs of my workflow.
In the access phase:

  1. Connect to memcached and get the value for key “abcxyz”
  2. If the value is 1, return NGX_HTTP_OK (200)
  3. If the value is 0, return NGX_HTTP_FORBIDDEN (403)

As far as I know, upstream is the only way to connect to memcached, but
it uses a callback mechanism (asynchronous requests).

It is not the only way.
It is just the only method implemented, since in nginx memcached is used
to serve generated content.

Is it possible to connect to memcached and wait for the result
(synchronously), then return the corresponding code right inside the
handler function of the access phase?

Of course it is possible; you just have to use or implement a memcached
client.

But doing it synchronously will hurt nginx performance.
You need to implement an asynchronous memcached client.

See the asynchronous DNS resolver for an example of how to do it (but
it is rather complex).

Manlio P.

Is the DNS resolver module available in nginx so that I can check it?
The main point of my workflow is that everything should be done in the
access phase, so asynchronous is useless in this case, isn’t it?

When you said memcached client, doesn’t that mean doing my own network
programming inside nginx? I would still prefer to use code already in
nginx, to avoid risks if there are any.

Huy Phan wrote:

Is the DNS resolver module available in nginx so that I can check it?

It is not a module.
See src/core/ngx_resolver.h|c

The main point of my workflow is that everything should be done in the
access phase, so asynchronous is useless in this case, isn’t it?

No, it is not useless.
Asynchronous means that you don’t block the whole Nginx process while
waiting for data from the memcached server.

However, if the memcached server is on localhost, and since the data you
exchange is small, it should be safe to use a socket in blocking mode.

When you said memcached client, doesn’t that mean doing my own network
programming inside nginx?

Yes.
But you can of course use an existing memcached client, if it is
implemented as a C library.
See http://code.google.com/p/memcached/wiki/Clients

The new libmemcached also seems to support an asynchronous interface.

But, really, implementing a client that only does a get request is not
hard, if you use a socket in blocking mode.

See the source code of the ngx_http_memcached_module.
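
For illustration only (this is not the module’s code, and mc_check_key is a
made-up name), a minimal blocking get using the memcached text protocol
could look roughly like this, assuming memcached listens on 127.0.0.1:11211
and the stored value is a single '0' or '1':

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Returns 1 if the key exists and its value starts with '1',
 * 0 if it is missing or different, -1 on error. */
static int
mc_check_key(const char *key)
{
    int                 fd;
    ssize_t             n;
    char                req[256], buf[4096], *p;
    struct sockaddr_in  sa;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(11211);
    sa.sin_addr.s_addr = inet_addr("127.0.0.1");

    if (connect(fd, (struct sockaddr *) &sa, sizeof(sa)) == -1) {
        close(fd);
        return -1;
    }

    n = snprintf(req, sizeof(req), "get %s\r\n", key);

    if (write(fd, req, n) != n) {
        close(fd);
        return -1;
    }

    /* A complete client would read until "END\r\n"; a single read is
     * enough for a small value on localhost. */
    n = read(fd, buf, sizeof(buf) - 1);
    close(fd);

    if (n <= 0) {
        return -1;
    }

    buf[n] = '\0';

    /* Reply is either "END\r\n" (miss) or
     * "VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n" (hit). */
    if (strncmp(buf, "VALUE ", 6) != 0) {
        return 0;
    }

    p = strstr(buf, "\r\n");
    if (p == NULL) {
        return -1;
    }

    return (p[2] == '1') ? 1 : 0;
}

An access handler could then return NGX_OK when this yields 1 and
NGX_HTTP_FORBIDDEN otherwise.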

I would still prefer to use code already in nginx, to avoid risks if
there are any.

There are no risks.

And you can’t use the code from the existing memcached module, because
it is implemented so that the data returned by a get request to a
memcached server is sent directly to the HTTP client.

[…]

Manlio P.

However, if the memcached server is on localhost, and since the data
you exchange is small, it should be safe to use a socket in blocking mode.

A few months ago I wrote a basic Nginx module using both the newer
libmemcached library and the older libmemcache by Sean C. to get
a value from memcached and set an Nginx variable. Both were blocking
(i.e. didn’t use an asynchronous approach), and I only tested them on a
local installation of memcached. The quicker of the two was the older
libmemcache library (by quite a lot, actually), probably because the
code itself is simpler.

Typical results with simple testing (using ApacheBench and httperf)
were that getting a variable using libmemcache resulted in being able to
serve, at best, 50-60% of the requests per second possible with a simple
filesystem-based lookup. With libmemcached it was typically around 30-35%
of the rate for looking up a page on the filesystem.

The figures were something like:

4700 req/s for serving a file from disk
2500 req/s for doing a simple get from memcached and setting an Nginx
variable to the result - using libmemcache
1400 req/s for doing the above, but with libmemcached

These are off the top of my head, but I think they’re more or less
right.

This was a blocking approach, though, of course. A non-blocking
asynchronous get would give much better results.

When you said memcached client, doesn’t that mean doing my own network
programming inside nginx?

Yes.
But you can of course use an existing memcached client, if it is
implemented as a C library.
See http://code.google.com/p/memcached/wiki/Clients

The new libmemcached also seems to support an asynchronous interface.
If the load on your server is going to be high, then I’d recommend
implementing it with the async interface, even though it would be a bit
more complicated to set up.
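
As a rough sketch (not my module’s code; check_key is an invented name and
memcached is assumed to run on localhost:11211), the blocking variant with
libmemcached looks roughly like this. Note that MEMCACHED_BEHAVIOR_NO_BLOCK
only makes the library use non-blocking socket I/O internally;
memcached_get() still waits for the reply, so from nginx’s point of view
the call is still blocking:

#include <stdlib.h>
#include <string.h>
#include <libmemcached/memcached.h>

/* Returns 1 if the key's value starts with '1', 0 otherwise. */
static int
check_key(const char *key)
{
    int                  allowed = 0;
    char                *value;
    size_t               value_length;
    uint32_t             flags;
    memcached_st        *memc;
    memcached_return_t   rc;

    /* in a real module memc would be created once and reused,
       to keep a persistent connection */
    memc = memcached_create(NULL);
    memcached_server_add(memc, "localhost", 11211);
    memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_NO_BLOCK, 1);

    value = memcached_get(memc, key, strlen(key),
                          &value_length, &flags, &rc);

    if (rc == MEMCACHED_SUCCESS && value != NULL) {
        allowed = (value[0] == '1');
        free(value);
    }

    memcached_free(memc);

    return allowed;
}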

Good luck,

Marcus.

Hi Manlio,

Thank you for your quick reply :)
I’m checking the source code of some memcached clients from
Google Code. Hope I can find something there.
In fact I’ve already used the code of ngx_http_memcached_module
(using upstream) for my workflow, and it works, but not in the
access phase.
I still don’t get the point when you said we can use asynchronous
requests in the access phase, because the code is something like this:

ngx_int_t rc;

static ngx_int_t
ngx_http_mymodule_handler(ngx_http_request_t *r)
{
    ngx_http_mymodule_request_init(r);

    /* Do something to let this function wait here
       until the callback is done */

    /* Now I have the result */
    if (rc) {
        return NGX_HTTP_OK;
    }

    return NGX_HTTP_FORBIDDEN;
}

void
ngx_http_mymodule_callback_function(ngx_http_request_t *r)
{
    /* leave the result in rc */
}

How can the handler function wait for the result of the callback
function in this case?

Let me go into more detail in the code. I used libmemcached and
implemented it like this:
static ngx_int_t
ngx_http_memcached_token_handler(ngx_http_request_t *r)
{
    /* Some preparation code */

    memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);

    size_t    value_length;
    uint32_t  flags;

    value = memcached_get(memc, key, strlen(key),
                          &value_length, &flags, &rc1);

    if (strcmp(value, "keyvalu1e") == 0) {
        return NGX_OK;
    }

    return NGX_HTTP_FORBIDDEN;
}

As far as I read on Google, the line
memcached_behavior_set(memc, MEMCACHED_BEHAVIOR_BUFFER_REQUESTS, 1);
enables async requests, and the code works properly.

But I still think that the code above is in blocking mode.
What do you think?

Another question: is there a way to return NGX_AGAIN in the handler
(access phase) and get the engine to call it back later? In that case we
could use upstream with the callback mechanism.


Marcus C. wrote:

code itself is simpler.

So I suspect that a self-written client should be even quicker.
And you can also use UDP.

Did you use a persistent TCP connection?

[…]

Manlio P.

Manlio P. wrote:

tested them on a local installation of memcached. The quicker of the
two was the older libmemcache library (by quite a lot, actually),
probably because the code itself is simpler.

So I suspect that a self-written client should be even quicker.

Yes, I would have thought so too, especially if you’re only connecting
to one memcached server.

Did you use a persistent TCP connection?

Yes.

And you can also use UDP.

I didn’t try this.

Overall, the performance was clearly most affected by the fact that it
was blocking. An async version, I think, would show much better results,
and closer together, because when blocking the performance hit is
multiplicative, but when async it would be linear.

Marcus.

Huy Phan wrote:

I still don’t get the point when you said we can use asynchronous
requests in the access phase, because the code is something like this:

[…]

How can the handler function wait for the result of the callback
function in this case?

It should be possible.
The access phase “manager” seems to support return values of NGX_AGAIN
and NGX_DONE.

When one of these values is returned, Nginx will not try to execute the
next handler.

Unfortunately there are no access phase modules that do asynchronous I/O,
so you should do some tests.

I suspect that you should call ngx_http_handler when the Nginx
event module notifies you that the connection to the memcached server is
ready for reading.
Nginx will then call your handler again.

But I’m not sure.
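
Roughly, the pattern could look like this (all names here are hypothetical,
including the ngx_http_mymodule_start_get helper, and whether
ngx_http_handler is the right re-entry point needs testing, as said above):

/* Hypothetical per-request context. */
typedef struct {
    unsigned    done;      /* the memcached reply has arrived            */
    ngx_int_t   status;    /* NGX_OK to allow, NGX_HTTP_FORBIDDEN to deny */
} ngx_http_mymodule_ctx_t;

static ngx_int_t
ngx_http_mymodule_handler(ngx_http_request_t *r)
{
    ngx_http_mymodule_ctx_t  *ctx;

    ctx = ngx_http_get_module_ctx(r, ngx_http_mymodule);

    if (ctx == NULL) {
        ctx = ngx_pcalloc(r->pool, sizeof(ngx_http_mymodule_ctx_t));
        if (ctx == NULL) {
            return NGX_ERROR;
        }

        ngx_http_set_ctx(r, ctx, ngx_http_mymodule);

        /* start the non-blocking memcached request; hypothetical helper
           that connects and registers read/write event handlers */
        if (ngx_http_mymodule_start_get(r) != NGX_OK) {
            return NGX_ERROR;
        }

        return NGX_AGAIN;    /* don't run the next handler yet */
    }

    if (!ctx->done) {
        return NGX_AGAIN;    /* reply not in yet */
    }

    return ctx->status;
}

/* Called from the read event handler once the reply has been parsed. */
static void
ngx_http_mymodule_done(ngx_http_request_t *r, ngx_int_t status)
{
    ngx_http_mymodule_ctx_t  *ctx;

    ctx = ngx_http_get_module_ctx(r, ngx_http_mymodule);

    ctx->done = 1;
    ctx->status = status;

    ngx_http_handler(r);     /* re-enter the phase engine */
}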

[…]

Manlio

I meant to say additive, not linear.

Marcus.

Marcus C. wrote:

A few months ago I wrote a basic Nginx module using both the newer
libmemcached library and the older libmemcache by Sean C. to get
a value from memcached and set an Nginx variable. Both were blocking
(i.e. didn’t use an asynchronous approach), and I only tested them on a
local installation of memcached. The quicker of the two was the older
libmemcache library (by quite a lot, actually), probably because the
code itself is simpler.

Hi Marcus,

Would it be possible to release the code for this module? A while ago I
wrote some extensions for the original Nginx memcached module
(http://www.ruby-forum.com/topic/160379), so I’m highly interested in
this subject.

Thanks,

Matthijs

Bumping an old thread here, but I recently stumbled over a similar
issue:
I want to do stuff depending on whether a key exists in memcached or not,
something in the style of

if ($key) {
    return 404;
} else {
    proxy_pass http://foo.bar;
}

…which is very similar to the above. Has anyone solved this yet?

Thanks,
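
For what it’s worth, stock nginx can get close to this without a custom
module by letting the memcached module answer first and falling back to
the backend on a miss. Note this serves the cached value on a hit rather
than returning 404, so it only approximates the logic above; the backend
name is made up:

location / {
    set $memcached_key $uri;
    memcached_pass 127.0.0.1:11211;

    # on a miss (or memcached error), hand the request to the backend
    error_page 404 502 504 = @fallback;
}

location @fallback {
    proxy_pass http://foo.bar;
}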