So is "rewrite_by_lua" also evil?

On Fri, Oct 14, 2011 at 12:10 AM, Nginx U. [email protected]
wrote:

Rebuilt with 0.31rc11. Also noticed note about, and implemented,
module loading order.
Everything seems straight as an arrow and no evil going ons detected so far.

Great!

But please note that this is really a bug in the nginx core, so what
I’m doing in ngx_lua’s ngx.exec() (as well as ngx_echo’s echo_exec) is
just a work-around. If you use other module to issue the jump to a
named location, bad things could still happen (or in your words, evil
stuffs). For example, ngx_upload’s upload_pass directive or the core
module’s post_action directive.

I’ve already posted a patch for the nginx core to the nginx-devel
mailing list, which fixes the issue completely:

http://mailman.nginx.org/pipermail/nginx-devel/2011-October/001327.html

That patch can also be applied cleanly to nginx 1.0.8 (at least).

BTW, this patch has already been incorporated into the ngx_openresty
1.0.8.5 devel release: OpenResty® - Open source

Regards,
-agentzh

On 14 October 2011 03:36, agentzh [email protected] wrote:

just a work-around. If you use other module to issue the jump to a
named location, bad things could still happen (or in your words, evil
stuffs). For example, ngx_upload’s upload_pass directive or the core
module’s post_action directive.

I’ve already posted a patch for the nginx core to the nginx-devel
mailing list, which fixes the issue completely:

[PATCH] clear modules' contexts in ngx_http_named_location

That patch can also be applied cleanly to nginx 1.0.8 (at least).

Strange, looking at the message from Maxim to Igor, that this hadn’t
been changed before.
If you refer to the message I sent just before yours, you will find I
am still seeing this behaviour for some requests and that I rolled
everything back to my last known working config.

Perhaps the work around need to go into some other one of the modules I
use.

I’ll look into applying the patch to the 1.0.8 core directly and retry
later.

BTW, this patch has already been incorporated into the ngx_openresty
1.0.8.5 devel release: OpenResty® - Open source

Nice.

I will be building openresty rpms for redhat presently. Wanted to try
some of the elements first.

On 14 October 2011 14:01, Eugaia [email protected] wrote:

break the code. I would say it should probably just be left up to the
modules to reset their own contexts if they need to (as is the case now), or
to just define multiple modules which would use different contexts anyway.

That throws a different spot light on things.

I don’t know about the guts of Nginx but I suppose the main point
would be for things to work consistently. I.E. if this context is
cleared in some places in the core (normal locations), then, it should
be cleared in all similar places (all locations) except of course if
there is specific good reason not to. Module developers should have a
clear understanding of this “feature”.

Think I will go into holding mode with the standard rewrite module
(which works without issue) for now.
Just that lua allows for more complex logic for my tests.

Thanks for your efforts chaps.

On 14 October 2011 03:16, Nginx U. [email protected] wrote:

Unfortunately, I am still seeing the “0 0” outputs in my logs on (only
some) requests. I had noticed that some pages just seemed to have the
“loading” wheel spin for ever and on looking, I found the instances.

It is not affecting standard php requests anymore from what I can see
but it seems some Ajax type requests where php is being used for
dynamic js (! need to look closer into how I set those up) and with
phpMyAdmin which is aliased somewhere.
Looking at things further, it may be a mime type issue clouding things.
My js files use the non-standard but widely adopted “text/javascript”
while Nginx comes with the standard but not always supported
“application/X-javascript” mime type.
Not sure I changed the mimes types file accordingly when I went to
1.0.8.

The phpMyadmin one is funny though as all it does is use an alias.

On 14 October 2011 16:27, Nginx U. [email protected] wrote:

Looking at things further, it may be a mime type issue clouding things.
My js files use the non-standard but widely adopted “text/javascript”
while Nginx comes with the standard but not always supported
“application/X-javascript” mime type.
Not sure I changed the mimes types file accordingly when I went to 1.0.8.

The phpMyadmin one is funny though as all it does is use an alias.

OK. I went back to nginx1.0.8 and 0.31rc11, made sure everything was
proper and made sure I only used rewrite by lua to trigger the jump

server {

location @pretty_urls {
# some rewrite rules to php
}
location @proxy {
include /etc/nginx/firewall.default;
proxy_pass http://127.0.0.1:8080;

}
location ~ ^.+.php$ {
content_by_lua ‘ngx.exec(“@proxy”);’;
}
location / {
try_files $uri $uri/ @pretty_urls;
}
}

Everything works perfectly and not a hint of evil behaviour. Not sure
what I missed before but I am now as happy as Larry.

Now I can implement the second stage of my WAF implementation plan
where firewall.default would query PHPIDS (https://phpids.org/)
instead of my homebrew regexes.

On 14 October 2011 22:24, Nginx U. [email protected] wrote:

location ~ ^.+.php$ {
content_by_lua ‘ngx.exec(“@proxy”);’;
}
That should read

  location ~ ^.+\.php$ {
          rewrite_by_lua 'ngx.exec("@proxy");';
  }

On 13/10/2011 07:55, agentzh wrote:

Well, I still believe it is a bug in ngx_http_named_location function
in the nginx core. The thumb of rule is to avoid using named locations
like @proxy but use normal locations configured with the “internal”
directive. And we’ve been keeping doing this in our production apps.
In some situations it’s probably the desired effect to reset the
contexts, but not all. There are multiple situations in modules I’ve
developed / am developing where resetting contexts would be highly
undesirable and would break the code. I would say it should probably
just be left up to the modules to reset their own contexts if they need
to (as is the case now), or to just define multiple modules which would
use different contexts anyway.

Marcus.

Additional feedback to agentzh:

Everything seems to be working fine. However, a small configuration
error showed a potential problem in that the rewrite_by_lua directive
does not seem to take account of the rewrite module’s “last” flag.

Contrived example:

GET /sometext/xyz.html

server {

location @pretty_urls {
rewrite ^/sometext/xyz.html$ /abc.html last;

          # line below included by mistake and shouldn't have been 

there
# On the other hand, the hit above should mean it is not
called anyway
rewrite_by_lua ‘ngx.exec(“@proxy”);’;
}
location @proxy {
include /etc/nginx/firewall.default;
proxy_pass http://127.0.0.1:8080;

}
location ~ ^.+.php$ {
content_by_lua ‘ngx.exec(“@proxy”);’;
}
location / {
try_files $uri $uri/ @pretty_urls;
}
}

… generates an error along the lines of …

2011/10/14 20:10:26 [error] 17325#0: *16 lua handler aborted: runtime
error: attempt to call a nil value, blah, blah, blah".

As said, it was a misconfig but I would have expected that with the
earlier hit with the rewrite module, all further rewrites, including
ngx_lua’s should stop.

Cheers

On Fri, Oct 14, 2011 at 8:16 AM, Nginx U. [email protected] wrote:

Unfortunately, I am still seeing the “0 0” outputs in my logs on (only
some) requests. I had noticed that some pages just seemed to have the
“loading” wheel spin for ever and on looking, I found the instances.

Would you mind sharing the debug logs for such bad responses? I’d have
a closer look at these :slight_smile:

Thanks!
-agentzh

On 15 October 2011 04:27, agentzh [email protected] wrote:

On Fri, Oct 14, 2011 at 8:16 AM, Nginx U. [email protected] wrote:

Unfortunately, I am still seeing the “0 0” outputs in my logs on (only
some) requests. I had noticed that some pages just seemed to have the
“loading” wheel spin for ever and on looking, I found the instances.

Would you mind sharing the debug logs for such bad responses? I’d have
a closer look at these :slight_smile:

I stopped getting these responses. I rolled back to my 1.0.6 install
and later redid the move the 1.0.8 with all the components more
carefully and it stopped at that point.

On Sat, Oct 15, 2011 at 4:41 AM, Nginx U. [email protected] wrote:

  rewrite ^/sometext/xyz\.html$ /abc.html last;

}

Interesting. But I cannot reproduce this issue with the following
small example on my side:

location @pretty {
    rewrite ^/main/xyz\.html$ /abc.html last;
    rewrite_by_lua 'ngx.exec("@proxy")';
}
location @proxy {
    echo hello;
}
location /abc.html {
    echo abc;
}
location /main {
    try_files $uri $uri/ @pretty;
}

Then GET /main/xyz.html yields the output “abc” as expected and I
don’t see that ngx_lua error message in my error.log.

Could you show me the error.log with --with-debug + debug log level?

Regards,
-agentzh

On Sat, Oct 15, 2011 at 4:46 PM, Nginx U. [email protected] wrote:

rewrite ^/(abc|forum|xyz)/?$ “/index.php?pretty;action=$1” last;
Okay, I must say that ngx_rewrite’s “last” won’t affect rewrite_by_lua
because rewrite_by_lua is not part of that module (and there’s no
known way to achieve that).

So it’s recommended to do URL rewrite all in rewrite_by_lua. Some
people has pointed out that ngx.exec() is a little heavy as compared
to ngx_rewrite’s rewrite directive. I may implement an ngx.rewrite()
function for ngx_lua in the near future and it’s an easy hack :slight_smile:

But set_by_lua will be affected because that directive is injected
into ngx_rewrite’s instruction list by means of ngx_devel_kit’s
set_var submodule.

Regards,
-agentzh

On 15 October 2011 09:36, agentzh [email protected] wrote:

location /abc.html {
echo abc;
}
location /main {
try_files $uri $uri/ @pretty;
}

Then GET /main/xyz.html yields the output “abc” as expected and I
don’t see that ngx_lua error message in my error.log.

Could you show me the error.log with --with-debug + debug log level?

http://pastebin.com/estie4W8

GET /forum/

server {
root /home/user/testsite.com/public_html/test_app

location @pretty_urls {
# A number of rewrite rule. Hit line replicated

rewrite ^/(abc|forum|xyz)/?$ “/index.php?pretty;action=$1” last;

   rewrite_by_lua 'ngx.exec("@proxy");';

}
location @proxy {
include /etc/nginx/firewall.default;
proxy_pass http://127.0.0.1:8080;

}
location ~ ^.+.php$ {
rewrite_by_lua ‘ngx.exec(“@proxy”);’;
}
location / {
try_files $uri $uri/ @pretty_urls;
}
}

On 15 October 2011 12:40, agentzh [email protected] wrote:


rewrite ^/(abc|forum|xyz)/?$ “/index.php?pretty;action=$1” last;

Okay, I must say that ngx_rewrite’s “last” won’t affect rewrite_by_lua
because rewrite_by_lua is not part of that module (and there’s no
known way to achieve that).
I guessed as much. I didn’t need that line there in this case and if
neeed, “try_files /dummy/ @proxy” would do the job.
I like ngx_lua because it allows me to jump locations. Needs hacks to
do this with the rewrite module. Either the try_files hack above or
mapping an obscure error code to a location and using “return code”.

All in all, I have come to realise that ngx_lua is slightly evil in
that unexpected things can happen. As with “if” however, evil
understood is evil avoided :slight_smile:

Any thoughts on Marcus’ points earlier on clearing the ctx in the core?
Seems to suggest that may cause problems elsewhere and that it is
better to be handled at the module level.
Seems logical as whether it was originally an oversight or not,
changing it now would break other stuff.

On Sat, Oct 15, 2011 at 5:57 PM, Nginx U. [email protected] wrote:

All in all, I have come to realise that ngx_lua is slightly evil in
that unexpected things can happen. As with “if” however, evil
understood is evil avoided :slight_smile:

The ngx_lua module won’t play seamlessly with ngx_rewrite because
ngx_rewrite does use internal hacks and tricks to make things work. So
I’d recommend use Lua to do as much as possible and avoid using
ngx_rewrite’s directives but simple “set”.

Any thoughts on Marcus’ points earlier on clearing the ctx in the core?

Well, there’re indeed cases that we do want to retain module contexts
during a jump but I’ve already managed to “abuse” the post_subrequest
mechanism to save the pointer to my module’s context and later restore
it in my output header filter, for example. (See how I do this in my
ngx_lua module for ngx.location.capture.)

But on the other hand, if I do want to clear my module’s context
before a internal redirection, then there’s no easy way to do that
myself.

Seems to suggest that may cause problems elsewhere and that it is
better to be handled at the module level.

No, this is impossible to handle that fully on the module level. For
example, if you do use ngx_upload module’s upload_pass instead of
ngx_lua or ngx_echo to do the jump, then it’ll still break
rewrite_by_lua and access_by_lua (at least).

Seems logical as whether it was originally an oversight or not,
changing it now would break other stuff.

I doubt it :slight_smile: Unless one only jump to named locations and never to
normal locations. The behaviors here are already inconsistent and very
hard to work around.

Regards,
-agentzh

On 15/10/2011 12:40, agentzh wrote:

Okay, I must say that ngx_rewrite’s “last” won’t affect rewrite_by_lua
because rewrite_by_lua is not part of that module (and there’s no
known way to achieve that).
You could achieve that now by creating a dummy variable, and instead of
using a new handler for the rewrite by code, you wrap the code around
the setby lua code. It would waste a few cycles, since it would be
setting a variable unnecessarily, but would have the benefit of not
slowing down the execution of code that didn’t use rewrite_by, since it
wouldn’t be adding an extra handler to process for all the other
requests.

A better way would be to extend the NDK and expose a generic
rewrite-phase, script engine interface that other scripting functions
(that are more than just set_xxx) can use, and to use that instead.
You’d then not be wasting cycles on setting a variable unnecessarily.
It’d be mostly similar to the set_var stuff, but would need to take the
normal ‘code’-type functions rather than the NDK setvar function types.

Cheers,

Marcus.

On Sat, Oct 15, 2011 at 6:16 PM, Eugaia [email protected] wrote:

execution of code that didn’t use rewrite_by, since it wouldn’t be adding an
extra handler to process for all the other requests.

I do not fully grok your approach but I’d ask first: will this
approach allow subrequests, internal redirection, non-blocking sleep,
and all the other fancy stuffs?

Regards,
-agentzh

On 15/10/2011 12:57, Nginx U. wrote:

Any thoughts on Marcus’ points earlier on clearing the ctx in the core?
Seems to suggest that may cause problems elsewhere and that it is
better to be handled at the module level.
Seems logical as whether it was originally an oversight or not,
changing it now would break other stuff.
I had a closer look at the code, and in fact clearing the contexts
automatically won’t break my code in particular because I don’t use
ngx_http_named_location() to find named locations (I find the location
at conf-read time, not run time, since it’s more efficient). There may
be other developers that use the function at runtime, but my guess is
that it’s probably not commonly used by module developers who would not
prefer the contexts to be reset.

I would therefore agree with agentzh and Maxim about it probably being
the best way to go, and should any developers create modules that
currently call ngx_http_named_location() but don’t want to clear the
contexts, then they can just re-write the function to not do so.

Marcus.

On 15 October 2011 13:25, Eugaia [email protected] wrote:

ngx_http_named_location() to find named locations (I find the location at
conf-read time, not run time, since it’s more efficient). There may be
other developers that use the function at runtime, but my guess is that it’s
probably not commonly used by module developers who would not prefer the
contexts to be reset.

I would therefore agree with agentzh and Maxim about it probably being the
best way to go, and should any developers create modules that currently call
ngx_http_named_location() but don’t want to clear the contexts, then they
can just re-write the function to not do so.

Great.

I’ll therefore look into applying agentzh’s patch then. I had held off
because your previous comment.

On Sat, Oct 15, 2011 at 6:43 PM, Eugaia [email protected] wrote:

If you were just wrapping around set_var, then no. It would only be able to
handle non-reentrant code, just like setby.

Gathered so :wink: I asked because I didn’t want to miss any quick magic :wink:

In order to have reentrant code, it would need to have a generic layer that
would be able to jump in/out of the script engine (and handle re-calling the
script engine post subrequest etc). The rewrite handler would also need to
be re-written to handle re-entering the script engine.

The advantage of having a generic rewrite re-entrant layer, though, would be
that set_var would also be able to accept subrequests etc (assuming you used
that to base setby on).

I’ve been waiting for this feature for more than a year now :wink:

Thanks!
-agentzh