Ngx_lua location capture issue

I tried to create the following scenario:

  1. Request test_page.php
  2. lua exec to @checkpoint
  3. @checkpoint does capture location to test_loc (future phpids)
  4. test_loc/index.php returns either 200 or 403 status
  5. @checkpoint continues or halts request accordingly

GET /test_page.php

server {
    listen 80;
    server_name testsite.com;
    root /home/user/testsite.com/public_html;

    location @checkpoint {
        access_by_lua '
            local res = ngx.location.capture("/test_loc")

            if res.status == ngx.HTTP_OK then
                return
            end

            if res.status == ngx.HTTP_FORBIDDEN then
                ngx.exit(res.status)
            end

            ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
        ';
    }

    location @proxy {
        include /etc/nginx/firewall.default;

        # Block IPs in Spamhaus drop list
        if ($block = 1) {
            return 444;
        }

        proxy_pass http://127.0.0.1:8080;
        ...
    }

    location /error_docs {
        internal;
        alias /home/user/$host/error_docs;
    }

    location /test_loc {
        internal;
        alias /usr/share/test_loc/;
        rewrite_by_lua 'ngx.exec("@proxy");';
    }

    location / {
        try_files $uri $uri/ @proxy;
    }

    location ~ .+\.php$ {
        location ~ ^/test_page\.php$ {
            rewrite_by_lua 'ngx.exec("@checkpoint");';
        }

        rewrite_by_lua 'ngx.exec("@proxy");';
    }
}

Result is “the http output chain is empty while connecting to
upstream” blah blah blah and output is equivalent to issuing “return
444”. http://pastebin.com/7WisVBDU

Logs seem to show “GET /test_page.php” being run a second time.

Any tips on fixing?

Cheers

On 18 October 2011 19:55, Nginx U. [email protected] wrote:

A more considered read of the docs shows I had been mixing subrequests
and internal redirections all over the place, and that this works as
expected:

GET /test_page.php

server {
    listen 80;
    server_name testsite.com;
    root /home/user/testsite.com/public_html;

    location @proxy {
        include /etc/nginx/firewall.default;

        # Block IPs in Spamhaus drop list
        if ($block = 1) {
            return 444;
        }

        proxy_pass http://127.0.0.1:8080;
        ...
    }

    location /error_docs {
        internal;
        alias /home/user/$host/error_docs;
    }

    location /test_loc {
        internal;
        alias /usr/share/test_loc/;
        rewrite_by_lua 'ngx.exec("@proxy");';
    }

    location / {
        try_files $uri $uri/ @proxy;
    }

    location ~ .+\.php$ {
        location ~ ^/test_page\.php$ {
            access_by_lua '
                local res = ngx.location.capture("/phpids")
                if res.status == ngx.HTTP_OK then
                    ngx.exec("@proxy_no_cache")
                end
                if res.status == ngx.HTTP_FORBIDDEN then
                    ngx.exit(res.status)
                end
                ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
            ';
        }

        rewrite_by_lua 'ngx.exec("@proxy");';
    }
}

No need for @checkpoint. Just put the access check directly in place.

Cool beans!

On Wed, Oct 19, 2011 at 12:55 AM, Nginx U. [email protected]
wrote:

location /test_loc {
internal;
alias /usr/share/test_loc/;
rewrite_by_lua 'ngx.exec("@proxy");';

Could you please try using content_by_lua here instead of
rewrite_by_lua? I want to confirm that it is indeed a bug in
rewrite_by_lua rather than content_by_lua :)

}
location / {
try_files $uri $uri/ @proxy;
}
location ~ .+\.php$ {
location ~ ^/test_page\.php$ {
rewrite_by_lua 'ngx.exec("@checkpoint");';

Same here :)

Result is “the http output chain is empty while connecting to
upstream” blah blah blah and output is equivalent to issuing “return
444”. http://pastebin.com/7WisVBDU

Thanks for the debug log. It helped a lot :)

Any tips on fixing?

Please try content_by_lua instead of rewrite_by_lua to call
ngx.exec(). It can help me fix the issue here :)

Thanks for the report!
-agentzh

On Wed, Oct 19, 2011 at 1:28 AM, Nginx U. [email protected] wrote:

A more considered read of the docs shows I had been mixing subrequests
and internal redirections all over the place, and that this works as
expected

Mixing subrequests and internal redirections should be fine for recent
releases of ngx_lua. (I’ve found a way to make ngx_lua’s context
survive an internal redirection! :D)

Which part of the documentation is giving you this invalid
implication? I’d fix the doc :)

Thanks!
-agentzh

On Wed, Oct 19, 2011 at 10:33 AM, agentzh [email protected] wrote:

On Wed, Oct 19, 2011 at 12:55 AM, Nginx U. [email protected] wrote:

location /test_loc {
internal;
alias /usr/share/test_loc/;
rewrite_by_lua 'ngx.exec("@proxy");';

Could you please try using content_by_lua here instead of
rewrite_by_lua? I want to confirm that it is indeed a bug in
rewrite_by_lua rather than content_by_lua :)

I think I’ve already fixed this issue in the ngx_lua v0.3.1rc17 release:

https://github.com/chaoslawful/lua-nginx-module/tags

There was a bug in ngx.exec()'s handling when being used within
rewrite_by_lua* and access_by_lua* directives, which could cause
hanging in certain extreme conditions.

Could you please try it out on your side?

Thanks for the report!
-agentzh

P.S. This new version of ngx_lua has also been included in the latest
ngx_openresty 1.0.8.15 devel release here:

http://openresty.org/#Download

On Wed, Oct 19, 2011 at 1:59 AM, Nginx U. [email protected] wrote:

           if res.status == ngx.HTTP_FORBIDDEN then

access by lua within the sub location. Changing the access by lua in
the sub location to rewrite by lua executes the script.

The inner location (i.e., location "^/test_page\.php$") automatically
inherits the rewrite_by_lua directive defined in the outer location
(i.e., location ".+\.php$"). So your rewrite_by_lua code runs before
your access_by_lua code in the inner location. Furthermore, your
rewrite_by_lua code does an internal redirection, and thus your
access_by_lua never has a chance to run.

Apparently you do not want the inner location to inherit the outer
rewrite_by_lua setting, so just add the following line to your inner
location:

rewrite_by_lua return;

This will override the rewrite_by_lua directive in the outer scope.
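
As a rough, untested sketch of that override: the fragment below reuses the location names from the config earlier in this thread and assumes the /phpids and @proxy targets exist as they do there. The inner location replaces the inherited rewrite_by_lua with a no-op, so the request can reach the access phase:

```nginx
location ~ .+\.php$ {
    location ~ ^/test_page\.php$ {
        # Override the rewrite_by_lua inherited from the outer location
        # with a no-op so that the access phase gets a chance to run.
        rewrite_by_lua 'return';

        access_by_lua '
            local res = ngx.location.capture("/phpids")
            if res.status == ngx.HTTP_OK then
                return ngx.exec("@proxy")
            end
            if res.status == ngx.HTTP_FORBIDDEN then
                return ngx.exit(res.status)
            end
            return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
        ';
    }

    # Still applies to every other .php request.
    rewrite_by_lua 'ngx.exec("@proxy");';
}
```

Writing 'return' in quotes is equivalent to the unquoted form shown above; the quotes only make it easier to see that the argument is Lua code.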

In any case, my original config is the one I would prefer to get working, as I
need the request params to be passed along the redirection chain.

See my last reply :)

Better wait until agentzh wakes up in his timezone.

10:41 AM here already ;)

Best,
-agentzh

On 19 October 2011 05:49, agentzh [email protected] wrote:

Which part of the documentation is giving you this invalid
implication? I’d fix the doc :)

Hi. I didn’t mean to imply that the docs said there was anything wrong
with this, just that they identified some items as internal
redirections and others as subrequests.

I started wondering, all by myself, whether there was an implication
… with post requests for instance.

Jumped the gun there. There’s a sting in the tail.

GET /test_page.php

location ~ .+\.php$ {
    location ~ ^/test_page\.php$ {
        access_by_lua '
            local res = ngx.location.capture("/phpids")
            if res.status == ngx.HTTP_OK then
                ngx.exec("@proxy")
            end
            if res.status == ngx.HTTP_FORBIDDEN then
                ngx.exit(res.status)
            end
            ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
        ';
    }

    rewrite_by_lua 'ngx.exec("@proxy");';
}

Will result in the rewrite by lua line being executed instead of the
access by lua within the sub location. Changing the access by lua in
the sub location to rewrite by lua executes the script.

In any case, my original config is the one I would prefer to get working, as I
need the request params to be passed along the redirection chain.

Better wait until agentzh wakes up in his timezone.

On 19 October 2011 12:24, agentzh [email protected] wrote:

I think I’ve already fixed this issue in the ngx_lua v0.3.1rc17 release:

https://github.com/chaoslawful/lua-nginx-module/tags

There was a bug in ngx.exec()'s handling when being used within
rewrite_by_lua* and access_by_lua* directives, which could cause
hanging in certain extreme conditions.

Could you please try it out on your side?

I’ll have a go in due course. I have dumped the config originally
posted. Will look into reconfiguring along those lines later.

In the meantime, can you help out with my fallback?

I have a set of about 70 regexes that are run using access_by_lua_file
after finding out that access_by_lua has a string length limit
(undefined).

I have applied the escaping sequences and they generally work but
these ones generate errors:

– 42
– original
“(?:”\sor\s"?\d)|(?:\x(?:23|27|3d))|(?:^.?“$)|(?:(?:^[”\](?:[\d"]+|[^“]+”))+\s(?:n?and|x?or|not||||&&)\s*[\w"[+&!@(),.-])|(?:[^\w\s]\w+\s*[|-]\s*“\s*\w)|(?:@\w+\s+(and|or)\s*[”\d]+)|(?:@[\w-]+\s(and|or)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*“.)|(?:\Winformation_schema|table_name\W)”
local query_string =
ngx.re.match(ngx.var.request_uri,“(?:”\\sor\\s"?\\d)|(?:\\\x(?:23|27|3d))|(?:^.?“$)|(?:(?:^[”\\\](?:[\\d"]+|[^“]+”))+\\s(?:n?and|x?or|not|\|\||\&\&)\\s*[\\w"[+&!@(),.-])|(?:[^\\w\\s]\\w+\\s*[|-]\\s*“\\s*\\w)|(?:@\\w+\\s+(and|or)\\s*[”\\d]+)|(?:@[\\w-]+\\s(and|or)\\s*[^\\w\\s])|(?:[^\\w\\s:]\\s*\\d\\W+[^\\w\\s]\\s*“.)|(?:\\Winformation_schema|table_name\\W)”,
“io”)
– 45
– original
“(?:union\s*(?:all|distinct|[(!@])?\s[([]\sselect)|(?:\w+\s+like\s+")|(?:like\s*”%)|(?:“\slike\W[”\d])|(?:“\s*(?:n?and|x?or|not
||||&&)\s+[\s\w]+=\s*\w+\shaving)|(?:"\s*\s*\w+\W+”)|(?:“\s*[^?\w\s=.,;)(]+\s*[(@”]\s\w+\W+\w)|(?:select\s*[[]()\s\w.,“-]+from)|(?:find_in_set\s*()”
local query_string = ngx.re.match(ngx.var.request_uri,
“(?:union\\s*(?:all|distinct|[(!@])?\\s[([]\\sselect)|(?:\\w+\\s+like\\s+\")|(?:like\\s*”\%)|(?:“\\slike\\W[”\\d])|(?:“\\s*(?:n?and|x?or|not
|\|\||\&\&)\\s+[\\s\\w]+=\\s*\\w+\\shaving)|(?:"\\s\*\\s*\\w+\\W+”)|(?:“\\s*[^?\\w\\s=.,;)(]+\\s*[(@”]\\s\\w+\\W+\\w)|(?:select\\s*[\[\]()\\s\\w\.,“-]+from)|(?:find_in_set\\s*\()”,
“io”)

Thanks

On Thu, Oct 20, 2011 at 2:10 AM, Nginx U. [email protected] wrote:

I’ll have a go in due course. I have dumped the config originally
posted. Will look into reconfiguring along those lines later.

Cool :)

In the meantime, can you help out with my fallback?

I have a set of about 70 regexes that are run using access_by_lua_file
after finding out that access_by_lua has a string length limit
(undefined).

It is the nginx config file parser that has a length limit on individual
config directives. Use access_by_lua_file instead and put your Lua code
into an external .lua file, which eliminates the length constraint
altogether.

I have applied the escaping sequences and they generally work but
these ones generate errors:

– 42
– original

“(?:”\sor\s"?\d)|(?:\x(?:23|27|3d))|(?:^.?“$)|(?:(?:^[”\](?:[\d"]+|[^“]+”))+\s(?:n?and|x?or|not||||&&)\s*[\w"[+&!@(),.-])|(?:[^\w\s]\w+\s*[|-]\s*“\s*\w)|(?:@\w+\s+(and|or)\s*[”\d]+)|(?:@[\w-]+\s(and|or)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*“.)|(?:\Winformation_schema|table_name\W)”

local query_string =

ngx.re.match(ngx.var.request_uri,“(?:”\\sor\\s"?\\d)|(?:\\\x(?:23|27|3d))|

A rule of thumb is that "\" should become "\\\\" (four back-slashes!)
while using access_by_lua and "\\" (two back-slashes!) while using
access_by_lua_file. You’re writing things like "\\\s" (three
back-slashes!), which is weird.

Why not put your Lua code into an external .lua file and use
access_by_lua_file or rewrite_by_lua_file instead? That way you
need to escape each "\" only once.

Regards,
-agentzh
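
Inside an external .lua file, one further way to sidestep the doubling is a Lua long-bracket string, which does no escape processing at all. The sketch below is plain Lua with no ngx API, and its variable names are illustrative only. It shows that the two spellings yield the same pattern text:

```lua
-- Quoted literal: every regex backslash must be doubled for the Lua parser.
local quoted = "(?:>[\\w\\s]*</?\\w{2,}>)"

-- Long-bracket literal: backslashes are taken verbatim, so the regex is
-- written exactly as the regex engine should see it. The [=[ ... ]=] form
-- is used so that a "]]" inside a regex cannot end the string early.
local raw = [=[(?:>[\w\s]*</?\w{2,}>)]=]

assert(quoted == raw)
print(raw)  -- (?:>[\w\s]*</?\w{2,}>)
```

Either string could then be passed as the pattern argument of ngx.re.match in a file loaded through access_by_lua_file. Inline code in nginx.conf still passes through the nginx parser first, so this only helps in external files.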

On 20 October 2011 03:43, agentzh [email protected] wrote:

On Thu, Oct 20, 2011 at 2:10 AM, Nginx U. [email protected] wrote:

I have a set of about 70 regexes that are run using access_by_lua_file
A rule of thumb is that "\" should become "\\\\" (four back-slashes!)
while using access_by_lua and "\\" (two back-slashes!) while using
access_by_lua_file. You’re writing things like "\\\s" (three
back-slashes!), which is weird.

Why not put your Lua code into an external .lua file and use
access_by_lua_file or rewrite_by_lua_file instead? That way you
need to escape each "\" only once.

I am using an external .lua file and I only get consistent results
when I use "\\\s" etc. "\\s" etc. resulted in several ")" expected near
"|" type messages.

Basically, it’s what seems to work for me.

On Thu, Oct 20, 2011 at 4:17 PM, Nginx U. [email protected] wrote:

Why not put your Lua code into an external .lua file and use
access_by_lua_file or rewrite_by_lua_file instead? That way you
need to escape each "\" only once.

I am using an external .lua file and I only get consistent results
when I use "\\\s" etc. "\\s" etc. resulted in several ")" expected near
"|" type messages.

"\\\s" is essentially equivalent to "\\s" in Lua string literals
because "\s" evaluates to "s".

You can try small examples on your shell (ensure you have the “lua”
interpreter visible in your PATH environment):

$ lua -e 'print("\s" == "s")'
true

$ lua -e 'print("\\\s" == "\\s")'
true

$ lua -e 'print("\\s")'
\s

$ lua -e 'print("\\\s")'
\s

Basically, it’s what seems to work for me.

I suggest you design trivial samples like the ones above to make clear
what is going on here :)

Regards,
-agentzh

On Thu, Oct 20, 2011 at 6:48 PM, Nginx U. [email protected] wrote:

Would the Nginx string literal you mentioned before not then turn
"\\s" into "\s" … which is where I want to be in the end? I suspect I
am missing something in the process.

The Nginx config file parser first parses the Nginx string literal

 'ngx.re.match("\\\\s")'

into Lua code

ngx.re.match("\\s")

and then the Lua parser parses the Lua string literal "\\s" into the
character string

\s

which is the regex pattern that you want.
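
The two unescaping layers just described can be modelled in plain Lua. This is only a sketch: nginx_unquote is a hypothetical helper that imitates the nginx config parser under a simplified assumption (inside a quoted argument, \\ \" and \' lose their backslash, \t \r \n become control characters, and any other backslash pair is kept). lua_unquote hands the resulting literal to the real Lua parser.

```lua
-- Layer 1 (assumed, simplified model of the nginx config parser).
local function nginx_unquote(s)
    local control = { t = "\t", r = "\r", n = "\n" }
    return (s:gsub("\\(.)", function(c)
        if c == "\\" or c == '"' or c == "'" then
            return c
        end
        return control[c] or ("\\" .. c)
    end))
end

-- Layer 2: let the Lua parser evaluate a quoted string literal.
local function lua_unquote(literal)
    local chunk = assert((loadstring or load)("return " .. literal))
    return chunk()
end

-- Text as typed between the single quotes of access_by_lua in nginx.conf.
local conf_text = [[ngx.re.match("\\\\s")]]

local lua_source = nginx_unquote(conf_text)
local pattern = lua_unquote(lua_source:match('%b""'))

print(lua_source)  -- ngx.re.match("\\s")
print(pattern)     -- \s
```

Each layer halves the number of backslashes. That is why four are needed inline in nginx.conf, while two suffice in a file loaded with access_by_lua_file, where only the second layer applies.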

Wish I could just use the familiar "\" and have it figured out in the
background without me having to worry about it as rewrite apparently
does though.

Nginx does have built-in support for regex syntax, as do Perl and
JavaScript, and that’s why you do not have to escape "\" in that
context. Unfortunately, Lua has no such built-in regex syntax
support; the ngx.re API is implemented by ngx_lua and is by no means
a real language extension to Lua.

As I said, try using external .lua file and
content/rewrite/access/set_by_lua_file to avoid nginx string escaping
issues.

Regards,
-agentzh

On 20 October 2011 14:48, agentzh [email protected] wrote:

As I said, try using external .lua file and
content/rewrite/access/set_by_lua_file to avoid nginx string escaping
issues.

Understood. However, when I follow your instructions on this, things
fail. They seem to work my way.

Take this regex for example: (?:^>[\w\s]*</?\w{2,}>)

When I use my “incorrect” escaping in access_by_lua file …

local query_string = ngx.re.match(ngx.var.request_uri,
    "(?:^>[\\w\\s]*<\/?\\w{2,}>)", "io")
-- finds unquoted attribute breaking injections -- xss -- csrf
-- 2
if query_string then
    ngx.exit(ngx.HTTP_BAD_REQUEST)
end

… the debug log entry is …

[debug] 24803#0: *154 lua regex cache miss for match regex
"(?:^>[\w\s]*</?\w{2,}>)" with options "io"
[debug] 24803#0: *154 lua compiling match regex
"(?:^>[\w\s]*</?\w{2,}>)" with options "io" (compile once: 1)
[debug] 24803#0: *154 lua saving compiled regex (0 captures) into the
cache (entries 6)
[debug] 24803#0: *154 regex "(?:^>[\w\s]*</?\w{2,}>)" not matched on
string "/trackip/?searchip=213.162.113.89" starting from 0

I.E. the match regex, “(?:^>[\w\s]*</?\w{2,}>)” is the same as the
original.

I don’t know why, but it works and the “correct” escaping does not.

So I’m sticking with this until I start to see problems.

Cheers.

On 20 October 2011 11:46, agentzh [email protected] wrote:

because “\s” evaluates to “s”.
Would the Nginx string literal you mentioned before not then turn
"\\s" into "\s" … which is where I want to be in the end? I suspect I
am missing something in the process.

Wish I could just use the familiar "\" and have it figured out in the
background without me having to worry about it as rewrite apparently
does though.

On Fri, Oct 21, 2011 at 12:02 AM, Nginx U. [email protected]
wrote:

On 20 October 2011 14:48, agentzh [email protected] wrote:
Take this regex for example: (?:^>[\w\s]*</?\w{2,}>)

Good lord!

Why are you using “^” here? Are you meant to match from the very start
of your $request_uri string?

And why are you escaping “/” ? It is not a special thing in the
regex syntax that requires escaping.

As a Perl programmer of many years, I must say your regex here is by no
means correct.

When I use my “incorrect” escaping in access_by_lua file …

local query_string = ngx.re.match(ngx.var.request_uri,
    "(?:^>[\\w\\s]*<\/?\\w{2,}>)", "io")
-- finds unquoted attribute breaking injections -- xss -- csrf
-- 2
if query_string then
    ngx.exit(ngx.HTTP_BAD_REQUEST)
end

I’m not meant to help with Perl compatible regex usage, but here’s my
working version:

-- html/foo.lua
local uri = "<impact>2</impact>"
local regex = '(?:>[\\w\\s]*</?\\w{2,}>)';
ngx.say("regex: ", regex)
m = ngx.re.match(uri, regex, "oi")
if m then
    ngx.say("[", m[0], "]")
else
    ngx.say("not matched!")
end

# nginx.conf
location /re {
    access_by_lua_file html/foo.lua;
    content_by_lua return;
}

GET /re yields

regex: (?:>[\w\s]*</?\w{2,}>)
[>2</impact>]

Regards,
-agentzh

On 21 October 2011 07:13, agentzh [email protected] wrote:

For example, Firefox will escape "a=<a>3</a>" into "a=%3Ca%3E3%3C/a%3E",
which will surely never be matched by the regexes used here.

You can try ngx.unescape_uri to preprocess the $request_uri thing first, see:

http://wiki.nginx.org/HttpLuaModule#ngx.unescape_uri

Good luck!

Thanks for the emails above. I’ll look into tackling the issues
raised in due course.

“/” is escaped because the original regex is from a PHP application
which uses “/” as a delimiter. I left it in place because the snippet
posted is just a part of the result of a series of “find and replace”
passes on an XML file
(https://dev.itratos.de/svn/php-ids/trunk/lib/IDS/default_filter.xml)
that change it to the Lua format. “-- 2” is not the
target but just a holdover from the original XML: I couldn’t find an
easy find and replace to cater for all possible “” tags,
so it stays there as a Lua comment.

As said, things are working as expected for me at present in that the
resultant regexes are consistent with the target regexes from the xml
file so I am keeping them as they are. I will change them if/when I
see issues.

In any case, this is just a fallback I put in place when trying to
call the actual application and having just recompiled with rc17, I’ll
look into having a go at that again.

Thanks!

On Fri, Oct 21, 2011 at 12:08 PM, agentzh [email protected] wrote:

local query_string = ngx.re.match(ngx.var.request_uri,
    "(?:^>[\\w\\s]*<\/?\\w{2,}>)", "io")
-- finds unquoted attribute breaking injections -- xss -- csrf
-- 2

BTW, it’s bad practice to match against $request_uri directly because
query strings may be escaped according to URI escaping rules. (Yes!
there’s escaping everywhere!)

For example, Firefox will escape "a=<a>3</a>" into "a=%3Ca%3E3%3C/a%3E",
which will surely never be matched by the regexes used here.

You can try ngx.unescape_uri to preprocess the $request_uri thing first,
see:

http://wiki.nginx.org/HttpLuaModule#ngx.unescape_uri

Good luck!
-agentzh
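
To see why the escaped form defeats such regexes, here is a small plain-Lua sketch. percent_decode is a hypothetical stand-in for ngx.unescape_uri so that the example runs outside nginx; it only handles %XX sequences and is not claimed to match the real function in every detail.

```lua
-- Hypothetical stand-in for ngx.unescape_uri (decodes %XX sequences only).
local function percent_decode(s)
    return (s:gsub("%%(%x%x)", function(hex)
        return string.char(tonumber(hex, 16))
    end))
end

local escaped = "a=%3Ca%3E3%3C/a%3E"   -- query string as sent by the browser
local decoded = percent_decode(escaped)

print(decoded)                               -- a=<a>3</a>
-- Plain-text search for "<a>" before and after decoding.
print(escaped:find("<a>", 1, true) ~= nil)   -- false
print(decoded:find("<a>", 1, true) ~= nil)   -- true
```

Inside ngx_lua the same idea would presumably be to run ngx.unescape_uri on the value first and match the result.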

On 21 October 2011 10:35, Nginx U. [email protected] wrote:

In any case, this is just a fallback I put in place when trying to
call the actual application and having just recompiled with rc17, I’ll
look into having a go at that again.

“… when trying to call the actual application failed …” .

On 21 October 2011 07:13, agentzh [email protected] wrote:

BTW, it’s bad practice to match against $request_uri directly because
query strings may be escaped according to URI escaping rules. (Yes!
there’s escaping everywhere!)
That’s a great point. ngx.unescape_uri helps get over this.

Better still, it is the actual arguments I need to be matching
against, and that was just an initial setup thing.

I have since moved on to …

local args = ngx.req.get_uri_args()
for key, val in pairs(args) do
    if type(val) == "table" then
        my_arg = table.concat(val, ", ")
    else
        my_arg = val
    end
    if my_arg then
        local query_string = ngx.re.match(my_arg, "regex_1", "io")

        local query_string = ngx.re.match(my_arg, "regex_n", "io")

    end
end

The val entities are url unescaped by ngx_lua so no issues with that.
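
One detail worth guarding against in a loop like this: as far as the ngx_lua documentation describes, an argument given without a value (e.g. "?debug") comes back as boolean true rather than a string. That would break both table.concat and a regex match. A minimal plain-Lua sketch of the normalisation step follows; arg_to_string and the sample table are illustrative assumptions, not part of ngx_lua.

```lua
-- Turn one argument value into a string to scan, or nil if there is nothing
-- to scan. Values are assumed to be a string, boolean true (an argument
-- without a value), or an array-like table of such values.
local function arg_to_string(val)
    if type(val) == "string" then
        return val
    end
    if type(val) == "table" then
        local parts = {}
        for _, v in ipairs(val) do
            if type(v) == "string" then
                parts[#parts + 1] = v
            end
        end
        return table.concat(parts, ", ")
    end
    return nil
end

-- Hand-written sample standing in for the result of ngx.req.get_uri_args().
local args = { q = "hello", id = { "1", "2" }, debug = true }

print(arg_to_string(args.q))      -- hello
print(arg_to_string(args.id))     -- 1, 2
print(arg_to_string(args.debug))  -- nil
```

In the loop above, my_arg would then be assigned from arg_to_string(val), and the regex checks would run only when the result is not nil.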

Now that this is all working fine (after using "\\\" in place of the
documented "\\" to get it to actually work - and it does work while
the “correct” version does not), I can get back to trying out the
location capture again.

Will let you know how that goes in a new thread since I have derailed
this one with the regex issue.

Cheers!