Partial urlEncoding when using if and $request_uri

Hi,

I have observed strange behavior with nginx rewrites. A request comes in
for myserver.com/appID/path1/path2/uglyID, and I need to proxy it to
appID.backend.internal/path1/path2/uglyID. The uglyID is URL encoded
because it contains characters like commas and forward slashes. When nginx
does location matching, it matches against the URL-decoded $uri. That’s
not a problem: once the location matches, I can extract the appID and then
take the rest from $request_uri. This is where the issue appears. If I use:

if ($request_uri ~ ^/[^/]+(/.*)$ ) { set $path $1; }

then the uglyID (but only the uglyID) gets URL encoded again: $path ends up
as /path1/path2/urlEncoded(uglyID). However, when I use:

if ($uri ~ ^/[^/]+(/.*)$ ) { set $path $1; }

then $path is urlDecoded(/path1/path2/uglyID), as expected.
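
For readers trying to reproduce this, a minimal sketch of the setup described above might look like the following. The server block, the resolver address, and the named capture `app` are assumptions made for illustration; the poster's actual config is not shown in the thread.

```nginx
# Hypothetical reproduction; names and addresses are assumptions.
server {
    listen 80;
    server_name myserver.com;

    # A resolver is required because proxy_pass below uses variables.
    resolver 10.0.0.2;

    # Location matching runs against the decoded $uri; capture the appID.
    location ~ ^/(?<app>[^/]+)/ {
        # Positional capture taken from the raw, still encoded $request_uri.
        # This is the variant reported to re-encode uglyID.
        if ($request_uri ~ ^/[^/]+(/.*)$ ) {
            set $path $1;
        }

        proxy_pass http://$app.backend.internal$path;
    }
}
```

Note that $request_uri also includes the query string, so a capture like this carries any arguments along with the path.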

I have tested this on Ubuntu 12.04 (Linux ip-10-50-20-57 3.2.0-36-virtual
#57-Ubuntu SMP Tue Jan 8 22:04:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux)
with nginx/1.1.19 and nginx/1.5.6, and the behavior is the same in both
versions. The nginx config is identical in all cases; the only difference
is the name of the variable in the if statement. Using $request_uri means
that uglyID (but only uglyID, not the whole matched string) is URL encoded
again. Meanwhile, I have found a possible workaround using map, where
matching on $request_uri works correctly and doesn’t modify the matched
data in any way.

So I wonder, has anyone else experienced this? Is this expected? Why is
the uglyID (but only uglyID) URL encoded again? I would expect either URL
encoding of the whole match or none at all, as it doesn’t happen when I
match against e.g. $uri. Could this indicate a possible bug?

Thanks,

Michael

Posted at Nginx Forum:

Update: I have turned on the rewrite log. It shows uglyID correctly
matched and passed to the proxy unencoded. However, in a TCP dump I can
see that it was encoded again on the wire, and on the target server the
incoming request also has uglyID re-encoded.
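
For anyone wanting to repeat this check, the rewrite log can be enabled roughly as follows. This is a sketch: the log path is an assumption, and the directives are placed in a server block only for illustration.

```nginx
server {
    # rewrite module messages are written to the error log at the
    # "notice" level, so the error_log level must be at least that verbose.
    error_log /var/log/nginx/error.log notice;  # path is an assumption
    rewrite_log on;
}
```

As the update above suggests, the rewrite log reflects what the rewrite directives matched and set, not necessarily the bytes sent upstream, so comparing it against a packet capture or the backend's access log is still worthwhile.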

Hello!

On Mon, Nov 11, 2013 at 09:51:01AM -0500, mte03 wrote:

> consistently same in both versions. The nginx config is identical in all
> cases, only difference is the name of the variable in the if statement,
> where using $request_uri simply implies uglyID (but only uglyID - not the
> whole matched string) will be url encoded again. Meanwhile I have found
> possible workaround using maps, where matching on $request_uri works
> correctly and doesn’t modify matched data anyhow.
>
> So I wonder, has anyone else experienced this? Is this expected? Why is the
> uglyID (but only uglyID) url encoded again? I would expect either
> urlEncoding of the whole match, or none at all - as it doesn’t happen when I
> try to match e.g. $uri… Could this indicate a possible bug?

Yes, this looks like a bug. Here is a ticket we already have for
this:

http://trac.nginx.org/nginx/ticket/348


Maxim D.
http://nginx.org/en/donation.html

Hello!

On Mon, Nov 11, 2013 at 11:35:34AM -0500, mte03 wrote:

> Hi,
>
> I can confirm that using named variables solves the issue (as stated in the
> ticket - maybe you can add my findings (rewrite log) to the ticket comments,
> as I have no rights to do so). Both

You actually have rights to do so (though some login is required).
But as the ticket already shows how to reproduce the problem, I
don’t think linking a rewrite log will be beneficial.

> Is there any way to further improve performance on this kind of matching?

I don’t think there is a measurable performance difference. Use
of map{} might be a bit better as it doesn’t involve if(); see the
article “If is Evil… when used in location context”.
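
As a rough sketch of the map{} approach mentioned here, the capture can be moved out of the location entirely. The server block, the resolver address, and the named capture `app` are assumptions for illustration; the map itself follows the form posted in this thread.

```nginx
# map is only valid in the http context.
map $request_uri $path {
    # Named capture "param" holds everything after the first path segment,
    # taken from the raw (still encoded) request URI.
    ~^/[^/]+(?<param>/.*)$  $param;
}

server {
    listen 80;

    # A resolver is required because proxy_pass uses variables.
    resolver 10.0.0.2;  # address is an assumption

    location ~ ^/(?<app>[^/]+)/ {
        # No if() needed; $path is evaluated from the map when it is used.
        proxy_pass http://$app.backend.internal$path;
    }
}
```

With no default entry in the map, $path is empty when the regex does not match.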


Maxim D.
http://nginx.org/en/donation.html

Hi,

I can confirm that using named variables solves the issue (as stated in
the ticket; maybe you can add my findings (rewrite log) to the ticket
comments, as I have no rights to do so). Both

if ($request_uri ~ ^/[^/]+(?<match>/.*)$ ) { set $path $match; }
and
if ($request_uri ~ ^/[^/]+(?<path>/.*)$ ) { set $match abc; }

work correctly. However, from a performance standpoint, is this
solution with if faster (or more recommended) than doing something like

map $request_uri $path {
    ~^/[^/]+(?<param>/.*)$ $param;
}

Is there any way to further improve performance on this kind of matching?

Thanks,

Michael
