Forum: NGINX partial urlEncoding when using if and $request_uri

mte03 (Guest)
on 2013-11-11 15:51
(Received via mailing list)
Hi,

I have observed strange behavior with nginx rewrites. A request arrives
at myserver.com/appID/path1/path2/uglyID and I need to proxy it to
appID.backend.internal/path1/path2/uglyID. The uglyID is URL encoded
because it contains characters like commas and forward slashes. When
nginx does location matching, it matches against the URL-decoded $uri.
That's not a problem: once the location matches, I can extract the appID
and then take the rest of the path from $request_uri. And this is where
the issue appears. If I use:

if ($request_uri ~ ^/[^\/]+(/.*)$ ) { set $path $1; }

then the uglyID (but only the uglyID) gets URL encoded again, so $path
is simply /path1/path2/urlEncoded(uglyID). However, when I use:

if ($uri ~ ^/[^\/]+(/.*)$ ) { set $path $1; }

then $path is urlDecoded(/path1/path2/uglyID), as expected.
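
For context, a minimal sketch of the kind of setup described above. The
full configuration was not posted, so the location regex, the $app
capture name and the resolver address are assumptions made only for
illustration:

```nginx
server {
    # Hypothetical resolver; needed because the proxy_pass host
    # below contains a variable and is resolved at request time.
    resolver 127.0.0.1;

    # Location matching runs against the decoded $uri; the named
    # capture $app (assumed name) holds the appID.
    location ~ ^/(?<app>[^/]+)/ {
        # Numbered capture taken from the raw $request_uri. This is
        # the variant reported to send the uglyID encoded a second time.
        if ($request_uri ~ ^/[^/]+(/.*)$ ) { set $path $1; }

        proxy_pass http://$app.backend.internal$path;
    }
}
```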

I have tested this on Ubuntu 12.04 (Linux ip-10-50-20-57
3.2.0-36-virtual #57-Ubuntu SMP Tue Jan 8 22:04:49 UTC 2013 x86_64
x86_64 x86_64 GNU/Linux) with nginx/1.1.19 and nginx/1.5.6, and the
behavior is the same in both versions. The nginx config is identical in
all cases; the only difference is the name of the variable in the if
statement, where using $request_uri means that the uglyID (but only the
uglyID, not the whole matched string) is URL encoded again. Meanwhile I
have found a possible workaround using maps, where matching on
$request_uri works correctly and doesn't modify the matched data in any
way.

So I wonder, has anyone else experienced this? Is this expected? Why is
the uglyID (but only the uglyID) URL encoded again? I would expect
either URL encoding of the whole match or none at all, since it doesn't
happen when I match against e.g. $uri. Could this indicate a possible
bug?

Thanks,

Michael

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,244573,244573#msg-244573
mte03 (Guest)
on 2013-11-11 17:02
(Received via mailing list)
Update: I have turned on the rewrite log. It shows the uglyID correctly
matched and passed to the proxy unencoded. However, in a TCP dump I can
see that it was encoded again, and on the target server the incoming
request also has the uglyID encoded again.
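
For anyone who wants to repeat this check, a minimal sketch of enabling
the rewrite log. The log path is an assumption; rewrite_log writes to
the error log at the notice level, so error_log has to allow that level:

```nginx
# Hypothetical log path; the level must be notice (or more verbose)
# for rewrite log entries to show up.
error_log /var/log/nginx/error.log notice;

server {
    # Log the processing of rewrite module directives (if, set, rewrite).
    rewrite_log on;
}
```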

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,244573,244575#msg-244575
Maxim Dounin (Guest)
on 2013-11-11 17:04
(Received via mailing list)
Hello!

On Mon, Nov 11, 2013 at 09:51:01AM -0500, mte03 wrote:

>
> consistently same in both versions. The nginx config is identical in all
> cases, only difference is the name of the variable in the if statement,
> where using $request_uri simply implies uglyID (but only uglyID - not the
> whole matched string) will be url encoded again. Meanwhile I have found
> possible workaround using maps, where matching on $request_uri works
> correctly and doesn't modify matched data anyhow.
>
> So I wonder, has anyone else experienced this? Is this expected? Why is the
> uglyID (but only uglyID) url encoded again? I would expect either
> urlEncoding of the whole match, or none at all - as it doesn't happen when I
> try to match e.g. $uri... Could this indicate a possible bug?

Yes, this looks like a bug.  Here is a ticket we already have for
this:

http://trac.nginx.org/nginx/ticket/348

--
Maxim Dounin
http://nginx.org/en/donation.html
mte03 (Guest)
on 2013-11-11 17:36
(Received via mailing list)
Hi,

I can confirm that using named variables solves the issue (as stated in
the ticket - maybe you can add my findings (rewrite log) to the ticket
comments, as I have no rights to do so). Both

if ($request_uri ~ ^/[^\/]+(?<match>/.*)$ ) { set $path $match; }
and
if ($request_uri ~ ^/[^\/]+(?<path>/.*)$ ) { set $match abc; }

do work correctly. However, from a performance standpoint, is this
solution with if faster (or more recommended) than doing something like

map $request_uri $path {
  ~^/[^\/]+(?<param>/.*)$ $param;
}

Is there any way to further improve performance on this kind of matching?

Thanks,

Michael


Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,244573,244577#msg-244577
Maxim Dounin (Guest)
on 2013-11-11 18:34
(Received via mailing list)
Hello!

On Mon, Nov 11, 2013 at 11:35:34AM -0500, mte03 wrote:

> Hi,
>
> I can confirm that using named variables solves the issue (as stated in the
> ticket - maybe you can add my findings (rewrite log) to the ticket comments,
> as I have no rights to do so). Both

You actually have rights to do so (though some login is required).
But as the ticket already shows how to reproduce the problem, I
don't think linking a rewrite log will be beneficial.

>
> Is there any way to further improve performance on this kind of matching?

I don't think there is a measurable performance difference.  Use
of map{} might be a bit better as it doesn't involve if(), see
http://wiki.nginx.org/IfIsEvil.
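
A minimal sketch of how the map{} variant might fit together without
if(). The map itself is the one from the question; the location regex,
the $app capture name and the resolver address are assumptions for
illustration only:

```nginx
# map{} lives in the http context and is evaluated lazily,
# the first time $path is used in a request.
map $request_uri $path {
    # Named capture from the raw $request_uri, as in the question.
    ~^/[^/]+(?<param>/.*)$  $param;
}

server {
    # Hypothetical resolver; needed because the proxy_pass host
    # below contains a variable.
    resolver 127.0.0.1;

    # $app (assumed name) captures the appID from the decoded $uri.
    location ~ ^/(?<app>[^/]+)/ {
        proxy_pass http://$app.backend.internal$path;
    }
}
```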

--
Maxim Dounin
http://nginx.org/en/donation.html