Path components interpretation by nginx

Hello,

While doing an audit for a client I came across a URL of the form:

http://host/foobar;arg=quux?q=en/somewhere&a=1&b=2

Now doing something like:

location /test-args {
    return 200 "u: $uri\nq: $query_string\na: $args\n";
}

This returns as the value of $uri the string foobar;arg=quux, i.e., the
first parameter arg=quux is not being interpreted as an argument but as
part of the URI.

This is confirmed by changing the location to an exact match using
= /test-args, in which case nginx cannot find a configuration for
handling the request.

Now, if I understand section 3.3 of the RFC correctly, it says:

The path may consist of a sequence of path segments separated by a
single slash “/” character. Within a path segment, the characters
“/”, “;”, “=”, and “?” are reserved. Each path segment may include a
sequence of parameters, indicated by the semicolon “;” character.
The parameters are not significant to the parsing of relative
references.

This means that the above URL is perfectly legal, with arg being
considered a parameter.

Shouldn’t nginx interpret arg=quux as an argument and not part of the
URI in order to fully support the RFC in question?

Thank you,
----appa

3.3 Path…

End of para 1.

“The path is terminated by the first question mark (“?”) or number sign
(“#”) character, or by the end of the URI.”

although I think most web servers add & to ?.

Steve

On Wed, 2014-02-12 at 02:07 +0100, António P. P. Almeida wrote:



Steve H. BSc(Hons) MIITP

Linkedin: http://www.linkedin.com/in/steveholdoway
Skype: sholdowa

Hello!

On Wed, Feb 12, 2014 at 02:07:50AM +0100, António P. P. Almeida wrote:


Shouldn’t nginx interpret arg=quux as an argument and not part of the URI
in order to fully support the RFC in question?

I don’t see any incompatibilities with the RFC in current nginx
behaviour. Parameters aren’t significant to the parsing of
relative references, just as the RFC states - i.e., “../foo” from
both “/bar;param/bazz” and “/bar/bazz” will result in the same
URI.
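
The relative-reference point can be checked with a small sketch. It uses Python's standard urllib.parse.urljoin as a stand-in resolver, which is an assumption made purely for illustration and not what nginx runs; the host name is the placeholder from the first message. Resolving a relative reference such as ../foo against both bases should give the same result:

```python
from urllib.parse import urljoin

# Two base URIs that differ only by a path segment parameter.
base_with_param = "http://host/bar;param/bazz"
base_plain = "http://host/bar/bazz"

# Resolve the same relative reference against both bases.
resolved_with_param = urljoin(base_with_param, "../foo")
resolved_plain = urljoin(base_plain, "../foo")

print(resolved_with_param)
print(resolved_plain)
```

Both calls print the same URI, because the parameter is dropped along with the segment it belongs to.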

Parameters are not a query string, though. Note that semantically
parameters belong to a path segment, and something like
“/foo;v=1.1/bar;v=1.2/bazz” indicates a reference to version 1.1
of foo, and version 1.2 of bar. Representing parameters as a part
of the query string would simply be wrong.

Current nginx behaviour is to treat parameters as a part of a path
segment, which is believed to be compliant behaviour.
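
To make the distinction between per-segment parameters and the query string concrete, here is a minimal sketch in Python. The function name split_segment_params and the "?q=en" suffix on the example URL are hypothetical additions for illustration; this only mirrors the reading described above and is not nginx code.

```python
from urllib.parse import urlsplit

def split_segment_params(url):
    """Return (segments, query).

    segments is a list of (name, params) tuples, one per path segment,
    where params is the list of raw "key=value" strings of that segment.
    """
    parts = urlsplit(url)  # the path ends at the first "?" or "#"
    segments = []
    for segment in parts.path.split("/")[1:]:
        # Everything after the first ";" belongs to this segment only.
        name, *params = segment.split(";")
        segments.append((name, params))
    return segments, parts.query

# Hypothetical URL: the path is from the example above, the query is added.
segments, query = split_segment_params(
    "http://host/foo;v=1.1/bar;v=1.2/bazz?q=en"
)
print(segments)
print(query)
```

Each parameter stays attached to its own segment, while the query is a single string that applies to the whole request.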


Maxim D.
http://nginx.org/

Hello!

On Wed, Feb 12, 2014 at 02:16:29PM +0100, António P. P. Almeida wrote:

as a parameter:

/api;jsessionid=somehash?t=1&q=2

obviously they have some issues with their API well beyond merely being
non RFC 3986 compliant :)

I don’t think that passing a session id as a path segment parameter
is wrong per se. One can think of it as “the ‘api’ path segment,
version for a session specified”, and it should work as long as
it’s properly handled by the server which implements the API. But
it may be non-trivial to work with such URIs in various software,
including nginx, as support for path segment parameters is usually
quite limited.


Maxim D.
http://nginx.org/

Hello Maxim,

Thank you. In fact, since I had never seen this type of URI in an API
before, I thought that trying to use the path segment parameters as a
query string argument was borderline RFC compliant.

The original API I was referring to uses the parameter as an argument,
since they pass a session token as a parameter:

/api;jsessionid=somehash?t=1&q=2

Obviously they have some issues with their API well beyond merely being
non RFC 3986 compliant :)

Thanks,

----appa

This means that if relying solely on nginx we need multiple regexes to
extract the parameters (we need to match on both the unescaped and
escaped characters), or, using Lua, we can unescape and do string
processing with the Lua libraries to extract the parameters.

Correct?

----appa
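
The "unescape, then do string processing" approach can be sketched as follows. This is written in Python for illustration rather than Lua, the function name segment_params is hypothetical, and the percent-encoded variant of the URI is an assumed input, not one taken from the API in question.

```python
from urllib.parse import unquote, urlsplit

def segment_params(raw_uri):
    """Collect "key=value" path segment parameters into a dict."""
    # Cut off the query first, then decode %XX escapes, so that a
    # literal ";" and an escaped "%3B" are handled alike.
    path = unquote(urlsplit(raw_uri).path)
    params = {}
    for segment in path.split("/"):
        # The first piece is the segment name; the rest are parameters.
        for param in segment.split(";")[1:]:
            key, _, value = param.partition("=")
            params[key] = value  # a repeated key keeps the last value
    return params

plain = segment_params("/api;jsessionid=somehash?t=1&q=2")
escaped = segment_params("/api%3Bjsessionid%3Dsomehash?t=1&q=2")
print(plain)
print(escaped)
```

Decoding before splitting is a deliberate simplification: it treats an escaped delimiter the same as a literal one, which is the behaviour asked about here, though a strict reading of the RFC would keep the two distinct.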