'[' and ']' characters are not valid characters of a URI query component


#1

Do the RFCs and whatnot list those characters as valid in a URI query?

The question comes up because - despite Rails’s joyful abuse of those
characters to delimit records - some of our params are coming in not like this…

"record[first_name]" => "yo"

…but like this:

"record first_NAME " => "yo"

The raw query has %20 marks for the spaces.
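
For what it’s worth, decoding a raw query of that shape reproduces the mangled key exactly, which suggests the spaces arrive on the wire rather than being introduced by the params parser. A minimal sketch using only Ruby’s standard `uri` library. The raw query string is reconstructed from the description above (an assumption, not a captured request), and `suspicious_key?` is a hypothetical helper:

```ruby
require "uri"

# Assumed raw query, reconstructed from the description (not a captured request).
raw_query = "record%20first_NAME%20=yo"

# Decode into [key, value] pairs; each %20 becomes a literal space.
pairs = URI.decode_www_form(raw_query)

# Hypothetical helper: flag keys that contain whitespace, which a
# well-formed form field name would not normally contain.
def suspicious_key?(key)
  key.match?(/\s/)
end

pairs.each do |key, value|
  puts "#{key.inspect} => #{value.inspect} suspicious=#{suspicious_key?(key)}"
end
```

Something like this could be used to log the User-Agent whenever a mangled key shows up.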

So what’s doing that? And why is the NAME in caps?? So far, the only
thing the User Agents have in common is “Windows 5.1” and some version of
“Firefox”.


#2

The question comes up because - despite Rails’s joyful abuse of those
characters to delimit records
???


#3

Fernando P. wrote:

The question comes up because - despite Rails’s joyful abuse of those
characters to delimit records

???

A “URI” is, loosely speaking, a URL. Rails packs records into them like this:

…/controller/action?record[first_name]=norbert&record[last_name]=theNark

The params method unravels them into params[:record] as a convenience.
But does the industry in general support this use? Or is it an artifact of
“browser forgiveness”?
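
To make the “unraveling” concrete, here is a rough sketch of the idea in plain Ruby. It is not Rails’s actual parser: `nest_params` is a hypothetical helper, it only handles one level of `outer[inner]` nesting, and it uses string keys rather than the symbol access that params offers.

```ruby
require "uri"

# Hypothetical helper: turn "outer[inner]=value" pairs into a nested Hash.
# Keys without brackets are stored as-is.
def nest_params(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), params|
    if (m = key.match(/\A([^\[\]]+)\[([^\[\]]*)\]\z/))
      # m[1] is the record name, m[2] is the field name inside the brackets.
      (params[m[1]] ||= {})[m[2]] = value
    else
      params[key] = value
    end
  end
end

params = nest_params("record[first_name]=norbert&record[last_name]=theNark")
puts params["record"]["first_name"]
puts params["record"]["last_name"]
```

The same helper gives the same result when the brackets arrive percent-encoded, because the decoding happens before the key is matched.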

And has anyone ever seen user-agents convert them to "record first_NAME "
for no reason?


#4

Philip H. wrote:

Don’t know for sure, but I know that in the late 90’s PHP used [] for
this exact same thing. Still does I would assume. So if it’s browser
forgiveness it’s something that has been going on since at least 1996.

As a shotgun attack, we upgraded our HTML headers from variously
nothing, or us-ascii, to:

I copied it out of the top of a WikiPedia page, so it doubtless has had
the crap reviewed out of it…


#5

The params method unravels them into params[:record] as a convenience.
But does the industry in general support this use? Or is it an artifact of
“browser forgiveness”?

Don’t know for sure, but I know that in the late 90’s PHP used [] for
this exact same thing. Still does I would assume. So if it’s browser
forgiveness it’s something that has been going on since at least 1996.

And has anyone ever seen user-agents convert them to "record first_NAME "
for no reason?

No.


#6

Hi,

Just wondering why you are taking out the “_”?

Brandon


#7

I copied it out of the top of a WikiPedia page, so it doubtless has had the crap
reviewed out of it…

Also, we are naturally flattening our forms to not use [] in this
context. We will also take out all the _, but that might just be
collateral damage.


#8

Philip H. wrote:

The params method unravels them into params[:record] as a convenience.
But does the industry in general support this use? Or is it an artifact of
“browser forgiveness”?

Don’t know for sure, but I know that in the late 90’s PHP used [] for
this exact same thing. Still does I would assume. So if it’s browser
forgiveness it’s something that has been going on since at least 1996.

And has anyone ever seen user-agents convert them to "record first_NAME "
for no reason?

No.

lowalpha    = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
              "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
              "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
              "y" | "z"
hialpha     = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
              "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
              "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"

alpha       = lowalpha | hialpha
digit       = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
              "8" | "9"
safe        = "$" | "-" | "_" | "." | "+"
extra       = "!" | "*" | "'" | "(" | ")" | ","
national    = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`"
punctuation = "<" | ">" | "#" | "%" | <">

reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "="
hex         = digit | "A" | "B" | "C" | "D" | "E" | "F" |
              "a" | "b" | "c" | "d" | "e" | "f"
escape      = "%" hex hex

Judging from the fact that “~” is legal (and used) in URLs, and that “[”
and “]” are also listed under “national”, I’d say that Rails’s use of those
characters (unescaped) is probably just fine for most cases. They are not
listed under “safe”, though, so there might be some obscure edge cases
that present an issue. But I doubt there’s any reason to worry about it.


#9

Phlip wrote:
[…]

A “URI” is, loosely speaking, a URL. Rails packs records into them like this:

…/controller/action?record[first_name]=norbert&record[last_name]=theNark

I don’t use GET with [] very often in Rails, but when I have done so, I
have noticed that the [] are always URL-encoded as %xx (don’t remember
the number offhand). Either [] are illegal in URLs or the browser is
playing it safe.
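
For reference, the escaped forms are %5B for “[” and %5D for “]”. A small sketch with Ruby’s standard `uri` library shows the round trip; this is form encoding as the standard library does it, and is not necessarily what any particular browser or Rails helper does internally:

```ruby
require "uri"

key = "record[first_name]"

# Form-encode a single component: "[" becomes %5B and "]" becomes %5D.
encoded_key = URI.encode_www_form_component(key)
puts encoded_key

# Build a whole query string from key/value pairs.
query = URI.encode_www_form([["record[first_name]", "norbert"],
                             ["record[last_name]", "theNark"]])
puts query

# Decoding restores the bracketed keys.
decoded = URI.decode_www_form(query)
puts decoded.first.inspect
```

So a server that decodes before parsing sees the same keys whether or not the client escaped the brackets.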

Could you use POST or custom routing in this case?

The params method unravels them into params[:record] as a convenience.
But does the industry in general support this use? Or is it an artifact of
“browser forgiveness”?

PHP does it, as others have pointed out.

And has anyone ever seen user-agents convert them to "record first_NAME "
for no reason?

Surely not! That’s bizarre!

A thought: I reported a weird bug with nested params[] back in January
or so (check this list or Lighthouse). Fred patched it, but I don’t
know when the patch made it into core, if it ever did. Perhaps this is
related?

Best,

Marnen Laibow-Koser
http://www.marnen.org


#10

RFC 3986 warns about the “[” and “]” characters, but leaves it up
to the implementor. Firefox, IE and Safari all support these
characters.
http://tools.ietf.org/html/rfc3986
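
For the query component specifically, the RFC 3986 grammar is query = *( pchar / "/" / "?" ), and pchar does not include “[” or “]”, so read strictly they need percent-encoding there. A rough sketch of that check in Ruby follows. The constant and method names are made up for illustration, and the regex is a hand transcription of the grammar, not something shipped with Ruby or Rails:

```ruby
# RFC 3986 grammar being transcribed:
#   query      = *( pchar / "/" / "?" )
#   pchar      = unreserved / pct-encoded / sub-delims / ":" / "@"
#   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
#   sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
RFC3986_QUERY = %r{\A(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%\h\h)*\z}

# Hypothetical helper: true when every character of the query is allowed
# unescaped by the grammar above, or is part of a %XX escape.
def valid_rfc3986_query?(query)
  query.match?(RFC3986_QUERY)
end

puts valid_rfc3986_query?("record[first_name]=yo")      # => false (raw brackets)
puts valid_rfc3986_query?("record%5Bfirst_name%5D=yo")  # => true  (escaped)
```

Browsers and servers being lenient about raw brackets is a separate matter from what the grammar allows.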

"Special care should be taken when the URI path interpretation process
involves the use of a back-end file system or related system
functions. File systems typically assign an operational meaning to
special characters, such as the "/", "\", ":", "[", and "]"
characters, and to special device names like ".", "..", "...",
"aux", "lpt", etc. In some cases, merely testing for the existence of
such a name will cause the operating system to pause or invoke unrelated
system calls, leading to significant security concerns regarding
denial of service and unintended data transfer. "

The now-obsolete RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt) listed them
as “unwise” characters:

"Other characters are excluded because gateways and other transport
agents are known to sometimes modify such characters, or they are
used as delimiters.

unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

Data corresponding to excluded characters must be escaped in order to
be properly represented within a URI."


#11

Brandon O. wrote:

Just wondering why you are taking out the “_”?

Some commie or terr’ist somewhere has a browser that replaced
peace_freedom with peaceUS95freedom. That might have been caused by the
accidental us-ascii encoding on the page - 95 is the ASCII code point
for _ - but it still spooked our Onsite Customer, so away with it.
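
The 95 is easy to confirm in Ruby. The substitution below is purely a hypothetical reconstruction of the mangling, with no claim about what the browser actually did or where the “US” prefix came from:

```ruby
# Code point of the underscore character.
puts "_".ord

# Hypothetical reconstruction: replace each "_" with "US" followed by
# its decimal code point.
mangled = "peace_freedom".gsub("_") { |ch| "US#{ch.ord}" }
puts mangled
```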