URI.decode_www_form errors on valid x-www-form-urlencoded strings

When you call URI.decode_www_form on the following:

URI.decode_www_form(“a=1?b=2=”)

it raises an error, presumably because of the trailing “=”. But after
reading

 http://www.w3.org/TR/html5/forms.html#url-encoded-form-data

I believe that it is a valid application/x-www-form-urlencoded string –
I see nothing in the spec that precludes having multiple “=” in the
string – only the first one is interpreted as the name/value separator.

Is decode_www_form being overly strict? Or am I mis-interpreting the
spec? (And should I take this discussion to ruby-core or
bugs.ruby-lang.org?)

On 11/22/2013 03:17 PM, Fearless F. wrote:

I see nothing in the spec that precludes having multiple “=” in the
string – only the first one is interpreted as the name/value separator.

Is decode_www_form being overly strict? Or am I mis-interpreting the
spec? (And should I take this discussion to ruby-core or
bugs.ruby-lang.org?)

I assume you meant

 URI.decode_www_form("a=1&b=2=")

While the standard does not appear to explicitly prohibit this (I
believe it could be interpreted as [“b”, “2=”]), it’s also not an input
which could be generated from the encoding algorithm.
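
To illustrate the point about the encoding algorithm, here is a small sketch using the standard library's own encoder. The sample pairs are made up for the example. A literal “=” in a value comes out percent-encoded, so a bare trailing “=” is not something the encoder would emit:

```ruby
require "uri"

# Encode a value that contains "=": the encoder escapes it as %3D.
encoded = URI.encode_www_form([["a", "1"], ["b", "2="]])
puts encoded                               # a=1&b=2%3D

# The escaped form round-trips through the decoder without complaint.
decoded = URI.decode_www_form(encoded)
puts decoded.inspect                       # [["a", "1"], ["b", "2="]]
```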

FWIW, the Ruby implementation of this method explicitly disallows “=” in
the name or value.

-Justin

Justin C. wrote in post #1128365:

I assume you meant

 URI.decode_www_form("a=1&b=2=")

Yes, thanks for the correction.

While the standard does not appear to explicitly prohibit this (I
believe it could be interpreted as [“b”, “2=”]), it’s also not an input
which could be generated from the encoding algorithm.

FWIW, the Ruby implementation of this method explicitly disallows “=” in
the name or value.

I concur with all of the above. But the clause in the w3.org spec that
got my attention is:

If string contains a “=” (U+003D) character, then let name be the
substring of string from the start of string up to but excluding its
first “=” (U+003D) character, and let value be the substring from
the first character, if any, after the first “=” (U+003D) character
up to the end of string. If the first “=” (U+003D) character is the
first character, then name will be the empty string. If it is the last
character, then value will be the empty string.

Note that it mentions ‘the first “=”’ consistently, which STRONGLY
suggests that subsequent “=” are allowed as part of the value string.
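
For what it's worth, here is a minimal sketch of how that clause could be applied by hand. The method name lenient_decode_www_form is hypothetical and not part of Ruby's URI library; it covers only the "&" separator and the first-“=” rule quoted above, not the rest of the spec's parsing algorithm:

```ruby
require "uri"

# Hypothetical helper (not a stdlib method): splits each pair on its
# FIRST "=" only, so any later "=" stays in the value.
def lenient_decode_www_form(str)
  str.split("&").reject(&:empty?).map do |pair|
    # partition returns [before, "=", after]; with no "=" present,
    # the value is the empty string.
    name, _sep, value = pair.partition("=")
    [URI.decode_www_form_component(name),
     URI.decode_www_form_component(value)]
  end
end

p lenient_decode_www_form("a=1&b=2=")   # [["a", "1"], ["b", "2="]]
```

Something like this may be enough as a workaround for the query string from the external host, at the cost of accepting input that the stricter decoder rejects.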

(I’ve entered this maze of twisty little passages because I’m receiving
a query string from an external host that contains an “extra” equal sign
and I’m trying to decide whose bug this is. I may be tilting at
windmills in questioning the Ruby implementation, but it sure seems like
additional equal signs are permitted by the spec.)