FYI, %20 and plus are interchangeable
As a footnote to this, I wanted to escape a URI I was sending as part
of a CGI query string to the W3C CSS validator. The URI included a “+”
in its path, which is perfectly valid there, but not once the URI ends
up in a query string, where “+” is read as a space. So, one might
expect to be able to do this:
"http://host/path/script.cgi?uri=" + URI.escape(@my_strange_url)
…but if @my_strange_url has “?”, “&” or “+” in it, ‘script.cgi’ will
get mighty confused because URI.escape leaves those characters
unescaped. I never did find a satisfactory solution because you can’t,
say, call URI.escape(URI.escape(@my_strange_url), "?+&") since the outer
URI.escape call will escape “%” from valid escape sequences already
generated by the inner call, thus breaking the URI.
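For anyone landing here later, a minimal sketch of one way to build that query string, assuming a Ruby recent enough to ship URI.encode_www_form_component (1.9.2 or later, as far as I know). The method name validator_query_url and the example.com address are made up for illustration; only the host/path/script.cgi prefix comes from the post above.

```ruby
require "uri"

# Hypothetical helper: builds the query URL from the post, escaping the
# whole target URI as a single form-encoded value.
def validator_query_url(target)
  # encode_www_form_component percent-encodes "?", "&", "+", "/", ":" and
  # "=" alike, so nothing in the value can be mistaken for query syntax.
  "http://host/path/script.cgi?uri=" + URI.encode_www_form_component(target)
end

# Made-up stand-in for @my_strange_url.
my_strange_url = "http://example.com/a+b/page?x=1&y=2"
puts validator_query_url(my_strange_url)
```

The receiving script has to decode the value as form data (URI.decode_www_form_component on the Ruby side) to get the original URI back. Note that form decoding turns a bare “+” into a space, which is why a literal “+” has to travel as %2B.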
It isn’t possible to add to or subtract from a regular expression, so
you can’t take URI::REGEXP::UNSAFE and remove “?”, “+” and “&” from its
exclusion character set. In the end you’re left copying the value of
URI::REGEXP::UNSAFE by hand and removing the offending characters. That
isn’t a nice solution: any later improvement or change to the way the
escape call works could invalidate, or at best deprecate, the
hand-rolled version.
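The hand-rolled approach might look roughly like the sketch below. It does not call URI.escape at all, so it does not depend on that method being present. The names QUERY_UNSAFE and escape_for_query are hypothetical, and the character class is my own transcription of what I understand the stock pattern to allow (the RFC 2396 unreserved and reserved characters) with “?”, “+” and “&” taken out, so treat it as an assumption and check it against your own Ruby's definition.

```ruby
# Hypothetical hand-copied pattern: matches every character that is NOT
# in the allowed set. "?", "+" and "&" have been left out of the allowed
# set on purpose, so they count as unsafe and get escaped.
QUERY_UNSAFE = %r{[^\-_.!~*'()a-zA-Z\d;/:@=$,\[\]]}

# Hypothetical helper: percent-encodes each unsafe character byte by byte.
# Like any single-pass escape, it must be given raw text: a "%" already
# in the input is encoded again as "%25".
def escape_for_query(str)
  str.gsub(QUERY_UNSAFE) do |ch|
    ch.bytes.map { |b| "%%%02X" % b }.join
  end
end

puts escape_for_query("http://example.com/a+b/page?x=1&y=2")
```

This keeps “/”, “:” and “=” readable in the value while escaping the three characters that confuse the receiving script, at the cost of maintaining the character list yourself.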
Ruby’s normally good at helping you avoid these kinds of irritating
holdups, but in this case one got me! I thought it was worth adding this
note to a rather old thread just in case anyone else with a similar
problem picks it up through a search engine, as I did.