URI.escape and pluses

carl · November 19, 2005, 7:18am

I called URI.escape on a CGI parameter to try to convert any possible
spaces in a user’s query into pluses. However, it converted them into
%20 instead. Is there a different function that I should call that
will convert spaces into pluses, or should I just do a gsub on the
string before calling URI.escape?

Thanks,
Carl

carl · November 19, 2005, 3:39pm

Carl Y. wrote:

I called URI.escape on a CGI parameter to try to convert any possible
spaces in a user’s query into pluses. However, it converted them into
%20 instead. Is there a different function that I should call that
will convert spaces into pluses, or should I just do a gsub on the
string before calling URI.escape?

Is there a special reason you want a + instead of %20? URI.escape is
working as expected: a space is encoded as %20, and a literal + sign
should be escaped to %2B.
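
For anyone who does want pluses, here is a minimal sketch. It assumes the
value is going into a query string, where form-style encoding applies.
CGI.escape comes from Ruby's standard cgi library; the search URL is only
an illustration.

```ruby
require "cgi"

# CGI.escape applies form-style encoding: spaces become "+",
# and a literal "+" becomes %2B so the two cannot be confused.
query = CGI.escape("foo bar+baz")
url = "http://www.google.com/search?q=" + query
puts url
```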

carl · November 19, 2005, 5:58pm

You know, everything seems to be working. I just thought that spaces
were supposed to turn into pluses when they were encoded. Sorry to add
to the list traffic.

carl · November 19, 2005, 11:45pm

FYI, %20 and plus are interchangeable

http://www.google.com/search?q=foo%20bar
http://www.google.com/search?q=foo+bar
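
A quick way to check this, assuming both strings are query-string values
(form encoding). URI.decode_www_form_component is in Ruby's standard uri
library. The equivalence only holds in a query string; in the path part
of a URL a + is a literal plus.

```ruby
require "uri"

# Both spellings decode to the same value under form (query-string) rules.
from_percent = URI.decode_www_form_component("foo%20bar")
from_plus    = URI.decode_www_form_component("foo+bar")
puts from_percent == from_plus
```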

carl · July 7, 2006, 1:12am

kyle wrote:

FYI, %20 and plus are interchangeable

As a footnote to this, I wanted to escape a URI I was sending as part
of a CGI query string to the W3C CSS validator. The URI included a “+”
in the path, which is normally completely valid, but not once it ends
up in a query string. So, one might expect to be able to do this:

"http://host/path/script.cgi?uri=" + URI.escape(@my_strange_url)

…but if @my_strange_url has “?”, “&” or “+” in it, ‘script.cgi’ will
get mighty confused because they won’t be escaped. I never did find a
satisfactory solution, because you can’t, say, call
URI.escape(URI.escape(@my_strange_url), "?+&"): the outer URI.escape
call would escape the “%” in the valid escape sequences already
generated by the inner call, thus breaking the URI.
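
One possible way around this, sketched on the assumption that the whole
inner URL is a single query-string value: encode it as a form component
instead of as a URI. URI.encode_www_form and URI.decode_www_form are in
the standard uri library of current Ruby versions; the example.org URL is
made up to stand in for @my_strange_url.

```ruby
require "uri"

# Hypothetical URL with "+", "?" and "&" in it, standing in for @my_strange_url.
my_strange_url = "http://example.org/a+b/page?x=1&y=2"

# encode_www_form escapes every reserved character in the value,
# so the receiving script sees one intact "uri" parameter.
request = "http://host/path/script.cgi?" + URI.encode_www_form("uri" => my_strange_url)
puts request

# Decoding the query string gives back the original URL.
decoded = URI.decode_www_form(request.split("?", 2).last)
```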

It isn’t possible to add to or subtract from a regular expression, so
you can’t take URI::REGEXP::UNSAFE and remove the “?”, “+” and “&” from
its exclusion character set. In the end you’re left copying the value of
URI::REGEXP::UNSAFE by hand and removing the offending characters. That
isn’t a nice solution: any later improvement or change to the way the
escape call works could invalidate, or at best deprecate, the
hand-rolled version.
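
For illustration, a hand-rolled version might look like the sketch below.
The name escape_component and its character class are made up for this
example, not copied from URI::REGEXP::UNSAFE: it leaves only the RFC 3986
unreserved characters alone and percent-encodes every other byte, which is
stricter than URI.escape.

```ruby
# Hypothetical helper: percent-encode every byte except the
# RFC 3986 unreserved characters (letters, digits, "-", ".", "_", "~").
def escape_component(str)
  # Work on raw bytes so multi-byte characters are encoded byte by byte.
  str.b.gsub(/[^A-Za-z0-9\-._~]/) { |byte| format("%%%02X", byte.ord) }
end

puts escape_component("a b+c?d&e")
```

Since “%” is not in the safe set, running this twice double-encodes the
string, the same pitfall as nesting URI.escape calls.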

Ruby’s normally good at helping you avoid these kinds of irritating
holdups but in this case one got me! I thought it was worth adding this
note to a rather old thread just in case anyone else with a similar
problem picks it up through the search engine, as I did.