Double quote in url causes URI.parse to raise URI::InvalidURIError

Detlef_R · December 20, 2013, 2:22am

I’m not sure if this is desired behavior or not, but when I pass a URL
with double quotes in it to URI.parse, it raises URI::InvalidURIError
(bad URI(is not URI?)). It appears that (at least some) browsers will
load a website that has double quotes in the URL.

Is this error raised for a particular reason (e.g. because the URL
specification excludes double quotes)? Or is it safe for me to ignore
this error?

davidjrice · December 20, 2013, 2:44am

On Thu, Dec 19, 2013 at 5:22 PM, David R. [email protected]
wrote:

I’m not sure if this is desired behavior or not, but when I pass a URL
with double quotes in it to URI.parse, it raises URI::InvalidURIError
(bad URI(is not URI?)).

Is this error raised for a particular reason (e.g. specification of url
excludes use of double quotes)?

Recommended reading: http://www.ietf.org/rfc/rfc1738.txt

davidjrice · December 20, 2013, 3:01am

On 20 December 2013 11:44, Hassan S. [email protected] wrote:

Recommended reading: http://www.ietf.org/rfc/rfc1738.txt

Technically RFC 1738 is obsolete, and in either case the syntax rules
are updated by RFC 3986. You should check out RFC 3986 (Uniform
Resource Identifier (URI): Generic Syntax), particularly Section 3 and
Appendix A.
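
As a rough illustration of what the RFC 3986 grammar means for the double quote, the sketch below classifies single characters using the unreserved, gen-delims and sub-delims sets from sections 2.2 and 2.3. The constant and method names are my own, not part of Ruby's URI library, and this is not how URI.parse is implemented.

```ruby
# Character sets as listed in RFC 3986, sections 2.2 and 2.3.
UNRESERVED = /\A[A-Za-z0-9\-._~]\z/
GEN_DELIMS = ':/?#[]@'.chars
SUB_DELIMS = "!$&'()*+,;=".chars

# Hypothetical helper: classifies a single character.
# Anything outside the three sets cannot appear literally in a URI
# (a "%" is only valid as the start of a percent-encoded triplet).
def uri_char_class(ch)
  return :unreserved if ch.match?(UNRESERVED)
  return :reserved if GEN_DELIMS.include?(ch) || SUB_DELIMS.include?(ch)
  :not_allowed_literally
end

['a', '~', '?', '&', '"', ' '].each do |ch|
  puts "#{ch.inspect} => #{uri_char_class(ch)}"
end
```

The double quote lands in the last category, so it has to be percent-encoded as %22 before it can be part of a valid URI.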

davidjrice · December 20, 2013, 3:58am

On Thu, Dec 19, 2013 at 6:00 PM, Matthew K. [email protected]
wrote:

Technically RFC 1738 is obsolete,

Per RFC 3986:

Updates: 1738
Obsoletes: 2732, 2396, 1808

If someone hasn’t (apparently) read the standards that govern the
context in which they’re working, I always suggest starting at the
beginning.

YMMV,

davidjrice · December 21, 2013, 5:12am

On Thu, Dec 19, 2013 at 7:22 PM, David R. [email protected]
wrote:

I’m not sure if this is desired behavior or not, but when I pass a URL
with double quotes in it to URI.parse, it raises URI::InvalidURIError
(bad URI(is not URI?)). It appears that (at least some) browsers will
load a website that has double quotes in the URL.

Is this error raised for a particular reason (e.g. specification of url
excludes use of double quotes)? Or is it safe for me to ignore this
error?

Others have pointed to the RFCs for URLs. What isn’t altogether clear is
that browsers quite often display URLs differently from how they process
them internally (to make them more human-readable, for example). While
it may look as though your browser is handling a literal double quote,
that is very likely not the case when it actually makes the request and
receives the response.

The URI module is quite precise: it is not trying to present anything in
a human-readable fashion, but to handle URIs in the form the network
protocols need them.
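
If the quotes are part of a query value, one way to get a URI the module will accept is to percent-encode the value before parsing. This is only a sketch under assumptions: the host, path, parameter name `q` and the helper `search_uri` are invented for the example, and it covers query strings only, not quotes elsewhere in a URL.

```ruby
require "uri"

# Hypothetical helper: builds a search URL with a safely encoded query.
def search_uri(term)
  # encode_www_form percent-encodes the value, so '"' becomes %22.
  query = URI.encode_www_form(q: term)
  URI.parse("http://example.com/search?#{query}")
end

uri = search_uri('"ruby"')
puts uri.to_s

# Decoding the query gives back the original value, quotes included.
puts URI.decode_www_form(uri.query).inspect
```

The encoded form is what goes over the wire; the quoted form is only what a person sees or types.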

davidjrice · December 20, 2013, 4:08am

Definitely understandable. I’ve looked at 1738, and see that (at that
time, at least), double quotation marks were considered unsafe. I’ll
take a look at 3986.

davidjrice · December 23, 2013, 5:15pm

Thanks, tamouse_m, that’s really helpful.