URI.parse and whitespace characters

arvias · April 13, 2009, 12:54pm

Hi,

I have to parse a lot of bad links like:
http://www.something.com/bad link.jpg
(spaces in them)

URI.parse fails on them and treats them as invalid links.

However, in my case I need it to work. Is there a workaround for this?

arvias · April 13, 2009, 1:03pm

That’s not necessarily a bad URL; you just need to encode it properly. I
believe %20 is the code for a space.
Google gives me this summary:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
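A minimal sketch of that idea, assuming spaces are the only offending characters in the links. The helper name parse_link_with_spaces is made up for illustration and is not part of the uri library:

```ruby
require 'uri'

# Hypothetical helper (not from the uri library): percent-encode literal
# spaces so that URI.parse accepts the link. Only spaces are handled here;
# other illegal characters would still be rejected.
def parse_link_with_spaces(link)
  URI.parse(link.gsub(' ', '%20'))
end

raw = 'http://www.something.com/bad link.jpg'

# The raw link is rejected because of the literal space.
begin
  URI.parse(raw)
rescue URI::InvalidURIError => e
  puts "rejected: #{e.message}"
end

# After encoding the space, parsing succeeds.
uri = parse_link_with_spaces(raw)
puts uri.host # www.something.com
puts uri.path # /bad%20link.jpg
```

The gsub is deliberately narrow: it leaves the scheme, host and slashes alone and only touches the spaces.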

arvias · April 13, 2009, 1:07pm

On Mon, Apr 13, 2009 at 12:54 PM, Marcelo B. wrote:

require 'uri'
uri = "http://www.something.com/bad link.jpg"
URI.parse(uri.gsub(/ /, '+')) # replaces all spaces with '+'

Andrew T.
http://ramblingsonrails.com
http://www.linkedin.com/in/andrewtimberlake

“I have never let my schooling interfere with my education” - Mark Twain

arvias · April 13, 2009, 1:34pm

Thanks for both the answers.

CGI.escape is my friend:)
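One thing worth checking before relying on it: CGI.escape is meant for individual form values, so applied to a whole link it also escapes the ':' and '/' characters. A small sketch of what it returns, using the example link from the first post:

```ruby
require 'cgi'

link = 'http://www.something.com/bad link.jpg'

# Escaping the whole link also encodes the scheme separator and slashes.
puts CGI.escape(link)
# http%3A%2F%2Fwww.something.com%2Fbad+link.jpg

# Escaping just the file name leaves the rest of the link alone,
# but the space becomes '+' rather than %20.
puts CGI.escape('bad link.jpg')
# bad+link.jpg
```

The '+' form is the convention for form data and query strings; in the path part of a URL a '+' is normally read as a literal plus sign.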

arvias · April 14, 2009, 12:24am

Just a small point: URI escaping and CGI escaping are similar but not
the same. CGI.escape turns a space into '+', while URI escaping turns it
into %20. To get URI escaping, require 'uri' and use URI.escape
instead of CGI.escape.

Regards,
Sean
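A short sketch of the difference, for readers on newer Rubies: URI.escape was later marked obsolete and removed in Ruby 3.0, so this example assumes the RFC 2396 parser class that ships with the uri library as a stand-in for it. The helper name escape_link is made up for illustration:

```ruby
require 'uri'

# Hypothetical helper: URI-style escaping of a whole link. Assumes
# URI::RFC2396_Parser is available (Ruby 2.2 and later).
def escape_link(link)
  URI::RFC2396_Parser.new.escape(link)
end

link = 'http://www.something.com/bad link.jpg'

# URI-style escaping keeps ':' and '/' and turns the space into %20.
escaped = escape_link(link)
puts escaped                 # http://www.something.com/bad%20link.jpg
puts URI.parse(escaped).path # /bad%20link.jpg

# Form-style escaping of a single component turns the space into '+'.
puts URI.encode_www_form_component('bad link.jpg') # bad+link.jpg
```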