URI.parse and whitespace characters


#1

Hi,

I have to parse a lot of bad links like:
http://www.something.com/bad link.jpg
(spaces in them)

URI.parse fails on them and treats them as invalid links.

However, in my case I need it to work. Is there a workaround for this?


#2

That’s not necessarily a bad URL; you just need to encode it properly. I
believe %20 is the code for a space.
Google gives me this summary:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
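
To make the %20 suggestion concrete, here is a minimal sketch. It only handles literal spaces, and parse_with_encoded_spaces is a hypothetical helper name made up for illustration, not part of the uri library:

```ruby
require 'uri'

# Hypothetical helper (not part of the uri library): percent-encode
# literal spaces, then hand the result to URI.parse.
def parse_with_encoded_spaces(raw)
  URI.parse(raw.gsub(' ', '%20'))
end

raw = 'http://www.something.com/bad link.jpg'

# The raw link is rejected because of the literal space.
begin
  URI.parse(raw)
rescue URI::InvalidURIError => e
  puts "raw link rejected: #{e.class}"
end

uri = parse_with_encoded_spaces(raw)
puts uri.host  # www.something.com
puts uri.path  # /bad%20link.jpg
```

Other characters that are illegal in a URL would still need their own handling; this only covers the space case from the question.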


#3

On Mon, Apr 13, 2009 at 12:54 PM, Marcelo B. removed_email_address@domain.invalid
wrote:


M.

uri = "http://www.something.com/bad link.jpg"
URI.parse(uri.gsub(/ /, '+')) # <- replaces all spaces with '+'
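
One caveat worth checking before relying on the substitution above: '+' only stands for a space in form-encoded data (query strings), so inside a path it normally stays a literal plus sign. The sketch below compares the two substitutions; percent_decode is a hypothetical helper written just for this comparison, and whether a given server accepts the '+' form is an assumption you would need to test.

```ruby
require 'uri'

# Hypothetical helper: decode %XX sequences only, leaving '+' untouched,
# which is how a path (as opposed to form data) is normally decoded.
def percent_decode(str)
  str.gsub(/%([0-9A-Fa-f]{2})/) { Regexp.last_match(1).hex.chr }
end

raw = 'http://www.something.com/bad link.jpg'

plus_uri    = URI.parse(raw.gsub(' ', '+'))
percent_uri = URI.parse(raw.gsub(' ', '%20'))

# Both parse, but they name different paths once decoded.
puts percent_decode(plus_uri.path)     # /bad+link.jpg
puts percent_decode(percent_uri.path)  # /bad link.jpg
```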

Andrew T.
http://ramblingsonrails.com
http://www.linkedin.com/in/andrewtimberlake

“I have never let my schooling interfere with my education” - Mark Twain


#4

Thanks for both the answers.

CGI.escape is my friend:)


#5

Just a small point - URI escaping and CGI escaping are similar but not
the same. To get URI escaping, require 'uri' and use URI.escape
instead of CGI.escape.
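
A small sketch of the difference, for anyone on a recent Ruby: URI.escape was removed in Ruby 3.0, so this uses URI.encode_www_form_component plus a '+' to '%20' substitution as a stand-in for URI-style escaping of a single path segment. escape_path_segment is a hypothetical name chosen for illustration.

```ruby
require 'cgi'
require 'uri'

# Hypothetical helper: URI-style escaping of one path segment.
# encode_www_form_component turns spaces into '+' and literal plus
# signs into '%2B', so every remaining '+' can safely become '%20'.
def escape_path_segment(segment)
  URI.encode_www_form_component(segment).gsub('+', '%20')
end

name = 'bad link.jpg'

puts CGI.escape(name)           # bad+link.jpg   (form-style escaping)
puts escape_path_segment(name)  # bad%20link.jpg (URI-style escaping)

# CGI.escape on a whole URL also escapes ':' and '/', which is
# usually not what you want before calling URI.parse.
puts CGI.escape('http://www.something.com/bad link.jpg')
# http%3A%2F%2Fwww.something.com%2Fbad+link.jpg
```

Escaping only the segment that contains the space, rather than the whole link, keeps the scheme and slashes intact.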

Regards,
Sean