_ in URI subdomain problem

Hi,

I am currently working with some code for fetching webpages, and I have
run into a problem. The current implementation does not fetch webpages
with _ in the subdomain, for ex http://a_b.google.com.

I have poked around the forum posts and read that the _ in the subdomain
violates an RFC standand, but in my case it is necessary to retrieve
those pages regardless. Before I dive a bit more into this code that I
inherited, has anyone successfully retrieved such pages?

The code uses URI.parse for URI parsing and Net::HTTP for page
retrieval. Currently the code breaks at the URI.parse. Will it suffice
just to rewrite the URI.parse or do I need to find an alternative to
Net::HTTP as well?

Thanks in advance.

The code uses URI.parse for URI parsing and Net::HTTP for page
retrieval. Currently the code breaks at the URI.parse. Will it suffice
just to rewrite the URI.parse or do I need to find an alternative to
Net::HTTP as well?

Should be able to extend URI.parse and have it work there. Good luck!

I think I found the answer in URI.escape after discovering the goldmine
that is the searchable Ruby list.

http://blade.nagaokaut.ac.jp/ruby/ruby-talk/index.shtml#ruby_talk

Aredridel wrote:

The code uses URI.parse for URI parsing and Net::HTTP for page
retrieval. Currently the code breaks at the URI.parse. Will it suffice
just to rewrite the URI.parse or do I need to find an alternative to
Net::HTTP as well?

Should be able to extend URI.parse and have it work there. Good luck!

This forum is not affiliated to the Ruby language, Ruby on Rails framework, nor any Ruby applications discussed here.

| Privacy Policy | Terms of Service | Remote Ruby Jobs