Hello friends,
I need to write a regular expression that extracts and returns the
domain name.
For example, if a user passes any of the URLs below, it should save only
"foo.com":
http://www.foo.com/
http://www.foo.com/something
http://foo.com/
https://something.foo.com/
Thanks for any help…
Thanks
abhis
A good way to start is to try it out in an online regular expression
editor:
http://rubular.com
Hey Srinivas,
Thanks for the reply.
I am able to get the output, but the only problem is that I have to
list all the TLDs explicitly (uk|com|net|org|in).
So I am just trying to figure out the best way to get the output.
url_pattern = /^(?:.+?\.)+(.+?\.(?:co\.uk|com|net|org|in))(:[0-9]{2,5})?(?:\/.*)?$/i
url = "http://www.foo.com"
url_pattern.match(url)
$1 #=> "foo.com"
Thanks
Abhishek
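One way to avoid listing TLDs in the pattern is to split the job in two: one regex pulls out the host, a second keeps the host's last two labels. This is only a sketch; the names `HOST_PATTERN`, `LAST_TWO_LABELS` and `last_two_labels` are made up for illustration, and it does not handle userinfo (`user@host`) or IP addresses.

```ruby
# Hypothetical helper, not from the thread: extract the host with one regex,
# then keep its last two dot-separated labels. No TLD list is involved.
HOST_PATTERN    = %r{\A[a-z][a-z0-9+.-]*://([^/:?#]+)}i
LAST_TWO_LABELS = /([^.]+\.[^.]+)\z/

def last_two_labels(url)
  host = url[HOST_PATTERN, 1]   # nil when the string has no scheme://host part
  return nil unless host
  host[LAST_TWO_LABELS, 1]      # nil when the host contains no dot
end

%w[
  http://www.foo.com/
  http://www.foo.com/something
  http://foo.com/
  https://something.foo.com/
].each { |url| puts last_two_labels(url) }   # prints foo.com four times
```

The trade-off is that a host such as www.foo.co.uk comes back as "co.uk", so multi-part suffixes still need separate handling.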
On Wed, Nov 11, 2009 at 10:25 PM, Abhishek shukla wrote:
> https://something.foo.com/
> Thanks for any help…
> Thanks
> abhis
require 'uri'
urls = [ "http://www.foo.com/", "http://www.foo.com/something",
         "http://foo.com/", "https://something.foo.com/" ]
urls.each { |url| puts URI.parse(url).host.split(".")[-2, 2].join(".") }
Good luck,
-Conrad
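The split-and-join approach keeps the last two labels, so a host like shop.foo.co.uk would come back as "co.uk". A minimal sketch of one way around that, assuming a small hand-picked list of two-label suffixes (the constant `TWO_LABEL_SUFFIXES` and the method `base_domain` are hypothetical, and the list is nowhere near complete):

```ruby
require 'uri'

# Assumption: a tiny illustrative sample of two-label suffixes, not a real list.
TWO_LABEL_SUFFIXES = %w[co.uk org.uk co.in].freeze

def base_domain(url)
  host = URI.parse(url).host    # raises URI::InvalidURIError on malformed input
  return nil if host.nil?       # e.g. a bare "foo.com" with no scheme
  labels = host.downcase.split('.')
  # Keep three labels when the host ends in a known two-label suffix, else two.
  keep = TWO_LABEL_SUFFIXES.include?(labels.last(2).join('.')) ? 3 : 2
  labels.last(keep).join('.')
end

puts base_domain('http://www.foo.com/something')   # foo.com
puts base_domain('https://shop.foo.co.uk/')        # foo.co.uk
```

For real traffic a maintained suffix list would be a better source than a hard-coded array.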
Hi Abhishek,
You can try the Addressable gem for this.
Step 1: Install the Addressable gem with the following command:
$ sudo gem install addressable
Step 2: I will demonstrate in IRB; you can try it there and then
integrate it into your Rails application.
$ irb
> require 'rubygems'
> require 'addressable/uri'
> uri = Addressable::URI.parse("http://google.com")
=> #<Addressable::URI:0xfdb9aee5c URI:http://google.com>
Step 3: You can extract just the host with the following command:
> uri.host
=> "google.com"
There are many other options you can explore:
http://addressable.rubyforge.org/api/classes/Addressable/URI.html
Hope this helps!
Best regards,
Srinivas I.
http://twitter.com/srinivasiyermv
On Thu, 2009-11-12 at 02:31 -0800, Conrad T. wrote:
> irb(main):002:0> require 'addressable/uri'
> => true
> irb(main):003:0> uri = Addressable::URI.parse("http://www.usc.edu/home.html")
> => #<Addressable::URI:0x90e89c URI:http://www.usc.edu/home.html>
> irb(main):004:0> uri.host
> => "www.usc.edu"
uri.host.split('.')[0]
Craig
# Given a URL, return a domain
def self.url_to_domain(url)
  begin
    host = URI.parse(self.fix_url(url)).host
    host.gsub(/\Awww\./, "")
  rescue
    ""
  end
end
On Wed, Nov 11, 2009 at 11:46 PM, Srinivas I. wrote:
> your rails application .
> => "google.com"
> There are many other different options which you can explore
> http://addressable.rubyforge.org/api/classes/Addressable/URI.html
> Hope this helps !
> Best regards,
> Srinivas I.
> http://talkonsomething.com
> http://twitter.com/srinivasiyermv
Hi, the Addressable gem doesn't produce just the domain part of the web
address.
For example,
irb(main):002:0> require 'addressable/uri'
=> true
irb(main):003:0> uri = Addressable::URI.parse("http://www.usc.edu/home.html")
=> #<Addressable::URI:0x90e89c URI:http://www.usc.edu/home.html>
irb(main):004:0> uri.host
=> "www.usc.edu"
-Conrad
Oops, forgot to add the other function I was using:
# Prepend URL with http if necessary
def self.fix_url(u)
  (u !~ %r{\A(?:http://|https://)}i) ? "http://#{u}" : u
end
Note that you need to require uri:
require 'uri'
I put this in a module called Utilities, so the whole thing is:
require 'uri'
module Utilities
  # Given a URL, return a domain
  def self.url_to_domain(url)
    begin
      host = URI.parse(self.fix_url(url)).host
      host.gsub(/\Awww\./, "")
    rescue
      ""
    end
  end
  # Prepend URL with http if necessary
  def self.fix_url(u)
    (u !~ %r{\A(?:http://|https://)}i) ? "http://#{u}" : u
  end
end
And you call it with Utilities::url_to_domain(u)
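To see what the module returns for the URLs from the original question, here is a self-contained check. It restates the module in compact form, assuming the dot after `www` is meant to be escaped and the scheme test is written with `%r{}`; the sample inputs are only illustrative.

```ruby
require 'uri'

module Utilities
  # Given a URL, return its host with a leading "www." removed.
  def self.url_to_domain(url)
    host = URI.parse(fix_url(url)).host
    host.gsub(/\Awww\./, "")
  rescue StandardError
    ""   # unparsable input, or no host at all
  end

  # Prepend "http://" when the string has no http/https scheme.
  def self.fix_url(u)
    u =~ %r{\Ahttps?://}i ? u : "http://#{u}"
  end
end

puts Utilities.url_to_domain("http://www.foo.com/something")   # foo.com
puts Utilities.url_to_domain("www.foo.com")                    # foo.com
puts Utilities.url_to_domain("https://something.foo.com/")     # something.foo.com
```

Only a leading `www.` is stripped, so other subdomains such as something.foo.com are returned unchanged.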