How to extract the domain name without the subdomain from a URL

leakhina · June 23, 2009, 5:46am

Hi everyone,

Does anyone know how to extract the domain name, without the subdomain, from a URL?

Example: http://test.domain.com => http://domain.com

Please give me some example code in Ruby.

Thanks,
Leakhina

leakhina · June 23, 2009, 10:11am

Chem Leakhina wrote:

This is actually quite difficult, because there is a multitude of
possible second-level domains which can be used (such as .co.uk), and
they are not really standardized. Just picking one at random, the
country of Jordan has .com.jo, .net.jo, .gov.jo, .edu.jo, .org.jo,
.mil.jo, .name.jo, and .sch.jo.

If one were to ignore such things, then it becomes easier:

$ irb
irb(main):001:0> require 'uri'
=> true
irb(main):002:0> u = URI.parse "http://test.domain.com/"
=> #<URI::HTTP:0xb7bbf848 URL:http://test.domain.com/>
irb(main):003:0> u.host
=> "test.domain.com"
irb(main):004:0> u.host.split(".")[-2,2]
=> ["domain", "com"]
irb(main):005:0> u.host.split(".")[-2,2].join(".")
=> "domain.com"

However, as mentioned above, there are a lot of domains this will not
work for.
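One way to cover some of those cases is to check the last two labels against a list of known multi-label suffixes before deciding how many labels to keep. The following is only a minimal sketch: the method name `registrable_domain` and the constant `MULTI_LABEL_SUFFIXES` are made up for illustration, and the list is a tiny hand-picked sample, not a complete one. A maintained list such as the Public Suffix List would be needed for real coverage.

```ruby
require 'uri'

# Hypothetical, deliberately tiny sample of multi-label suffixes.
# A real list would contain far more entries than this.
MULTI_LABEL_SUFFIXES = %w[co.uk org.uk com.jo net.jo org.jo].freeze

# Returns the domain without subdomains, or nil if the URL has no host.
def registrable_domain(url)
  host = URI.parse(url).host
  return nil if host.nil?

  labels = host.downcase.split('.')
  # Keep three labels when the last two form a known suffix, otherwise two.
  keep = MULTI_LABEL_SUFFIXES.include?(labels.last(2).join('.')) ? 3 : 2
  labels.last(keep).join('.')
end

puts registrable_domain("http://test.domain.com/")
puts registrable_domain("http://www.google.co.uk/")
```

Any suffix missing from the list falls back to the naive two-label behaviour shown above.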

-Justin

leakhina · June 23, 2009, 11:12am

2009/6/23 Justin C.:

However, as mentioned above, there are a lot of domains this will not work
for.

We can get better results by ignoring particular known domain prefixes
such as “ftp” and “www”:

# this works with 1.8 and 1.9

%w{
  www.google.com
  google.co.uk
  www.google.co.uk
  foo.bar
}.each do |domain|
  # strip a leading "www." or "ftp.", then take the first label
  dom = domain.sub(/^(?:www|ftp)\./, '')[/^[^.]+/]
  printf "%p -> %p\n", domain, dom

  # alternative: do both steps with a single regexp
  dom = domain[/^(?:(?:ftp|www)\.)?([^.]+)/, 1]
  printf "%p -> %p\n", domain, dom
end
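
To get back to the form asked for in the original question (a URL rather than a bare label), the same prefix-stripping idea can be applied to the host of a parsed URL. This is a rough sketch under assumptions: `strip_known_prefix` and `KNOWN_PREFIXES` are names invented here, and only the listed prefixes are removed, so a host like test.domain.com is returned unchanged.

```ruby
require 'uri'

# Assumed list of prefixes to drop; extend as needed.
KNOWN_PREFIXES = %w[www ftp].freeze

# Returns "scheme://host" with one leading known prefix removed from the host,
# or nil if the URL has no host.
def strip_known_prefix(url)
  uri = URI.parse(url)
  return nil if uri.host.nil?

  host = uri.host.sub(/\A(?:#{KNOWN_PREFIXES.join('|')})\./, '')
  "#{uri.scheme}://#{host}"
end

puts strip_known_prefix("http://www.google.co.uk/")
puts strip_known_prefix("ftp://ftp.example.org/pub")
```
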

Kind regards

robert