Hello,
I am trying to find a regex which matches only the domain name, without the .com
or .nl.
I tried:
\w{3+}
But on http://www.tamarawobben.nl/testhier it finds http, tamarawobben
and testhier.
How can I solve this?
Roelof
On 14-06-02, 10:51, Roelof W. wrote:
I try to find a regex which find only the domain name without the .com
or .nl
I tried :
\w{3+}
You need to limit your expression more: your example will find ANY 3
“word” characters, including digits and underscores, and even if there are
MORE word characters around them.
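To see why the pattern over-matches, here is a small sketch. It assumes the intended pattern was \w{3,} (three or more word characters); that reading of the original \w{3+} is a guess, not something stated in the thread.

```ruby
url = "http://www.tamarawobben.nl/testhier"

# \w matches letters, digits and the underscore, so every run of three or
# more such characters is a match, wherever it sits in the URL.
words = url.scan(/\w{3,}/)
p words  # ["http", "www", "tamarawobben", "testhier"]
```

Nothing in the pattern ties a match to the host part of the URL, which is why the scheme and the path segment show up as well.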
You could do this with a capture group (here, named “tld”):
regex = %r{
  \.            # one dot,
  (?<tld>       # capture as "tld":
    [a-z]{3,}   # 3+ alpha characters (note: not \w); use {2,} to allow .nl
  )
  $             # at the end of a line/string
}x
"http://example.com".match(regex)[:tld]
Or with a positive look-behind:
regex = %r{
  (?<=\.)     # lookbehind for one dot,
  [a-z]{3,}   # match 3+ alpha characters
  $           # at the end of a line/string
}x
"http://example.com".match(regex)[0]
Another approach is using the URI library:
require "uri"
URI.parse("http://example.com/").host.split(".").last
Andrew V.
rubular.com is a great site for testing regexes. Here is one for the
last regex given by Andrew V.:
Good luck
Andrew V. wrote on 2-6-2014 21:18:
regex =
Thanks,
But when I try all three in an online Ruby interpreter, they do not give any
answer.
Roelof
On Mon, Jun 2, 2014 at 7:51 PM, Roelof W. wrote:
How to solve this ?
That depends on your input. Do you want to find those domain names in
a larger text? Are you trying to parse URIs? Do you have fully qualified
domain names from which you want to extract a portion?
Kind regards
robert
On Mon, Jun 2, 2014 at 2:18 PM, Andrew V. wrote:
You need to limit your expression more, your example will find ANY 3
$ # at the end of a line/string
"http://example.com".match(regex)
Another approach is using the URI library:
URI.parse("http://example.com/").host.split(".").last
Andrew V.
OP seems to want the domain name without the TLD or subdomains. Andrew’s
URI solution actually seems quite the best (really no sense in rewriting
well-written regexps), but instead of the last part of the host, you’ll
want the penultimate part. Several ways you can get that. Here’s one:
URI.parse("http://www.tamarawobben.nl/testhier").host.split(".")[-2]
#=> "tamarawobben"
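Wrapped in a small method, the URI approach could look like the sketch below. The method name domain_name and the blog subdomain are made up for illustration. It also assumes a single-label TLD such as .nl or .com; a host ending in something like .co.uk would need extra handling.

```ruby
require "uri"

# Hypothetical helper: returns the host label directly before the TLD,
# or nil when the string has no host part.
def domain_name(url)
  host = URI.parse(url).host
  return nil if host.nil?

  host.split(".")[-2]
end

p domain_name("http://www.tamarawobben.nl/testhier")    # "tamarawobben"
p domain_name("http://tamarawobben.nl/index.html")      # "tamarawobben"
p domain_name("http://blog.tamarawobben.nl/index.html") # "tamarawobben"
```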
Roelof W. wrote on 3-6-2014 8:55:
When I do (<?=/[.|/]) or (<?=/[.|//]) I see a message that I have to
escape the /
Roelof
I tried this one: (?<=[.|//)(.*?)(?=.)
but I still get the error message that there are unescaped backslashes.
Roelof
Robert K. wrote on 3-6-2014 8:41:
testhier
How to solve this ?
That depends on your input. Do you want to find those domain names in
a larger text? Do you try to parse URIs? Do you have full qualified
domain names from which you want to extract a portion?
Kind regards
robert
I'm a little bit further.
I have this: (?<=.)(.*?)(?=.)
It seems to work, except that I have to make sure the .*? does not
include the /.
And in the (<?=/.) I have to find a way to include the //
When I do (<?=/[.|/]) or (<?=/[.|//]) I see a message that I have to
escape the /
Roelof
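For reference, here is one way the lookaround idea can be written in Ruby. A lookbehind is spelled (?<=...), not (<?=...), a literal dot needs a backslash, and the %r{...} literal lets / appear without escaping. The method name domain_label and the pattern are illustrative only, not the exercise's reference solution.

```ruby
# Hypothetical helper: collects every label that follows "//" or "." and is
# itself followed by a dot, then keeps the last one (the label before the TLD).
def domain_label(url)
  # (?<=//|\.)  preceded by "//" or by a literal dot
  # [^./]+      one or more characters that are neither "." nor "/"
  # (?=\.)      followed by a literal dot
  pattern = %r{(?<=//|\.)[^./]+(?=\.)}
  url.scan(pattern).last
end

p domain_label("http://www.tamarawobben.nl/testhier") # "tamarawobben"
p domain_label("http://tamarawobben.nl/index.html")   # "tamarawobben"
```

This stays fragile: a path containing several dots (for example /a.b.html) would produce extra matches, so it is only a sketch of the syntax.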
On Tue, Jun 3, 2014 at 10:11 AM, Roelof W. wrote:
Roelof W. wrote on 3-6-2014 8:55:
Robert K. wrote on 3-6-2014 8:41:
On Mon, Jun 2, 2014 at 7:51 PM, Roelof W. wrote:
…
I tried this one (?<=[.|//)(.*?)(?=.)
but still the error message that there are unescaped backslashes .
Please stop fullquoting - especially if you are not referring in any
way to the quoted text. Thank you.
Regards
robert
On 14-06-02, 23:41, Robert K. wrote:
That depends on your input. Do you want to find those domain names in
a larger text? Do you try to parse URIs? Do you have full qualified
domain names from which you want to extract a portion?
URI can extract from larger texts (URI.extract), parse URIs (URI.parse),
and after that it’s easy to split the domain parts from the
fully-qualified hostnames. Really, I don’t think there’s any point in
reinventing this using a Regexp… unless it’s just a learning exercise.
Andrew V.
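A rough sketch of that combination: pull the URLs out of a larger text with URI.extract, then parse each one and split its host. The sample text is made up. URI.extract is marked obsolete in newer Ruby versions, so treat this as an illustration rather than a recommendation.

```ruby
require "uri"

text = "See http://www.tamarawobben.nl/testhier and http://example.com/ for details"

# Find every http/https URL in the text, then take the label before the TLD.
urls  = URI.extract(text, ["http", "https"])
names = urls.map { |u| URI.parse(u).host.split(".")[-2] }
p names
```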
On Tue, Jun 3, 2014 at 6:50 PM, Roelof W. wrote:
But I think I will use a regex for finding the full domain and then use
http://<subdomain>.tamarawobben.nl/index.html
where all three tamarawobben.nl must be found.
Roelof
I tried to do it with a single regexp and I couldn’t get anything
useful, so instead I first used a regexp to extract the part
between the slashes (between http:// and the following /) and then
called split on “.” on the result. This way is much simpler. I’m not going
to give you the solution, so that you can try this approach yourself,
as this is a learning exercise.
Let me know if you get stuck.
Jesus.
On 3 June 2014 at 18:36, Andrew V. wrote:
> On 14-06-02, 23:41, Robert Klemme wrote:
> > That depends on your input. Do you want to find those domain names in
> > a larger text? Do you try to parse URIs? Do you have full qualified
> > domain names from which you want to extract a portion?
> URI can extract from larger texts (URI.extract), parse URIs (URI.parse),
> and after that it's easy to split the domain parts from the
> fully-qualified hostnames. Really, I don't think there's any point in
> reinventing this using a Regexp... unless it's just a learning exercise.
> Andrew V.
This is a learning exercise from Codewars.
But I think I will use a regex for finding the full domain and then use split to find only the part before the .com and so on.
I tried, and I think it's very difficult to find a single regex which can solve all these cases:
http://www.tamarawobben.nl/index.html
http://tamarawobben.nl/index.html
http://<subdomain>.tamarawobben.nl/index.html
where all three tamarawobben.nl must be found.
Roelof
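The two-step plan (a regex for the full host, then split) might be sketched as follows. The method name domain_from and the blog subdomain are placeholders, and the sketch assumes an http or https URL with a single-label TLD.

```ruby
# Hypothetical helper: step 1 captures everything between "//" and the
# next "/", step 2 splits that host on dots.
def domain_from(url)
  match = %r{\Ahttps?://([^/]+)}.match(url)
  return nil unless match

  match[1].split(".")[-2]  # the label just before the TLD
end

p domain_from("http://www.tamarawobben.nl/index.html")  # "tamarawobben"
p domain_from("http://tamarawobben.nl/index.html")      # "tamarawobben"
p domain_from("http://blog.tamarawobben.nl/index.html") # "tamarawobben"
```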