I’m searching through the .com domains for any www name with Cisco in
it.
I’m using a regexp while reading, line by line, a file obtained from
Verisign that lists all the domain names with a .com extension.
Here is my reg exp:
file.each { |line| print line if line =~ /(C|c)isco|CISCO/ }
but I’m getting results like SanFrancisco and Francisco.
Does anyone know how I can modify my reg exp to not include certain
keywords like SanFrancisco and Francisco?
-Chuck
On Sep 7, 2007, at 12:58 AM, Charles P. wrote:
Does anyone know how I can modify my reg exp to not include certain
keywords like SanFrancisco and Francisco?
You need to assert word boundaries:
/\b((C|c)isco|CISCO)\b/
– fxn
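To see what the word boundaries change, here is a minimal sketch run against a handful of made-up names. `SAMPLE_NAMES` and `bounded_matches` are hypothetical, introduced only for illustration, and the real Verisign file may be laid out differently.

```ruby
# Hypothetical sample lines; the real input is the Verisign .com list.
SAMPLE_NAMES = %w[
  www.cisco.com
  www.sanfrancisco.com
  www.francisco.com
  www.my-cisco.com
  www.CISCO.com
]

# The pattern suggested above: "cisco" must start and end at a word boundary.
WORD_BOUNDED = /\b((C|c)isco|CISCO)\b/

# Return only the names in which the bounded pattern matches.
def bounded_matches(names)
  names.select { |name| name =~ WORD_BOUNDED }
end

puts bounded_matches(SAMPLE_NAMES)
```

In "sanfrancisco" the letters before "cisco" are word characters, so `\b` cannot match there, while a dot or hyphen before "cisco" does count as a boundary.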
Hi
I don’t know if you need to check the caps, but your regex could be
/cisco/i
(the i is for case insensitive)
To “ban” certain words, you could use a lookaround expression (I don’t
know if Ruby supports it), but I personally find it cleaner to check
programmatically against a list, and add the “exceptions” to that list.
Hope it helps.
Diego.
The boundaries could be a good idea, but if he needs URLs like
ciscosystems.com, the pattern with the final \b won’t match them. Keep
that in mind.
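A small sketch of that caveat, comparing a boundary on both sides with a leading boundary only. The names in `CAVEAT_NAMES` and the helper `names_matching` are made up for the example.

```ruby
# Hypothetical sample names, not taken from the Verisign file.
CAVEAT_NAMES = %w[www.cisco.com www.ciscosystems.com www.sanfrancisco.com]

# Return the names that match the given pattern.
def names_matching(names, pattern)
  names.select { |name| name =~ pattern }
end

# Boundary on both sides: "cisco" must stand alone as a word.
both_sides = names_matching(CAVEAT_NAMES, /\bcisco\b/i)

# Leading boundary only: "cisco" may be followed by more letters.
leading_only = names_matching(CAVEAT_NAMES, /\bcisco/i)

puts "both sides:   #{both_sides.join(', ')}"
puts "leading only: #{leading_only.join(', ')}"
```

Only the second pattern keeps www.ciscosystems.com, and both still reject www.sanfrancisco.com.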
How about:
/\b([Cc]isco|CISCO)\b/
or even
/\bcisco\b/i
I’d probably use a second rx like
file.each { |line| print line if line =~ /\bcisco/i && line !~ /sanfrancisco/i }
Kind regards
robert
On Fri, 7 Sep 2007, Charles P. wrote:
file.each { |line| print line if line =~ /(C|c)isco|CISCO/ }
…
Does anyone know how I can modify my reg exp to not include
certain keywords like SanFrancisco and Francisco?
How about:
/\b([Cc]isco|CISCO)\b/
or even
/\bcisco\b/i
?
Posted by Charles P. (chuckdawit) on 07.09.2007 00:58
Does anyone know how I can modify my reg exp to not include certain
keywords like SanFrancisco and Francisco?
Maybe also try something like:
print line if line =~ /^(?!.*(?:SanFrancisco|Francisco)).*\bcisco/i
Cheers
j.k.
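For anyone unsure how a negative lookahead behaves here, a minimal sketch with a simplified pattern. `NOT_FRANCISCO`, `LOOKAHEAD_NAMES` and `lookahead_matches` are hypothetical names used only for this example.

```ruby
# Reject any line containing "francisco", then require "cisco" somewhere.
# (?!...) is a negative lookahead: it succeeds only if its contents
# cannot match at that position, and it consumes no characters.
NOT_FRANCISCO = /^(?!.*francisco).*cisco/i

# Hypothetical sample names for illustration.
LOOKAHEAD_NAMES = %w[
  www.cisco.com
  www.sanfrancisco.com
  www.ciscofrancisco.com
  www.mycisco.com
]

# Return the names accepted by the lookahead pattern.
def lookahead_matches(names)
  names.select { |name| name =~ NOT_FRANCISCO }
end

puts lookahead_matches(LOOKAHEAD_NAMES)
```

Note that this simplified pattern throws out the whole line whenever "francisco" appears anywhere in it, even if the line also contains a separate "cisco".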
On Sep 6, 4:58 pm, Charles P. [email protected] wrote:
Does anyone know how I can modify my reg exp to not include certain
keywords like SanFrancisco and Francisco?
Instead of modifying the regex, how about simply working on the data
until it’s right?
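# Note: despite the name, these patterns mark the matches to throw away.
ACCEPTABLE = [ /francisco/i, /scisco/i ]
# Keep every line that mentions cisco, then drop the unwanted ones.
matches = file.readlines.select{ |line| line =~ /[Cc]isco|CISCO/ }
ACCEPTABLE.each{ |re| matches.delete_if{ |line| line =~ re } }
puts matches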
Gavin K. wrote:
Instead of modifying the regex, how about simply working on the data
until it’s right?
If I want to print the line to a new file I use this:
dnsfile.each { |line| newfile.puts line if line =~ /\b((C|c)isco|CISCO)\b/ }
What can I do if I want to print it to the new file and to the
screen at the same time?
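One possible way to do this is to send each matching line to every output in turn. This is only a sketch: `copy_matches` is a hypothetical helper, the pattern is the case-insensitive variant suggested earlier in the thread, and `StringIO` objects stand in for the real files so the example runs on its own.

```ruby
require "stringio"

# Write every line of source that matches the pattern to each output.
# Any object that responds to puts (a File, $stdout, a StringIO) will do.
def copy_matches(source, *outputs)
  source.each_line do |line|
    next unless line =~ /\bcisco\b/i
    outputs.each { |out| out.puts line }
  end
end

# Stand-ins for the real files (assumption: one domain name per line).
dns_file = StringIO.new("www.cisco.com\nwww.sanfrancisco.com\n")
new_file = StringIO.new

# Each matching line goes to the "file" and to the screen.
copy_matches(dns_file, new_file, $stdout)
```

With real files, the same call would take the opened input file, the opened output file, and `$stdout` as its arguments.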