On to my next learning exercise. As I parse a file I need to pull an IP
address out of a line. Now I thought a regular expression would be the
ticket, but it’s giving me a problem. The following line is an example
string I need to pull one of two IP addresses out of (they are not
always formed the same):
Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)
I need that last IP address. Now the problem is that I can’t always
count on it being enclosed in parens, although I can expect the right
paren to always be there.
So here’s my regex exp:
sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
And as you might expect, it is pulling both IP addresses. Is there a
way I can adjust the expression to grab the second IP testing for the
“)” or is there another method I can use? Short of dissecting the
entire string backwards and testing whether I have a number or a char,
decimal and at most 3 chars from it, etc?
What you want is an IP address, possibly followed by ‘)’ and
definitely coming at the end of the string (give or take a newline
character after it). That can be expressed like this:
/((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/
I’ve got 3 occurrences of (\d{1,3}\.), followed by the same thing
without a dot. I’ve stipulated that this submatch be “looking at”
(i.e., positioned just before) an optional ‘)’ followed by the end of
the string. (\Z gives you end of string, ignoring a possible terminal
newline.)
With your line, it gives you:
irb(main):039:0> line[re] # re.match(line)[0], or whatever
=> "222.222.222.22"
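For anyone who wants to try this outside irb, here is a minimal sketch of the lookahead approach. The names `line`, `variant` and `re` are only illustrative, the first sample is the header quoted in this thread, and the second sample (hostname, address, missing left paren) is made up to cover the "right paren only" case:

```ruby
# Sample header as quoted in this thread.
line = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)"

# Made-up variant: no opening paren before the last address, trailing newline.
variant = "Received: from example.invalid (HELO ?192.168.1.2?) 10.0.0.7)\n"

# Three "digits plus dot" groups, one more digit group, then a lookahead
# that requires an optional ')' followed by end of string. \Z still
# matches if a single newline terminates the string.
re = /((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

last_ip    = line[re]               # whole match; the lookahead is not included
variant_ip = variant[re]            # same pattern, address without a left paren
none       = "no address here"[re]  # String#[] returns nil when nothing matches

puts last_ip
puts variant_ip
p none
```

Because the ')' sits inside the lookahead, it is checked but never becomes part of the returned string, so there is nothing to strip afterwards.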
Another possibility (if I understood correctly): check for the
mandatory ‘)’ in the HELO part, followed by any character (could be
changed by the specific spaces), followed by the numbers and dots for
the IP:
a = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)"
a.match(/\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)[1]
gives: "222.222.222.22"
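One thing to watch with this form: `match` returns nil when the pattern is not found, so calling `[1]` on the result raises NoMethodError for lines that have no ')' before an address. A small sketch that sidesteps this, assuming you would rather get nil back; the constant and method names here are hypothetical, and the pattern is the one from this reply with the paren and dots escaped:

```ruby
# A literal ')', then as few characters as possible, then a dotted quad
# captured in group 1.
IP_AFTER_PAREN = /\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/

# Hypothetical helper: returns the captured address, or nil if no match.
def ip_after_paren(line)
  # String#[] with a regexp and a capture index returns nil on no match,
  # instead of raising the way line.match(...)[1] would.
  line[IP_AFTER_PAREN, 1]
end

a = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)"

found   = ip_after_paren(a)
missing = ip_after_paren("no closing paren before 10.0.0.7")

puts found
p missing
```

Note that this version relies on the HELO part always ending in ')'; if a line can arrive without it, the end-of-string lookahead may be the safer of the two.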
Jesus.
Thanks Jesus,
I appreciate your input as well. You guys have been a great deal of
help.