Regular Expression help

On to my next learning exercise. As I parse a file I need to pull an IP
address out of a line. I thought a regular expression would be the
ticket, but it’s giving me a problem. The following line is an example
string I need to pull one of two IP addresses out of (they are not
always formed the same):

Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO
?192.168.1.2?) (222.222.222.22)

I need that last IP address. The problem is that I can’t always
count on it being enclosed in parens, although I can expect the right
paren to always be there.

So here’s my regex:
sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
And as you might expect, it is pulling both IP addresses. Is there a
way I can adjust the expression to grab the second IP testing for the
“)” or is there another method I can use? Short of dissecting the
entire string backwards and testing whether I have a number or a char,
decimal and at most 3 chars from it, etc?
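
For readers following along, here is a minimal sketch of what the scan call returns on the sample line. The names SAMPLE_LINE, DOTTED_QUAD and all_ips are made up for illustration, and taking the last element is only a naive fallback that assumes the wanted address is always the final one on the line:

```ruby
# Sample header from the question, joined onto one line.
SAMPLE_LINE = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
              "(HELO ?192.168.1.2?) (222.222.222.22)"

# Dotted-quad pattern with the dots escaped so they match literal dots only.
DOTTED_QUAD = /\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/

# Hypothetical helper: every dotted quad on the line, in order of appearance.
def all_ips(line)
  # With a capture group, scan returns an array of one-element arrays,
  # so flatten turns it into a plain list of strings.
  line.scan(DOTTED_QUAD).flatten
end

p SAMPLE_LINE.scan(DOTTED_QUAD)  # => [["192.168.1.2"], ["222.222.222.22"]]
p all_ips(SAMPLE_LINE).last      # => "222.222.222.22"
```

The hostname part (111-19-22-30) is not picked up because it uses hyphens rather than dots.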

tonyd

Hi –

On Sat, 29 Mar 2008, Tony De wrote:

count on it being enclosed in parens, although I can expect the right
paren to always be there.

So here’s my regex:
sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
And as you might expect, it is pulling both IP addresses. Is there a
way I can adjust the expression to grab the second IP testing for the
“)” or is there another method I can use? Short of dissecting the
entire string backwards and testing whether I have a number or a char,
decimal and at most 3 chars from it, etc?

What you want is an IP address, possibly followed by ‘)’ and
definitely coming at the end of the string (give or take a newline
character after it). That can be expressed like this:

/((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

I’ve got 3 occurrences of (\d{1,3}\.), followed by the same thing
without a dot. I’ve stipulated that this submatch be “looking at”
(i.e., positioned just before) an optional ‘)’ followed by the end of
the string. (\Z gives you end of string, ignoring a possible terminal
newline.)

With your line, it gives you:

irb(main):039:0> line[re] # re.match(line)[0], or whatever
=> "222.222.222.22"
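
As a rough, self-contained sketch of the lookahead approach described above: the constant LAST_IP and the helper last_ip are illustrative names, not something from the thread, and the second and third test strings are invented to exercise the optional paren and the trailing newline.

```ruby
# Three "digits plus dot" groups, a final digit group, then a lookahead that
# requires an optional ')' followed by end of string (\Z tolerates one
# trailing newline). The lookahead is not part of the returned match.
LAST_IP = /((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

# Hypothetical helper: returns the matched address, or nil if there is none.
def last_ip(line)
  line[LAST_IP]
end

header = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
         "(HELO ?192.168.1.2?) (222.222.222.22)"

p last_ip(header)                              # => "222.222.222.22"
p last_ip("(HELO ?192.168.1.2?) 10.0.0.7\n")   # => "10.0.0.7"
p last_ip("no address on this line")           # => nil
```

Because the pattern is tied to the end of the string, the HELO address is skipped even though it has the same shape.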

David

David, you rock. I’ll give it a try. Those expressions make my head
hurt, but I’ve been working through
“Regular Expression Tutorial - Learn How to Use Regular Expressions”. It seems to cover a
lot of foundation and application. Thanks again!

tonyd

On Sat, Mar 29, 2008 at 8:16 AM, Tony De wrote:

David, you rock. I’ll give it a try. Those expressions make my head
hurt, but I’ve been working through
“Regular Expression Tutorial - Learn How to Use Regular Expressions”. It seems to cover a
lot of foundation and application. Thanks again!

Another possibility (if I understood correctly): check for the
mandatory ‘)’ that closes the HELO part, followed by any characters
(this could be tightened to match the specific spaces), followed by
the numbers and dots for the IP:

a = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)"
a.match(/\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)[1]

gives: "222.222.222.22"
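
A slightly more defensive sketch of the same idea, in case it helps. The names AFTER_PAREN and ip_after_paren are hypothetical; the only change from the one-liner above is a nil check, since calling [1] directly on a failed match raises NoMethodError:

```ruby
# Literal ')' (the one closing the HELO part), a lazy skip over whatever
# follows, then a captured dotted quad.
AFTER_PAREN = /\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/

# Hypothetical helper: first dotted quad after a ')', or nil when the line
# has no ')' followed by an address.
def ip_after_paren(line)
  m = line.match(AFTER_PAREN)
  m && m[1]
end

a = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
    "(HELO ?192.168.1.2?) (222.222.222.22)"

p ip_after_paren(a)                                   # => "222.222.222.22"
p ip_after_paren("Received: from host 10.0.0.7")      # => nil
```

Note that this version depends on a ‘)’ appearing before the wanted address, so a line without the HELO part would return nil.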

Jesus.

Thanks Jesus,

I appreciate your input as well. You guys have been a great deal of
help.

tonyd

On 29 Mar 2008, at 07:16, Tony De wrote:

Those expressions make my head
hurt, but I’ve been working through
“Regular Expression Tutorial - Learn How to Use Regular Expressions”. It seems to cover a
lot of foundation and application. Thanks again!

You may find Rubular helpful for concocting your regular expressions
and decoding other people’s.

Regards,
Andy S.