Question about text parsing

Hello,

I am new to Ruby and have a question about parsing text.

I am trying to parse out the first IP address from a string. I have a
working solution, but it seems like a rather roundabout way to do it.
This is my current method:

str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
ipAddr = ((str.split(':'))[1].split)[0]

Is there a better way to do this? Thanks in advance.

On Wednesday 22 July 2009 14:58:47 Stephen Beard wrote:

ipAddr = ((str.split(':'))[1].split)[0]

Is there a better way to do this? Thanks in advance.

regexp:
str.scan(/addr:([^\s]+)/)[0][0]

string math:
str.slice(str.index(':')+1, str.index(' B')-str.index(':')-1)

Thanks for the quick and thorough replies. They all look great.

I am so terrible with regular expressions. Those links look like a good
place to go to improve, thanks Rob.

On Jul 22, 2009, at 6:18 PM, spox wrote:

str = " inet addr:192.168.1.118 Bcast:192.168.1.255

irb> str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
=> " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
irb> re = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/
=> /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/
irb> str.scan(re)
=> ["192.168.1.118", "192.168.1.255", "255.255.255.0"]

So the first one is just:
str.scan(re).first
or
str.scan(re)[0]

If you want to be tighter about matching valid IP addresses, look at
the alternate regexps about halfway down this page:
http://www.regular-expressions.info/examples.html

or the grand-daddy one with explanation of all the details:
http://www.regular-expressions.info/regexbuddy/ipaccuratecapture.html
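
For tighter validation without a long regexp, one option is to find candidates with the loose pattern and let Ruby's standard ipaddr library judge each one. This is only a sketch under that assumption; valid_ipv4? and first_ipv4 are hypothetical helper names, not something from the linked pages.

```ruby
require 'ipaddr'

# Loose pattern: four dot-separated runs of one to three digits.
LOOSE_IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/

# Hypothetical helper: true when +candidate+ parses as an IPv4 address.
def valid_ipv4?(candidate)
  IPAddr.new(candidate).ipv4?
rescue ArgumentError
  # IPAddr raises a subclass of ArgumentError for malformed addresses.
  false
end

# Hypothetical helper: first candidate in +text+ that IPAddr accepts, or nil.
def first_ipv4(text)
  text.scan(LOOSE_IPV4).find { |candidate| valid_ipv4?(candidate) }
end

str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"
first_ip = first_ipv4(str)
```

The trade-off is an extra parsing step per candidate in exchange for a regexp that stays readable.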

If the regexp contains capturing groups, String#scan returns the captured groups instead of the whole match:

irb> re1 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/
=> /\b(?:\d{1,3}\.){3}\d{1,3}\b/
irb> re2 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
=> /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
irb> re3 = /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
=> /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/x
irb> [re, re1, re2, re3].each do |r|
?> puts '-'*30
irb> puts r
irb> p str.scan(r)
irb> end; nil
------------------------------
(?-mix:\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
------------------------------
(?-mix:\b(?:\d{1,3}\.){3}\d{1,3}\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
------------------------------
(?x-mi:\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
[["192", "168", "1", "118"], ["192", "168", "1", "255"], ["255", "255", "255", "0"]]
------------------------------
(?x-mi:\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
["192.168.1.118", "192.168.1.255", "255.255.255.0"]
=> nil
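
If you do want the capturing form (say, to work with the octets as numbers), the groups can be joined back into dotted strings afterwards. A small sketch of that idea; OCTET and CAPTURING are names made up for this example, built from the same octet pattern as above.

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# One octet, 0-255. Interpolating a Regexp wraps it in a non-capturing group.
OCTET = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/
CAPTURING = /\b(#{OCTET})\.(#{OCTET})\.(#{OCTET})\.(#{OCTET})\b/

# With capture groups, scan yields one array of four octet strings per match.
octet_groups = str.scan(CAPTURING)

# Rebuild the dotted form from the captured pieces.
addresses = octet_groups.map { |octets| octets.join('.') }

# Or treat the first address as integers.
first_octets = octet_groups.first.map(&:to_i)
```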

-Rob

Rob B. http://agileconsultingllc.com

Hi –

On Thu, 23 Jul 2009, 7stud – wrote:

Stephen Beard wrote:

Thanks for the quick and thorough replies. They all look great.

I am so terrible with regular expressions. Those links look like a good
place to go to improve, thanks Rob.

Always look to a string method first. split() rules the world, and
you’ve made good use of it.

String method isn’t the opposite of regular expression, though. It’s
important to understand regexes to use split (as well as scan and
(g)sub) effectively.
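
To illustrate the point, here is a rough sketch of split and scan each taking a regexp on the string from this thread. The variable names are only for illustration.

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# split accepts a regexp: break on any run of colons or whitespace.
tokens = str.strip.split(/[:\s]+/)
ip_from_split = tokens[2]

# scan with two capture groups yields [label, value] pairs,
# which convert directly into a hash.
fields = Hash[str.scan(/(\w+):(\S+)/)]
ip_from_scan = fields["addr"]
```

The hash version does not depend on the address being the third token, so it should hold up better if the field order changes.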

David

Stephen Beard wrote:

Thanks for the quick and thorough replies. They all look great.

I am so terrible with regular expressions. Those links look like a good
place to go to improve, thanks Rob.

Always look to a string method first. split() rules the world, and
you’ve made good use of it.

Hi –

On Thu, 23 Jul 2009, spox wrote:

Mask:255.255.255.0"
ipAddr = ((str.split(':'))[1].split)[0]

Is there a better way to do this? Thanks in advance.

regexp:
str.scan(/addr:([^\s]+)/)[0][0]

string math:
str.slice(str.index(':')+1, str.index(' B')-str.index(':')-1)

Here’s a technique involving subscripting a string with a regular
expression. The number 1 at the end causes the whole thing to return
the contents of the first parenthetical capture (which is all
consecutive non-space characters following the first colon).

str[/:(\S+)/,1]
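
A few variations on the same subscript, as a sketch: without the second argument you get the whole match, a failed match gives nil rather than an error, and a named group can be requested by name. The group name ip is just an example.

```ruby
str = " inet addr:192.168.1.118 Bcast:192.168.1.255 Mask:255.255.255.0"

# No second argument: the whole match, leading colon included.
whole = str[/:\S+/]

# Second argument 1: only the first capture group.
first_capture = str[/:(\S+)/, 1]

# No match returns nil, so the result can be tested directly.
missing = "no colon here"[/:(\S+)/, 1]

# A named group can be selected by its name instead of its number.
named = str[/addr:(?<ip>\S+)/, "ip"]
```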

David

Perfect.

Thanks.
Wesley C…