Ruby regex match hex string

luislavena · March 31, 2012, 8:58pm

Hey guys,

I was wondering if someone could help me out.
I’m trying to validate a string with a regex but can’t get it working.
I need to validate a string containing hex pairs divided by a comma,
like this

11,1,aa,a,1b,3b,55,b6

I got this regex:

/\A([0-9a-fA-F]{1,2},{1})*([0-9a-fA-F]{1,2}){1}\Z/

but it only validates the last two pairs, 55, and b6

It shouldn’t validate strings like this:

111 or strings like aaa,aaa,bob,ccc

Can anybody help me out with some pointers?
I’m new to regular expressions.

Thanks a lot!

niels_ruby · March 31, 2012, 9:31pm

Hi,

What makes you think that only the last two pairs are matched? This is
impossible, because you’re using the \A and \Z anchors. Either the whole
string matches or it doesn’t match at all.

The regex is actually correct and it does match the example string.

However, you can leave out the {1} quantifier, because it has no effect
(it means repeating the pattern exactly once). And you can make the
regex shorter by using the “i” modifier, which makes it case
insensitive:

/\A(?:[0-9a-f]{1,2},)*[0-9a-f]{1,2}\z/i
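As a quick check, the pattern above can be wrapped in a small helper and run against the strings from the question. This is only a sketch: the names HEX_LIST, LAX_HEX_LIST and hex_list? are made up for illustration. It also shows what the closing anchor changes: \Z still matches before a trailing newline, while \z matches only at the very end of the string.

```ruby
# Hypothetical names, chosen for this example only.
HEX_LIST     = /\A(?:[0-9a-f]{1,2},)*[0-9a-f]{1,2}\z/i  # strict end of string
LAX_HEX_LIST = /\A(?:[0-9a-f]{1,2},)*[0-9a-f]{1,2}\Z/i  # tolerates one trailing "\n"

# True when str is a comma-separated list of 1- or 2-digit hex values.
def hex_list?(str)
  !!(str =~ HEX_LIST)
end

["11,1,aa,a,1b,3b,55,b6", "AA,Ff", "111", "aaa,aaa,bob,ccc", "11,", ""].each do |s|
  puts "#{s.inspect} => #{hex_list?(s)}"
end

# The two anchors differ only when the string ends with a newline.
puts !!("11,aa\n" =~ HEX_LIST)      # false
puts !!("11,aa\n" =~ LAX_HEX_LIST)  # true
```

If the input comes from gets or a file, either chomp it first or keep \Z on purpose.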

niels_ruby · March 31, 2012, 9:36pm

I hope this helps

puts "11,1,aa,a,1b,3b,55,b6,6l,mm".scan(/([0-9a-fA-F][0-9a-fA-F])/)

any better idea??

niels_ruby · March 31, 2012, 10:10pm

On Sat, Mar 31, 2012 at 9:58 PM, Niels S. wrote:

/\A([0-9a-fA-F]{1,2},{1})*([0-9a-fA-F]{1,2}){1}\Z/

You can use \h to match a hexadecimal digit as well, which matches both
lower and upper case a-f.

/\A(\h{1,2},)*\h{1,2}\z/
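A short sketch of the \h version run against the strings from this thread. The constant name and the sample list are made up for illustration, and \h needs Ruby 1.9 or later.

```ruby
# Hypothetical constant name; the pattern is the \h version from this post.
H_LIST = /\A(\h{1,2},)*\h{1,2}\z/

samples = ["11,1,aa,a,1b,3b,55,b6", "AB,cd,0F", "111", "aaa,aaa,bob,ccc", "6l,mm"]

samples.each do |s|
  # =~ returns the match position (truthy) or nil
  verdict = s =~ H_LIST ? "valid" : "invalid"
  puts "#{s}: #{verdict}"
end
```

No "i" modifier is needed, since \h already covers both cases.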

Regards,
Ammar

niels_ruby · March 31, 2012, 10:16pm

On Sat, Mar 31, 2012 at 8:31 PM, Jan E. wrote:

Hi,

What makes you think that only the last two pairs are matched? This is
impossible, because you’re using the \A and \Z anchors. Either the whole
string matches or it doesn’t match at all.

The regex is actually correct and it does match the example string.

The confusion is that only two capturing groups are returned
(while the first part between brackets matches many times).

1.9.3p125 :001 > s = "11,1,aa,a,1b,3b,55,b6"
=> "11,1,aa,a,1b,3b,55,b6"
1.9.3p125 :002 > s.match(/\A([0-9a-fA-F]{1,2},{1})*([0-9a-fA-F]{1,2}){1}\Z/)
=> #<MatchData "11,1,aa,a,1b,3b,55,b6" 1:"55," 2:"b6">
1.9.3p125 :003 > s.match(/\A([0-9a-fA-F]{1,2},)*([0-9a-fA-F]{1,2})\Z/)
=> #<MatchData "11,1,aa,a,1b,3b,55,b6" 1:"55," 2:"b6">
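The same point as a script, in case it helps: the whole string is matched, but a group under * keeps only the text of its last repetition. A minimal sketch; the constant name TWO_GROUPS is made up, and the pattern is the one from the session above without the redundant {1}.

```ruby
# Hypothetical constant name; group 1 is repeated by *, group 2 is the final value.
TWO_GROUPS = /\A([0-9a-fA-F]{1,2},)*([0-9a-fA-F]{1,2})\Z/

m = "11,1,aa,a,1b,3b,55,b6".match(TWO_GROUPS)

puts m[0]      # 11,1,aa,a,1b,3b,55,b6  (the whole string matched)
p m.captures   # ["55,", "b6"]  (group 1 holds only its last repetition)
p m.pre_match  # ""  (nothing was skipped before the match)
```

So seeing only "55," and "b6" in the MatchData does not mean the earlier values were ignored.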

And as mentioned by gabe, maybe scan is what the OP was after;

1.9.3p125 :004 > s.scan(/[0-9a-f]{1,2}/i).inspect
=> "[\"11\", \"1\", \"aa\", \"a\", \"1b\", \"3b\", \"55\", \"b6\"]"

1.9.3p125 :005 > s.scan(/([0-9a-f]{1,2}),/i).inspect # more selective on the comma
=> "[[\"11\"], [\"1\"], [\"aa\"], [\"a\"], [\"1b\"], [\"3b\"], [\"55\"]]"
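One caveat with scan: it extracts whatever pieces fit and never rejects the string, so it is not a validator by itself. A possible way to combine the two, sketched under the assumption that the goal is "validate first, then pull out the values". The names STRICT_HEX_LIST and parse_hex_list are made up for this example.

```ruby
# Hypothetical names for illustration.
STRICT_HEX_LIST = /\A(?:[0-9a-f]{1,2},)*[0-9a-f]{1,2}\z/i

# Returns the values as integers, or nil when the input is not a valid list.
def parse_hex_list(str)
  return nil unless str =~ STRICT_HEX_LIST
  str.split(",").map { |pair| pair.hex }
end

# scan alone happily picks pieces out of an invalid string:
p "aaa,aaa,bob,ccc".scan(/[0-9a-f]{1,2}/i)  # ["aa", "a", "aa", "a", "b", "b", "cc", "c"]

p parse_hex_list("aaa,aaa,bob,ccc")         # nil
p parse_hex_list("11,1,aa,a,1b,3b,55,b6")   # [17, 1, 170, 10, 27, 59, 85, 182]
```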

HTH,

Peter

niels_ruby · April 2, 2012, 9:40am

Ammar A. wrote in post #1054416:

On Sat, Mar 31, 2012 at 9:58 PM, Niels S. wrote:

/\A([0-9a-fA-F]{1,2},{1})*([0-9a-fA-F]{1,2}){1}\Z/

You can use \h to match a hexadecimal digit as well, which matches both
lower and upper case a-f.

/\A(\h{1,2},)*\h{1,2}\z/

Just a tiny remark: this one should be more efficient

/\A\h{1,2}(?:,\h{1,2})*\z/

because it avoids backtracking of the NFA and does not use a capturing
group.
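A small sketch comparing the two forms. It only checks that they accept and reject the same inputs and shows the difference in captures; it makes no attempt to measure speed. The constant names and the sample list are made up for illustration.

```ruby
# Hypothetical constant names for the two patterns discussed above.
CAPTURING     = /\A(\h{1,2},)*\h{1,2}\z/
NON_CAPTURING = /\A\h{1,2}(?:,\h{1,2})*\z/

samples = ["11,1,aa,a,1b,3b,55,b6", "a", "111", "aaa,aaa,bob,ccc", "11,", ",11", "11,,22", ""]

samples.each do |s|
  a = !!(s =~ CAPTURING)
  b = !!(s =~ NON_CAPTURING)
  puts "#{s.inspect}: capturing=#{a} non_capturing=#{b}"
end

p "11,aa".match(CAPTURING).captures      # ["11,"]
p "11,aa".match(NON_CAPTURING).captures  # []
```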

Kind regards

robert

niels_ruby · April 1, 2012, 2:46pm

Note that hexadecimal digits can be matched by a POSIX character class,
or in 1.9 there is a special escape for them:

/\A(?:[[:xdigit:]]{1,2},)*[[:xdigit:]]{1,2}\z/ # 1.8, 1.9

or

/\A(?:\h{1,2},)*\h{1,2}\z/ # 1.9

would match a valid string.
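For what it's worth, a quick way to see that the three spellings accept the same ASCII characters. A minimal sketch: the constant name is made up, and only code points 0 to 127 are checked, so it says nothing about non-ASCII input.

```ruby
# Three spellings of "one hexadecimal digit"; the constant name is hypothetical.
HEX_DIGIT_SPELLINGS = {
  "[0-9a-fA-F]"  => /\A[0-9a-fA-F]\z/,
  "[[:xdigit:]]" => /\A[[:xdigit:]]\z/,
  "\\h"          => /\A\h\z/
}

# Every ASCII character as a one-character string.
ascii = (0..127).map { |code| code.chr }

HEX_DIGIT_SPELLINGS.each do |label, re|
  accepted = ascii.select { |ch| ch =~ re }.join
  puts "#{label}: #{accepted}"
end
```

Each line should list the same 22 characters, 0-9, A-F and a-f.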

Mike

–

Mike S.
http://www.stok.ca/~mike/

The “`Stok’ disclaimers” apply.