I’m trying to emulate something I did in .NET many moons ago: capture a
named group, not just once but across all of its repetitions, and then
be able to inspect every repetition. I think they call them
GroupCollections in C# (or it may be CaptureCollection, reached through
Group.Captures). This is the kind of code I’m trying to emulate with
Ruby (1.9.1):
using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main ()
    {
        // Define a regular expression for repeated words.
        Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
            RegexOptions.Compiled | RegexOptions.IgnoreCase);

        // Define a test string.
        string text = "The the quick brown fox fox jumped over the lazy dog dog.";

        // Find matches.
        MatchCollection matches = rx.Matches(text);

        // Report the number of matches found.
        Console.WriteLine("{0} matches found in:\n {1}",
            matches.Count,
            text);

        // Report on each match.
        foreach (Match match in matches)
        {
            GroupCollection groups = match.Groups;
            Console.WriteLine("'{0}' repeated at positions {1} and {2}",
                groups["word"].Value,
                groups[0].Index,
                groups[1].Index);
        }
    }
}
// The example produces the following output to the console:
// 3 matches found in:
// The the quick brown fox fox jumped over the lazy dog dog.
// 'The' repeated at positions 0 and 4
// 'fox' repeated at positions 20 and 25
// 'dog' repeated at positions 50 and 54
For example, if I had the string “11 12” I could have a regex like
/
  (?<first> \d+ ) \s \g<first>
/x
that captured “11” and then the repetition “12” and put them in an
array (or some kind of collection) referenced by the name.
I think my attempts below explain this better than I can in words. What
I want is the result
#<MatchData "11 12" first:["11", "12"]>
or something like it. At the moment all my attempts end with the named
capture keeping only the last match it made, i.e. "12", with no mention
of "11".
I know I could do this a different way, perhaps with split or something,
but I’d like to know if it’s possible with just regex. I understand the
Oniguruma engine is used now but I can’t find any good docs for it.
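For reference, the "different way" would look roughly like the sketch below. It assumes the input is just whitespace-separated numbers, and the `captures` hash is only a hypothetical stand-in for a named collection, not something MatchData provides.

```ruby
text = "11 12"

# Plain split on whitespace: no regex groups involved at all.
by_split = text.split        # => ["11", "12"]

# scan returns every match of the pattern, not only the last one.
by_scan = text.scan(/\d+/)   # => ["11", "12"]

# Hypothetical stand-in for a named collection: key the array by name by hand.
captures = { first: by_scan }
```

This gets the values, but the name-to-repetitions link is built by hand rather than by the regex engine, which is the part I'd like to avoid.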
These are my attempts ($ is my prompt):

$ md1 = /
  (?<first> \d+ )
  \s \g<first>
/x.match( "11 12" )
#<MatchData "11 12" first:"12">
$ md1[:first]
"12"

$ md1 = /
  (?<first> \d+ )
  (?: \s \g<first> )?
/x.match( "11 12" )
#<MatchData "11 12" first:"12">
$ md1[:first]
"12"

$ md1 = /
  (?<first> \d+ )
  (?: \s
    (?<second> \g<first> )
  )?
/x.match( "11 12" )
#<MatchData "11 12" first:"12" second:"12">
$ md1[:first]
"12"
$ md1[:second]
"12"

$ md1 = /
  (?: (?<first> \d+ )\s* )+
/x.match( "11 12" )
#<MatchData "11 12" first:"12">
$ md1[:first]
"12"
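
For comparison, a two-step workaround could be sketched as below: one regex checks the overall shape of the string, and a second pass with scan collects every repetition. `all_captures` is a made-up helper name, not part of Ruby or Oniguruma, and the sketch assumes the numbers are separated by exactly one whitespace character, as in the first attempt.

```ruby
# Hypothetical helper: returns every repetition of the digit group,
# or nil when the string as a whole does not have the expected shape.
def all_captures(text)
  # \A ... \z checks the whole string: numbers separated by one whitespace character.
  return nil unless text =~ /\A\d+(?:\s\d+)*\z/

  # Second pass: collect each run of digits in order of appearance.
  text.scan(/\d+/)
end

all_captures("11 12")   # => ["11", "12"]
all_captures("11 x")    # => nil
```

It produces the array I'm after, but it takes two passes and the result is not attached to the MatchData, so it doesn't answer whether a single regex can do it.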
Iain