Ruby Regex - Unable to capture specific data

Hi all,

I have several strings of data, which are all very similar; however, I
only wish to look at the strings which match specific criteria and
ignore the rest. Some samples are below - I want the first and the last
string and to ignore the middle string.

/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion

/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion

/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion

I have constructed a regex to say capture any string starting with
/software/ and ending with /Microsoft/Windows NT/CurrentVersion, but
with only one string within slashes in the middle, see below:

/software/(.*?)/Microsoft/Windows NT/CurrentVersion$

This regex captures everything because (.*?) takes everything. Any ideas
how I can achieve this? My brain is frying.

Many thanks
S

On Wed, Dec 4, 2013 at 5:36 PM, Stuart C. [email protected]
wrote:

This regex captures everything because (.*?) takes everything. Any ideas
how I can achieve this? My brain is frying.

Try this:

strings = ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]

re = /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/

2.0.0p195 :018 > strings.each do |s|
2.0.0p195 :019 > m = re.match(s)
2.0.0p195 :020?> puts m.captures if m
2.0.0p195 :021?> end
$$$PROTO.HIV
CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}

[^/] is a character class that matches any single character that is not
a forward slash. The * after it repeats that match zero or more times,
so the capture can never run past the next slash.
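
To make the difference concrete, here is a minimal sketch comparing the lazy (.*?) pattern from the question with the [^/]* version. The constant names and the matching_paths helper are made up for illustration, and the patterns are written with %r{} only to avoid escaping slashes.

```ruby
# Sample paths from the question; the second has extra segments in the middle.
PATHS = [
  '/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion',
  '/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion',
  '/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion'
]

# (.*?) is lazy, but it may still cross slashes when that is the only way to match.
LAZY   = %r{/software/(.*?)/Microsoft/Windows NT/CurrentVersion$}
# [^/]* cannot cross a slash, so exactly one segment is allowed in the middle.
STRICT = %r{/software/([^/]*)/Microsoft/Windows NT/CurrentVersion$}

# Hypothetical helper: returns the paths that the given pattern matches.
def matching_paths(pattern, paths)
  paths.select { |path| path =~ pattern }
end

puts matching_paths(LAZY, PATHS).size    # prints 3
puts matching_paths(STRICT, PATHS).size  # prints 2
```

With the strict pattern only the first and last sample paths survive, which is the behaviour asked for.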

Jesus.

On Dec 4, 2013, at 6:36 PM, Stuart C. [email protected] wrote:

/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion

The scan method might be a better tool for this job. Then all you have
to do is specify the elements you want out of the resulting array.

text.scan(/[^\/]+/)
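
As a rough sketch of how the scan approach could be used for this filter: split the path into segments, then check the shape of the resulting array. The segments and hive_name helpers are hypothetical names, not anything from the thread.

```ruby
# Split a path into its segments by scanning for runs of non-slash characters.
def segments(path)
  path.scan(%r{[^/]+})
end

# Hypothetical helper: returns the middle segment only when the path has the
# shape software/<one segment>/Microsoft/Windows NT/CurrentVersion, else nil.
def hive_name(path)
  first, name, *rest = segments(path)
  return nil unless first == 'software'
  return nil unless rest == ['Microsoft', 'Windows NT', 'CurrentVersion']
  name
end

p segments('/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion')
p hive_name('/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion')
p hive_name('/software/X/Wow6432Node/Microsoft/Windows NT/CurrentVersion')
```

The last call returns nil because the path has an extra segment before Microsoft.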

HTH,
Ammar

Jesus - this still returns all strings for me in Ruby 1.9.

Ammar - I will try your suggestion. Was keen to keep a regex.

On Dec 4, 2013, at 11:50 AM, Jesús Gabriel y Galán
[email protected] wrote:

[^/] is a character class that will match everything that is not a
forward slash. This is repeated zero or more times.

Jesus.

Remember that you can use %r to quote regular expressions if the
presence of the forward slash causes a lot of escaping. For example in

ratdog:mcqd mike$ pry
[1] pry(main)> re = %r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
=> /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion\z/

%r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)

seems more reasonable than

/\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/

I also changed $ to \z for matching the end of the string, because $
matches at the end of any line (including just before a trailing
newline), whereas \z matches only at the real end of the string:

[3] pry(main)> /Hello$/.match "Hello\n"
=> #<MatchData "Hello">
[4] pry(main)> /Hello\z/.match "Hello\n"
=> nil
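
The same point applied to the registry paths, as a small sketch. The constant names, the sample strings and the matches? helper are all invented for illustration. It also shows that $ will match in the middle of a multi-line string, which \z never does.

```ruby
LINE_END   = /CurrentVersion$/   # end of any line
STRING_END = /CurrentVersion\z/  # very end of the string only

# Hypothetical inputs: one with a trailing newline, one with a second line.
TRAILING_NEWLINE = "/Microsoft/Windows NT/CurrentVersion\n"
TWO_LINES        = "/Microsoft/Windows NT/CurrentVersion\nextra"

# Hypothetical helper: true when the pattern matches anywhere in the text.
def matches?(pattern, text)
  !(pattern =~ text).nil?
end

puts matches?(LINE_END, TRAILING_NEWLINE)    # prints true
puts matches?(STRING_END, TRAILING_NEWLINE)  # prints false
puts matches?(LINE_END, TWO_LINES)           # prints true
puts matches?(STRING_END, TWO_LINES)         # prints false
```

If the input lines come from a file, calling chomp on each one before matching with \z is one way to get the stricter check without rejecting lines that merely end in a newline.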

Hope this helps,

Mike

Mike S. [email protected]
http://www.stok.ca/~mike/

The “`Stok’ disclaimers” apply.

On Wed, Dec 4, 2013 at 6:08 PM, Stuart C. [email protected]
wrote:

Jesus - this still returns all strings for me in Ruby 1.9.

Works for me in 1.9 too:

1.9.3p448 :027 > re = /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
=> /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
1.9.3p448 :028 > strings = ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/UserSettings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
=> ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/UserSettings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
1.9.3p448 :029 > strings.each {|s| (m = re.match(s)) && puts(m.captures)}
$$$PROTO.HIV
CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}
=> ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/UserSettings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]

Ammar - I will try your suggestion. Was keen to keep a regex.

His solution is using a regex too :).

Jesus.

On Wed, Dec 4, 2013 at 6:10 PM, Mike S. [email protected] wrote:

%r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)

That’s almost exactly what I’d do. It’s just lacking the anchor at
the beginning:

%r(\A/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
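
A quick sketch of why the leading anchor matters. The NESTED path below is an invented example, not one of the original samples: without \A the pattern is free to start matching part-way through the string.

```ruby
UNANCHORED = %r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
ANCHORED   = %r(\A/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)

# Hypothetical input: the wanted shape, but nested under an extra prefix.
NESTED = '/backup/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion'
PLAIN  = '/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion'

# Without \A the match starts at the inner "/software/", so NESTED is accepted.
p UNANCHORED.match(NESTED)

# With \A the string itself must begin with "/software/", so NESTED is rejected.
p ANCHORED.match(NESTED)

# Both patterns accept the plain path and capture the middle segment.
p ANCHORED.match(PLAIN)[1]
```

Whether the unanchored behaviour is a problem depends on the data; for the three sample strings in this thread both patterns give the same result.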

Kind regards

robert

Jesus,

Apologies - on reflection your sample is exactly what I need. I was a
little too hasty in my reply.

Thanks for the help, and thanks also to Mike for the tidy-up tips.

S