Hi all,
I have several strings of data that are all very similar; however, I
only want to look at the strings that match specific criteria and
ignore the rest. Some samples are below. I want the first and the last
string, and I want to ignore the middle one.
/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion
/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion
/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion
I have constructed a regex that is meant to capture any string starting
with /software/ and ending with /Microsoft/Windows NT/CurrentVersion,
with exactly one slash-delimited segment in between:
/software/(.*?)/Microsoft/Windows NT/CurrentVersion$
This regex captures everything because (.*?) takes everything. Any ideas
how I can achieve this? My brain is frying.
Many thanks
S
sclarke
December 4, 2013, 5:51pm
#2
On Wed, Dec 4, 2013 at 5:36 PM, Stuart C. [email protected]
wrote:
Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion
This regex captures everything because (.*?) takes everything. Any ideas
how I can achieve this? My brain is frying.
Try this:
strings = ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
re = /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
2.0.0p195 :018 > strings.each do |s|
2.0.0p195 :019 > m = re.match(s)
2.0.0p195 :020?> puts m.captures if m
2.0.0p195 :021?> end
$$$PROTO.HIV
CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}
[^/] is a character class that will match everything that is not a
forward slash. This is repeated zero or more times.
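To make the difference concrete, here is a small sketch comparing the lazy (.*?) from the original question with the [^/]* version. It uses %r{...} only to avoid escaping slashes, and the variable names (paths, lazy, single_segment) are made up for illustration. The space in "User Settings" is an assumption about how the wrapped sample line should be joined.

```ruby
# Sample paths from the original post (space in "User Settings" is assumed).
paths = [
  '/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion',
  '/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion',
  '/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion'
]

# (.*?) is lazy, but it may still grow across slashes until the tail matches.
lazy = %r{/software/(.*?)/Microsoft/Windows NT/CurrentVersion$}

# ([^/]*) can never cross a slash, so only a single segment fits.
single_segment = %r{/software/([^/]*)/Microsoft/Windows NT/CurrentVersion$}

lazy_hits    = paths.grep(lazy)            # all three paths
segment_hits = paths.grep(single_segment)  # first and last path only

puts lazy_hits.size
puts segment_hits.size
```

Lazy quantifiers only change which match is tried first; they do not stop the engine from extending the capture when that is the only way to reach the end of the pattern.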
Jesus.
sclarke
December 4, 2013, 5:53pm
#3
On Dec 4, 2013, at 6:36 PM, Stuart C. [email protected] wrote:
/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion
The scan method might be a better tool for this job. Then all you have
to do is specify the elements you want out of the resulting array.
text.scan(/[^\/]+/)
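A rough sketch of how the scan approach could be used to pick out the wanted strings. The helper name hive_name and the "exactly five segments" rule are assumptions for illustration, not something stated in the thread. Note that scan drops empty segments and the leading slash, so this check is slightly looser than a full regex match.

```ruby
text = '/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion'

# scan returns every run of non-slash characters, i.e. the path segments.
segments = text.scan(%r{[^/]+})
# segments => ["software", "$$$PROTO.HIV", "Microsoft", "Windows NT", "CurrentVersion"]

# Hypothetical helper: returns the middle segment when the path has exactly
# the expected shape, otherwise nil.
def hive_name(path)
  parts = path.scan(%r{[^/]+})
  return nil unless parts.size == 5
  return nil unless parts.first == 'software'
  return nil unless parts[2, 3] == ['Microsoft', 'Windows NT', 'CurrentVersion']

  parts[1]
end

puts hive_name(text)
```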
HTH,
Ammar
sclarke
December 4, 2013, 6:08pm
#4
Jesus - this still returns all strings for me in Ruby 1.9.
Ammar - I will try your suggestion. Was keen to keep a regex.
sclarke
December 4, 2013, 6:11pm
#5
On Dec 4, 2013, at 11:50 AM, Jesús Gabriel y Galán
[email protected] wrote:
/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows
[^/] is a character class that will match everything that is not a
forward slash. This is repeated zero or more times.
Jesus.
Remember that you can use %r to quote regular expressions when the
forward slashes would otherwise need a lot of escaping. For example, in
pry:
ratdog:mcqd mike$ pry
[1] pry(main)> re = %r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
=> /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion\z/
%r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
seems more reasonable than
/\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
I also changed $ to \z for matching the end of the string. $ matches at
the end of a line, so it still matches before a trailing newline,
whereas \z matches only at the real end of the string:
[3] pry(main)> /Hello$/.match "Hello\n"
=> #<MatchData "Hello">
[4] pry(main)> /Hello\z/.match "Hello\n"
=> nil
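For completeness, a short sketch that puts $, \z and \Z side by side. \Z is not mentioned above; it is included here as an extra point of comparison because it matches at the end of the string or just before a final newline. The variable names are made up for the example.

```ruby
with_newline = "Hello\n"
multi_line   = "Hello\nWorld"

# $ matches at the end of any line, so both strings match.
dollar_newline = !!(/Hello$/ =~ with_newline)   # true
dollar_multi   = !!(/Hello$/ =~ multi_line)     # true

# \z matches only at the very end of the string.
z_newline = !!(/Hello\z/ =~ with_newline)       # false

# \Z tolerates one trailing newline, but nothing more after it.
cap_z_newline = !!(/Hello\Z/ =~ with_newline)   # true
cap_z_multi   = !!(/Hello\Z/ =~ multi_line)     # false

p [dollar_newline, dollar_multi, z_newline, cap_z_newline, cap_z_multi]
```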
Hope this helps,
Mike
–
Mike S. [email protected]
http://www.stok.ca/~mike/
The “`Stok’ disclaimers” apply.
sclarke
December 5, 2013, 1:19am
#6
http://rubular.com/r/dekC1KOiBE
On Wed, Dec 4, 2013 at 11:59 AM, Jesús Gabriel y Galán <
sclarke
December 4, 2013, 6:59pm
#7
On Wed, Dec 4, 2013 at 6:08 PM, Stuart C. [email protected]
wrote:
Jesus - this still returns all strings for me in Ruby 1.9.
Works for me in 1.9 too:
1.9.3p448 :027 > re = /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
=> /\/software\/([^\/]*)\/Microsoft\/Windows NT\/CurrentVersion$/
1.9.3p448 :028 > strings = ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
=> ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
1.9.3p448 :029 > strings.each {|s| (m = re.match(s)) && puts(m.captures)}
$$$PROTO.HIV
CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}
=> ["/software/$$$PROTO.HIV/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Wow6432Node/Microsoft/Office/12.0/User Settings/Word_Core/Delete/Software/Microsoft/Windows NT/CurrentVersion",
"/software/CMI-CreateHive{199ADFC2-6E16-4946-BE90-5A3EC3A60902}/Microsoft/Windows NT/CurrentVersion"]
Ammar - I will try your suggestion. Was keen to keep a regex.
His solution is using a regex too :).
Jesus.
sclarke
December 5, 2013, 10:08am
#8
On Wed, Dec 4, 2013 at 6:10 PM, Mike S. [email protected] wrote:
%r(/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
That’s almost exactly what I’d do. It’s just lacking the anchor at
the beginning:
%r(\A/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z)
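A quick sketch of why the leading anchor matters. The path in nested is hypothetical (it does not appear in the thread); it only serves to show an input where /software/ is not at the start of the string.

```ruby
unanchored = %r{/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z}
anchored   = %r{\A/software/([^/]*)/Microsoft/Windows NT/CurrentVersion\z}

# Hypothetical inputs, not taken from the thread.
nested = '/backup/software/Hive/Microsoft/Windows NT/CurrentVersion'
direct = '/software/Hive/Microsoft/Windows NT/CurrentVersion'

# Without \A the engine may start matching in the middle of the string.
unanchored_nested = unanchored.match(nested)

# With \A the match has to begin at the first character.
anchored_nested = anchored.match(nested)
anchored_direct = anchored.match(direct)

puts unanchored_nested[1]
p anchored_nested
puts anchored_direct[1]
```

Whether the unanchored behaviour is a problem depends on the data; if every input is known to start with /software/, both patterns select the same strings.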
Kind regards
robert
sclarke
December 6, 2013, 7:00am
#9
Jesus,
Apologies - on reflection your sample is exactly what I need. I was a
little too hasty in my reply.
Thanks for the help, and thanks also to Mike for the tidy-up tips.
S