I observed inconsistent behavior between Ruby on my host (ruby 1.8.6
(2009-06-08 patchlevel 369) [x86_64-linux]) and Ruby on
www.rubular.com (which reports “Rubular runs on Ruby 1.8.7.”). The expression:
/ (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? /x
on my host returns (among others):
net-snmpd.log-20110629.bz2
On the Rubular site, it matches (correctly, I suppose!) only up to:
net-snmpd.log-2 (it stops here, i.e. it doesn’t recognize the whole string).
Thanks for any help! (Upgrading Ruby to 1.9 isn’t an option!)
Camargo
NB: My code (just for test) is:
#! /usr/bin/ruby -w
require 'find'

# Directories named "log" or "logs", or anything below them
re_logdir = %r{/log$|/log/|/logs$|/logs/}
dias = 5
# Pattern fragments (d, h and dias are not used yet in this test)
a = '(\w+(-\w+)?)'
d = '(\d{4}[.-]?\d{2}[.-]?\d{2})'
h = '(\d{2}[.-]?\d{2}[.-]?\d{2,6})'
l = '(log|txt)'
z = '(gz|bz2)'
n1 = '(\d{1})'
s = '[.-]'
re_logfile = / #{a} . #{l} #{s}? #{n1} ( . #{z})? /x
p re_logfile
# Print every regular file under a log directory whose name matches
Find.find('/') do |f|
  if File.file?(f) and
     re_logdir.match(File.dirname(f)) and
     re_logfile.match(File.basename(f))
    p f
  end
end
On Fri, Jul 22, 2011 at 12:42 PM, Carlos C. wrote:
> I observed inconsistent behavior between Ruby on my host (ruby 1.8.6
> (2009-06-08 patchlevel 369) [x86_64-linux]) and Ruby on
> www.rubular.com (which reports “Rubular runs on Ruby 1.8.7.”). The expression:
> / (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? /x
> on my host returns (among others):
> net-snmpd.log-20110629.bz2
I just matched “net-snmpd.log-20110629.bz2 stuff blah blah” against that
regex in 1.8.6-p420:
regex.match(string)[0] => "net-snmpd.log-2"
This seems correct to me. In this part of your regex: " (\d{1}) ( .
(gz|bz2))? " you explicitly match exactly one digit (which could also be
accomplished by omitting the curly-brace quantifier).
If you want it to match all the following digits (one or more), you could use:
\d{1,} or, more commonly, \d+
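To make the difference concrete, here is a minimal sketch that runs the same pattern twice, once with \d{1} and once with \d+. The constant names ONE_DIGIT, MANY_DIGITS and NAME are made up for this illustration and are not part of the original script.

```ruby
# Same pattern twice; only the digit group differs.
ONE_DIGIT   = / (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? /x
MANY_DIGITS = / (\w+(-\w+)?) . (log|txt) [.-]? (\d+)   ( . (gz|bz2))? /x

NAME = "net-snmpd.log-20110629.bz2"

# With \d{1} the match ends after the first digit of the date.
p ONE_DIGIT.match(NAME)[0]    # "net-snmpd.log-2"

# With \d+ the digit group takes the whole date and the extension follows.
p MANY_DIGITS.match(NAME)[0]  # "net-snmpd.log-20110629.bz2"
```

Both calls return a match; the difference is only in how much of the file name the match covers.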
> On the Rubular site, it matches (correctly, I suppose!) only up to:
> net-snmpd.log-2 (it stops here, i.e. it doesn’t recognize the whole string).
It does recognize it. Your file extension is optional (the ? quantifier
around the group), so the match can end right after the single digit.
Again, this behavior matches my system’s Ruby 1.8.6 implementation, and
it is how the regex should behave as written.
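A small sketch of what the optional group does here, inspecting the match instead of only testing for it. PATTERN and RESULT are names chosen for this example only.

```ruby
PATTERN = / (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? /x
RESULT  = PATTERN.match("net-snmpd.log-20110629.bz2")

# The part of the string the regex consumed.
p RESULT[0]          # "net-snmpd.log-2"

# Groups 5 and 6 (the optional extension) did not take part in the match.
p RESULT.captures    # ["net-snmpd", "-snmpd", "log", "2", nil, nil]

# Text left over after the match, which an unanchored regex simply ignores.
p RESULT.post_match  # "0110629.bz2"
```

The nil captures show that the extension group matched nothing, and post_match shows the rest of the file name that the pattern never had to account for.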
> Thanks for any help! (Upgrading Ruby to 1.9 isn’t an option!)
> Camargo
How about: / (\w+(-\w+)?) . (log|txt) [.-]? (\d+) ( . (gz|bz2))? /x
> a='(\w+(-\w+)?)'
> d='(\d{4}[.-]?\d{2}[.-]?\d{2})'
> h='(\d{2}[.-]?\d{2}[.-]?\d{2,6})'
> l='(log|txt)'
> z='(gz|bz2)'
> n1='(\d{1})'
> s='[.-]'
So change n1 to: ‘(\d+)’
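As a rough sketch of that change, the pattern can be rebuilt from the same fragments with only the digit fragment swapped. The helper build_logfile_regex is hypothetical and exists only for this illustration; the fragments themselves are copied from the script.

```ruby
# Hypothetical helper: builds the file-name pattern from the script's
# fragments, taking the digit fragment (n1) as a parameter.
def build_logfile_regex(n1)
  a = '(\w+(-\w+)?)'
  l = '(log|txt)'
  z = '(gz|bz2)'
  s = '[.-]'
  / #{a} . #{l} #{s}? #{n1} ( . #{z})? /x
end

name = "net-snmpd.log-20110629.bz2"

# Original fragment: the match stops after one digit.
p build_logfile_regex('(\d{1})').match(name)[0]  # "net-snmpd.log-2"

# Changed fragment: the match covers the date and the extension.
p build_logfile_regex('(\d+)').match(name)[0]    # "net-snmpd.log-20110629.bz2"
```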
Hello Camargo,
Ruby is right. The regexp matches the string.
$ ruby -e 'p (/ (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? /x).match("net-snmpd.log-20110629.bz2")'
#<MatchData "net-snmpd.log-2" 1:"net-snmpd" 2:"-snmpd" 3:"log" 4:"2" 5:nil 6:nil>
If you need the whole string to match, rather than just part of it, end
your regexp with “$”, like:
/ (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? $/x
I also recommend starting your regexp with “^”:
/^ (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? $/x
This one should do the job.
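A minimal sketch of the anchored pattern in action. The constants ANCHORED and STRICT and the sample file names are assumptions made for this example. One extra point worth knowing: in Ruby, ^ and $ anchor at line boundaries, while \A and \z anchor at the ends of the whole string, so the STRICT variant is shown for comparison.

```ruby
ANCHORED = /^  (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? $/x
STRICT   = /\A (\w+(-\w+)?) . (log|txt) [.-]? (\d{1}) ( . (gz|bz2))? \z/x

# A single digit cannot cover the date, so the anchored regex rejects this name.
p ANCHORED.match("net-snmpd.log-20110629.bz2")    # nil

# A rotated log with a one-digit suffix matches in full.
p ANCHORED.match("net-snmpd.log-2.bz2")[0]        # "net-snmpd.log-2.bz2"

# ^ and $ work per line: a string with an embedded newline still matches...
p ANCHORED.match("junk\nnet-snmpd.log-2.bz2")[0]  # "net-snmpd.log-2.bz2"

# ...while \A and \z require the entire string to fit the pattern.
p STRICT.match("junk\nnet-snmpd.log-2.bz2")       # nil
```

For ordinary file names without newlines, the two variants behave the same.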
Regards,
Luiz Angelo Daros de Luca, Me.
2011/7/22 Carlos C.
Sorry, it was my mistake! I’ve changed the regex, adding ^ and $; now it
works as I expected.
Thanks, Kendall and De Luca.
Best regards,
Carlos C.