Question about regular expression


#1

I need to hack out a regular expression that will match “SNPB” without
matching “STR SNPB”.
Since Ruby 1.8.2 and 1.8.4 don’t support the lookbehind feature, what’s
a workable regular expression for Ruby 1.8.2?

Thanks for any idea.

Eric


#2

On Jan 16, 2006, at 8:35 PM, Eric L. wrote:

I need to hack out a regular expression that will match “SNPB” without
matching “STR SNPB”.
Since Ruby 1.8.2 and 1.8.4 don’t support the lookbehind feature, what’s
a workable regular expression for Ruby 1.8.2?

Lookbehind is just lookahead, backwards. :wink:

tests = %w{SNPB STR\ SNPB}
=> ["SNPB", "STR SNPB"]

tests.map { |t| t.reverse }.grep(/\bBPNS\b(?! RTS)/).map { |t| t.reverse }
=> ["SNPB"]
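The reversal trick above can be wrapped in a small helper. This is only a sketch: `match_without_lookbehind?` and `REVERSED_RE` are hypothetical names, and it assumes the calling code is free to reverse the input, which does not hold when only the pattern itself can be changed.

```ruby
# Hypothetical helper: emulate "SNPB not preceded by 'STR '" by reversing
# the input and applying a negative lookahead to the reversed text.
REVERSED_RE = /\bBPNS\b(?! RTS)/

def match_without_lookbehind?(text)
  # =~ returns nil when there is no match, an index otherwise.
  !(text.reverse =~ REVERSED_RE).nil?
end

["SNPB", "STR SNPB", "foo SNPB"].each do |t|
  puts "#{t.inspect}: #{match_without_lookbehind?(t)}"
end
```

Negative lookahead is used here because, as the original question notes, it is lookbehind that Ruby 1.8 lacks.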

Hope that helps.

James Edward G. II


#3

Thanks for your reply.

To make the problem clear:
I want to change only the regular expression, not the code that
implements this function.
The regular expression actually comes from a configurable table, and I
will only have the privilege to update that table.

So what I really want is an alternative to the regular expression
(?<!STR )SNPB

2006/1/17, James Edward G. II removed_email_address@domain.invalid:


#4

On 1/17/06, Eric L. removed_email_address@domain.invalid wrote:

I need to hack out a regular expression that will match “SNPB” without
matching “STR SNPB”.
Since Ruby 1.8.2 and 1.8.4 don’t support the lookbehind feature, what’s
a workable regular expression for Ruby 1.8.2?

tests = %w{SNPB STR\ SNPB}
re = /(?<!STR )SNPB/
tests.map{|t| t.match(re) }

#=> [#<MatchData:0x65704>, nil]

ruby 1.8.2 (2004-11-03) [powerpc-darwin7.5.0]


#5

Simon S. wrote:

#=> [#<MatchData:0x65704>, nil]

ruby 1.8.2 (2004-11-03) [powerpc-darwin7.5.0]


Simon S.

Simon, do you have oniguruma installed? This is what I get:

$ ruby -v -e '/(?<!STR )SNPB/'
ruby 1.8.4 (2005-12-24) [i686-linux]
-e:1: undefined (?..) sequence: /(?<!STR )SNPB/
-e:1: warning: useless use of a literal in void context
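Whether a given interpreter accepts lookbehind can also be probed at run time. The following is a minimal sketch, assuming the engine raises `RegexpError` when `Regexp.new` is given a construct it does not understand; `lookbehind_supported?` is a hypothetical helper name.

```ruby
# Hypothetical probe: compile a tiny lookbehind pattern from a string so
# that an unsupported construct surfaces as an exception at run time
# instead of a parse error in the source file.
def lookbehind_supported?
  Regexp.new('(?<!a)b')
  true
rescue RegexpError
  false
end

puts "Ruby #{RUBY_VERSION}: lookbehind supported? #{lookbehind_supported?}"
```

Building the pattern from a string matters here: a regexp literal containing an unsupported sequence is rejected when the file is parsed, as the `-e:1:` error above shows.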


#6

On Tue, 17 Jan 2006, Eric L. wrote:

(?<!STR )SNPB

harp:~ > cat a.rb
strings = "STR SNPB", "STR", "SNPB"

re = %r/^ (?: (?:[^S]) | (?:S[^T]) | (?:ST[^R]) )* SNPB /ox

strings.each do |string|
  permutations = string, "foo #{ string }", "#{ string } bar", "foo #{ string } bar"
  permutations.each do |permutation|
    puts "<#{ permutation }> matches" if re.match permutation
  end
end

harp:~ > ruby a.rb
<SNPB> matches
<foo SNPB> matches
<SNPB bar> matches
<foo SNPB bar> matches

hth.

-a


#7

On 1/17/06, Joel VanderWerf removed_email_address@domain.invalid wrote:

Simon S. wrote:
[snip]

#=> [#<MatchData:0x65704>, nil]

ruby 1.8.2 (2004-11-03) [powerpc-darwin7.5.0]

Simon, do you have oniguruma installed? This is what I get:
[snip]

Hmm… it seems so. Sorry.


#8

Thanks very much, I really appreciate your help.

It does work, but I couldn’t figure out what the ^ and * do. Could you
explain that in more detail?

Thanks

2006/1/17, removed_email_address@domain.invalid removed_email_address@domain.invalid:


#9

On 19/01/06, Eric L. removed_email_address@domain.invalid wrote:

In your solution:

matched.

But "STR bla SNPB " will also be matched, which is not expected.

I really appreciate your help. Thanks

Then the lookbehind assertion would not have worked either. (?<!A)B
says "Match any B that is not immediately preceded by an A."
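To make "immediately preceded" concrete, here is a small sketch that needs a regexp engine with lookbehind (for example Ruby 1.9 or later, or 1.8 with Oniguruma), so it will not run on the stock Ruby 1.8 discussed in this thread. The sample strings are chosen only for illustration.

```ruby
# Needs lookbehind support; stock Ruby 1.8 rejects this literal.
re = /(?<!STR )SNPB/

samples = ["SNPB", "STR SNPB", "STR bla SNPB ", "STRSNPB"]
results = samples.map { |s| [s, !(s =~ re).nil?] }

# Only "STR SNPB" fails: it is the one string where the four characters
# directly in front of SNPB are exactly "STR ".
results.each { |s, matched| puts "#{s.inspect}: #{matched}" }
```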

Regards,
Stefan


#10

In your solution:

matched.

But "STR bla SNPB " will also be matched, which is not expected.

I really appreciate your help. Thanks

2006/1/17, removed_email_address@domain.invalid removed_email_address@domain.invalid:


#11

Sorry for my mistake.

I do want “STR bla SNPB” to be matched, but I mistakenly said the
opposite in my last post.

I want the appearance of the word STR (rather than the string STR) just
before an SNPB to cause the match to fail.

That is, “STR bla SNPB”, “PreSTR SNPB”, and “STRSNPB” will be matched,
but “STR SNPB” will not.
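One way to express this requirement without lookbehind is to enumerate what is allowed to come directly before SNPB. The pattern below is a sketch under that reading of the requirement (fail only when the whole word STR plus one space comes right before SNPB). `NO_WORD_STR_BEFORE` is a hypothetical name, and the pattern has not been verified on Ruby 1.8 itself, although it uses only `\A`, `\w`, character classes and non-capturing groups.

```ruby
# Each alternative is one way the text before SNPB can differ from
# "<start or non-word char>STR ":
#   \A              nothing before SNPB
#   [^ ]            previous char is not a space
#   (?:\A|[^R])_    a space not preceded by R      (_ stands for a space)
#   (?:\A|[^T])R_   "R " not preceded by T
#   (?:\A|[^S])TR_  "TR " not preceded by S
#   \wSTR_          "STR " glued to a longer word, as in "PreSTR "
NO_WORD_STR_BEFORE =
  /(?:\A|[^ ]|(?:\A|[^R]) |(?:\A|[^T])R |(?:\A|[^S])TR |\wSTR )SNPB/

cases = ["SNPB", "STR bla SNPB", "PreSTR SNPB", "STRSNPB",
         "STR SNPB", "foo STR SNPB"]
outcome = cases.map { |s| !(s =~ NO_WORD_STR_BEFORE).nil? }
cases.zip(outcome).each { |s, ok| puts "#{s.inspect}: #{ok}" }
```

Unlike a true lookbehind, this pattern consumes the characters in front of SNPB, so it suits a yes/no test better than locating where the match begins.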

2006/1/20, removed_email_address@domain.invalid removed_email_address@domain.invalid:


#12

On Thu, 19 Jan 2006, Eric L. wrote:

In your solution:

matched.

But "STR bla SNPB " will also be matched, which is not expected.

it should not:

irb(main):002:0> %r/^ (?: (?:[^S]) | (?:S[^T]) | (?:ST[^R]) )* SNPB /ox.match "STR bla SNPB "
=> nil

and does not on any of my machines. are you seeing something different?

the way that regular expression reads is:

  • beginning of line

  • zero or more things that are either

    • not an S
    • an S followed by not a T
    • or ST followed by not an R
  • the string SNPB

(we ignore the remainder of the string, though you could do more here if
needed)

so the appearance of the string STR anywhere before a SNPB will cause the
match to fail.
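The reading above can be checked against a few inputs. This is a sketch using the pattern exactly as posted; the sample strings are illustrative additions, not taken from the thread's script.

```ruby
# From the start of a line, repeatedly consume a chunk that is "not an S",
# "S then not T", or "ST then not R", and then require SNPB.
re = %r/^ (?: (?:[^S]) | (?:S[^T]) | (?:ST[^R]) )* SNPB /ox

samples = ["SNPB", "foo SNPB bar", "STR SNPB", "STR bla SNPB ",
           "STRSNPB", "SSTR SNPB"]
verdicts = samples.map { |s| !re.match(s).nil? }
samples.zip(verdicts).each { |s, ok| puts "#{s.inspect}: #{ok}" }
```

The last sample is a caveat: in "SSTR SNPB" the `S[^T]` alternative consumes "SS", so the following "TR " is consumed one character at a time and the string matches even though it contains STR.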

hth.

-a