Regexp to match strings that _don't_ begin with a string


#1

I would like to write a regexp that will match a string that does NOT
start with a specified set of characters.

For example,

Given:

xyz123
asldfhsl
xyk2345

and assume that I want to see only strings that don’t start with “xyz”
(so in this case, the last 2 in the list).

I tried /^[^(xyz)]/ but I don’t trust it. I don’t think the grouping
will take inside the character class.

Do I need a negative lookahead assertion?

Thanks,
Wes


#2

On Wed, 29 Mar 2006 03:24:36 +0900, Wes G. wrote:

I tried /^[^(xyz)]/ but I don’t trust it. I don’t think the grouping
will take inside the character class.

The [^(xyz)] creates a negated character class, so your regex would
match any string whose first character is not one of '(', 'x', 'y',
'z' or ')'. The parentheses are just literal characters inside the
brackets. Not what you really want.

Do I need a negative lookahead assertion?

That would be a simple solution: /^(?!xyz)/ which will match when
the beginning of the line/string is not followed by ‘xyz’.

andrew
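
To make the difference concrete, here is a small sketch run against the three strings from the original question. The variable names are only illustrative.

```ruby
# Compare the negated character class with the negative lookahead.
samples = %w[xyz123 asldfhsl xyk2345]

char_class = /^[^(xyz)]/  # first character is not one of ( x y z )
lookahead  = /^(?!xyz)/   # start of line is not followed by the literal "xyz"

by_class     = samples.grep(char_class)
by_lookahead = samples.grep(lookahead)

p by_class      # => ["asldfhsl"]  ("xyk2345" is rejected because it starts with "x")
p by_lookahead  # => ["asldfhsl", "xyk2345"]
```

Only the lookahead version keeps "xyk2345", which is what the original question asks for.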


#3

What about:

!(s =~ /^xyz/)

There is a good discussion of doing exactly this with negative
look-aheads in ‘man perlre’. It’s… ugly.


#4

In my example, won’t /^(?!xyz)/ also match

29384723xyz02342

which is a little more than I want?

WG



#5

Hey that’s too easy!!!

Thanks James.

I’m liking Ruby more and more :).

Wes

James G. wrote:

On Mar 28, 2006, at 12:40 PM, Chris A. wrote:

!(s =~ /^xyz/)

Ruby has a doesn’t match operator for this:

s !~ /^xyz/

James Edward G. II


#6

On Mar 28, 2006, at 12:40 PM, Chris A. wrote:

!(s =~ /^xyz/)

Ruby has a doesn’t match operator for this:

s !~ /^xyz/

James Edward G. II
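
A quick sketch of the two spellings side by side. As far as I can tell, `!` binds more tightly than `=~`, so the negated form needs parentheses around the match to behave as intended. The sample strings are the ones from the first post.

```ruby
s = "xyk2345"

# Parenthesised negation: negate the result of the match, not `s` itself.
negated_match = !(s =~ /^xyz/)

# The dedicated operator: true when the pattern does NOT match.
no_match_op = s !~ /^xyz/

# Filtering a list by hand with the same test.
samples = %w[xyz123 asldfhsl xyk2345]
kept = samples.reject { |str| str =~ /^xyz/ }

p negated_match  # => true
p no_match_op    # => true
p kept           # => ["asldfhsl", "xyk2345"]
```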


#7

On Wed, 29 Mar 2006 03:43:51 +0900, Wes G. wrote:

In my example, won’t /^(?!xyz)/ also match

29384723xyz02342

which is a little more than I want?

Uhm, maybe I’ve misunderstood (wouldn’t be the first time) – I thought
you wanted to match strings that did not begin with ‘xyz’ … and as far
as I can tell, “29384723xyz02342” does not start with ‘xyz’.

# DATA.gets stores each line in $_; ~regex and a bare print both use $_
while DATA.gets
  print if ~/^(?!xyz)/
end
__END__
xyefoo
xyzpfoo
asdfsdf
1230xyzasdf

produces:

xyefoo
asdfsdf
1230xyzasdf

puzzled,
andrew


#8

Andrew,

That works fine.

In actuality, I do need the ability to pass one regex to do the job into
another utility that will use it to operate on an array of strings.

So although !~ is cool, I really didn’t want to have to iterate through
the strings myself.

Thanks,
Wes
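
For the "hand one regex to another utility" case, a minimal sketch. `select_matching` is a made-up stand-in for that utility, assumed to keep only the strings that match the pattern it is given. The data is Andrew's.

```ruby
# Hypothetical utility: it only accepts a pattern and applies it itself,
# so the "does not start with" logic has to live inside the regex.
def select_matching(strings, pattern)
  strings.grep(pattern)
end

lines  = %w[xyefoo xyzpfoo asdfsdf 1230xyzasdf]
result = select_matching(lines, /^(?!xyz)/)

p result  # => ["xyefoo", "asdfsdf", "1230xyzasdf"]
```

If the utility you control is the caller, `strings.grep_v(/^xyz/)` or `reject` would do the same job without a lookahead.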



#9

Wes G. wrote:

On Mar 28, 2006, at 12:40 PM, Chris A. wrote:

!(s =~ /^xyz/)

Ruby has a doesn’t match operator for this:

s !~ /^xyz/

James Edward G. II

Awesome.

-Justin


#10

I have a new wrinkle.

Now I want to match any line that doesn’t have “xyz” or “abc” at the
beginning of the line.

Is there a way to “AND” together the input to the negative lookahead
assertion?

Wes



#11

Wes G. wrote:

I have a new wrinkle.

Now I want to match any line that doesn’t have “xyz” or “abc” at the
beginning of the line.

Is there a way to “AND” together the input to the negative lookahead
assertion?

For lookaheads, you can get AND by concatenating:

irb(main):001:0> /^(?!abc)(?!xyz)/ =~ "abc"
=> nil
irb(main):002:0> /^(?!abc)(?!xyz)/ =~ " abc"
=> 0
irb(main):003:0> /^(?!abc)(?!xyz)/ =~ "xyz"
=> nil
=> nil
irb(main):004:0> /^(?!abc)(?!xyz)/ =~ " xyz"
=> 0
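
A small sketch of why chaining works: each lookahead is zero-width, so both are tested at the same position and the match itself is empty. The extra `\w+` pattern is just an illustration of matching real text after the assertions.

```ruby
pattern = /^(?!abc)(?!xyz)/

m = pattern.match(" abc")
matched_text = m[0]        # the lookaheads consume nothing
position     = m.begin(0)  # both assertions were checked at offset 0

rejected = pattern.match("abc")  # first lookahead fails at offset 0

# Anything placed after the assertions still starts matching at offset 0.
rest = /^(?!abc)(?!xyz)\w+/.match("hello")[0]

p matched_text  # => ""
p position      # => 0
p rejected      # => nil
p rest          # => "hello"
```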


#12

Wes G. writes:

Thanks,
Wes

I always wanted to propose a ~ operator for Regexps. ;-)
The current behavior is next to useless.

It wouldn’t be difficult to add… also, see lib/eregex.rb.


#13

Thanks, that makes sense since the lookahead doesn’t “consume” right?

WG


#14

Wes G. wrote:

Thanks, that makes sense since the lookahead doesn’t “consume” right?

Right. In this case an alternation works, too:

p %w{abcd xyz as}.map {|s| /^(?!abc|xyz)/=~s}
=> [nil, nil, 0]

De Morgan’s Law comes to mind: not a and not b <=> not(a or b) :-)

Kind regards

robert
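
If the list of unwanted prefixes is only known at run time, the alternation can be built rather than typed. A sketch, assuming a `prefixes` array (hypothetical name) and using `Regexp.union`, which escapes each string and joins them with `|`:

```ruby
prefixes = %w[abc xyz]  # hypothetical list of prefixes to exclude

# Interpolating the union into a lookahead gives one self-contained regex.
not_prefixed = /^(?!#{Regexp.union(prefixes)})/

samples = %w[abcd xyz as]

chained     = samples.map { |s| /^(?!abc)(?!xyz)/ =~ s }
alternation = samples.map { |s| /^(?!abc|xyz)/ =~ s }
built       = samples.map { |s| not_prefixed =~ s }

p chained      # => [nil, nil, 0]
p alternation  # => [nil, nil, 0]
p built        # => [nil, nil, 0]
```

Note that `^` matches at the start of every line in Ruby. For multi-line strings, `\A` anchors to the start of the string only.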