Regexp - how to match this?

xain · April 9, 2007, 3:28pm

What kind of pattern will match the part of a sentence before a <span>
tag?

For instance, for this sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.

it’ll match:
This forum is connected to a mailing list that is read by

xain · April 9, 2007, 4:13pm

On 4/9/07, Nanyang Z. wrote:

What kind of pattern will match the part of a sentence before a <span>
tag?

For instance, for this sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.

it’ll match:
This forum is connected to a mailing list that is read by

/^.*?(?=<span)/

This is a little loose since it treats anything starting with “<span”
as a span tag.

Breaking it down:

^ - start of a line (in Ruby, \A is the anchor for the start of the whole string)

.*? - 0 or more characters, non-greedy; otherwise this would match
everything up to the LAST “<span” in the string, instead of the first,
which is what I suspect you really want.

(?=<span) - This is a zero-length lookahead. It means that “<span”
must occur just after what has been matched, but it will not be part
of the match itself.
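
To make the greedy versus non-greedy difference concrete, here is a small sketch. The sample string with two span tags and the constant names are made up for illustration; they are not from the original post.

```ruby
# Hypothetical sample input with TWO span tags, so the difference shows.
SENTENCE = 'This forum is read by <span class="wow">thousands</span> of <span>people</span>.'

# Non-greedy: .*? grows one character at a time and stops as soon as
# the lookahead sees the FIRST "<span".
LAZY = SENTENCE[/^.*?(?=<span)/]

# Greedy: .* grabs the whole line, then backtracks only as far as the
# LAST "<span".
GREEDY = SENTENCE[/^.*(?=<span)/]

p LAZY    # "This forum is read by "
p GREEDY  # "This forum is read by <span class=\"wow\">thousands</span> of "
```

In both cases "<span" itself is left out of the result, because the lookahead only checks for it without consuming it.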

HTH

–
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

xain · April 9, 2007, 4:26pm

On 09.04.2007 15:28, Nanyang Z. wrote:

What kind of pattern will match the part of a sentence before a <span>
tag?

For instance, for this sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.

it’ll match:
This forum is connected to a mailing list that is read by

One way to do it:

irb(main):022:0* s='This forum is connected to a mailing list that is read by <span
irb(main):023:0' class="wow">thousands of people.'
=> "This forum is connected to a mailing list that is read by <span\nclass=\"wow\">thousands of people."
irb(main):024:0> s[/\A(.*?)<span/, 1]
=> "This forum is connected to a mailing list that is read by "

robert
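
The two answers so far differ in their anchor (^ versus \A), which only matters once the input spans several lines. A rough sketch of the difference, using a made-up two-line string (the constant names are assumptions for illustration):

```ruby
# Hypothetical two-line input; the tag sits on the second line.
TEXT = "first line\nsecond <span>line</span>"

# ^ matches at the start of ANY line, and . does not cross "\n",
# so the match begins on the second line.
CARET = TEXT[/^.*?(?=<span)/]

# \A matches only at the very start of the string; without /m the dot
# cannot get past the newline, so there is no match at all.
START_ONLY = TEXT[/\A.*?(?=<span)/]

# With /m the dot also matches "\n", so everything before the tag is returned.
START_MULTILINE = TEXT[/\A.*?(?=<span)/m]

p CARET            # "second "
p START_ONLY       # nil
p START_MULTILINE  # "first line\nsecond "
```

Which behaviour is wanted depends on whether the sentence is guaranteed to sit on a single line.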

xain · April 9, 2007, 4:37pm

Rick Denatale wrote:

/^.*?(?=<span)/

thanks.

BTW, what is “Duck Typing”?

xain · April 9, 2007, 5:09pm

Using the old saying,
“If it walks like a duck and talks like a duck, then it is a duck.”
It means treating something as a duck if it behaves like one: in Ruby,
what matters is which methods an object responds to, not its class.
Part of the principle of least surprise [to Matz].
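
A minimal sketch of the idea; the class and method names here are invented for illustration:

```ruby
class Duck
  def speak
    "Quack"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# No class check anywhere: any object that responds to #speak will do.
def make_it_speak(thing)
  thing.speak
end

puts make_it_speak(Duck.new)   # Quack
puts make_it_speak(Robot.new)  # Beep
```

Duck and Robot share no superclass other than Object; they are interchangeable here only because both respond to speak.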

xain · April 9, 2007, 5:02pm

On 4/9/07, Nanyang Z. wrote:

Rick Denatale wrote:

/^.*?(?=<span)/

thanks.

BTW, what is “Duck Typing”?

Well, here’s some of what I’ve written on the subject:
http://talklikeaduck.denhaven2.com/articles/tag/ducks

I’d suggest looking at them starting with the oldest one (they are in
reverse chronological order).

–
Rick DeNatale


xain · April 9, 2007, 6:04pm

Nanyang Z. wrote:

BTW, what is “Duck Typing”?

The PickAxe 2nd Edition (and probably the freely available 1st Edition) has
a nice, interesting and very readable chapter covering that.

In a nutshell: what the others have already said.

–
Phillip “CynicalRyan” Gawlowski
http://cynicalryan.110mb.com/

Rule of Open-Source Programming #6:

The user is always right unless proven otherwise by the developer.

xain · April 10, 2007, 3:00pm

Rick Denatale wrote:

(?=<span) - This is a zero-length lookahead, this means that “<span”
must occur just after what has been matched, but it will not be part
of the match itself.

So ?= makes the pattern look AHEAD. How do I make a pattern look BEHIND?

For instance:

example sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.

question:
How do I make a Regexp that matches the words following the </span> tag?

A /<\/span>.*/ will include the tag, which isn’t what I want.

xain · April 10, 2007, 4:39pm

Gavin K. wrote:

Just because you consume them doesn’t mean you have to use them. Use
parentheses to save parts of the text extracted by your regular
expression.

I’m trying to write a single method (taking one regexp as input) that can
extract any part of a given string.

But now it seems very hard to do this job with one fixed method.
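
One possible shape for such a method, as a sketch only: the helper name extract and its convention (return the first capture group if the pattern has one, otherwise the whole match) are assumptions for illustration, not something proposed in the thread.

```ruby
# Hypothetical helper: returns the first capture group when the pattern
# defines one, otherwise the whole match; nil when nothing matches.
def extract(str, pattern)
  m = pattern.match(str)
  return nil unless m
  m.size > 1 ? m[1] : m[0]
end

SAMPLE = 'is read by <span class="wow">thousands</span> of people.'

p extract(SAMPLE, /\A(.*?)<span/)   # "is read by "
p extract(SAMPLE, %r{</span>(.+)})  # " of people."
p extract(SAMPLE, />(.*?)</)        # "thousands"
p extract(SAMPLE, /\d+/)            # nil
```

With this convention the caller marks the wanted part with parentheses, so the surrounding tags can be consumed by the pattern without ending up in the result.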

xain · April 10, 2007, 3:51pm

On Apr 10, 7:00 am, Nanyang Z. wrote:

so ?= makes pattern lookAHEAD. How to make pattern lookBEHIND?

http://phrogz.net/ProgrammingRuby/language.html#extensions

Zero-width positive and negative lookaheads are supported in Ruby’s
regexp engine in 1.8. Zero-width lookbehind assertions are not
supported by the current regexp engine. (However, they are supported
by Oniguruma, the regexp engine used in 1.9 and future builds of
Ruby.)
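
For readers on Ruby 1.9 or later, a lookbehind version might look like the sketch below. The constant names are made up, and the sample string assumes the sentence was wrapped in a span tag as in the question.

```ruby
TAGGED = 'is read by <span class="wow">thousands</span> of people.'

# (?<=...) is a zero-width positive lookbehind: "</span>" must sit just
# before the match but is not part of it. Needs Ruby 1.9 or later.
AFTER_TAG = TAGGED[/(?<=<\/span>).*/]

# \K (Ruby 2.0 or later) drops everything matched so far from the result,
# which gives the same answer here.
AFTER_TAG_K = TAGGED[/<\/span>\K.*/]

p AFTER_TAG    # " of people."
p AFTER_TAG_K  # " of people."
```

On 1.8 neither form is available, so the capture-group approach is the way to go there.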

example sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.

question:
How do I make a Regexp that matches the words following the </span> tag?

Just because you consume them doesn’t mean you have to use them. Use
parentheses to save parts of the text extracted by your regular
expression.

irb(main):001:0> str = 'is read by <span class="wow">thousands</span> of people.'
=> "is read by <span class=\"wow\">thousands</span> of people."

irb(main):002:0> str[ /<\/span>(.+)/, 1 ]
=> " of people."

irb(main):003:0> %r{</span>(.+)}.match( str ).to_a
=> ["</span> of people.", " of people."]