What kind of pattern will match the part of a sentence before a
<span> tag?
For instance, for this sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.
it'll match:
This forum is connected to a mailing list that is read by
/^.*?(?=<span)/
This is a little loose since it treats anything starting with “<span”
as a span tag.
Breaking it down:
^ - start of string
.*? - 0 or more characters, non-greedy. Otherwise this would match
everything up to the LAST “<span” in the string instead of the first,
which is what I suspect you really want.
(?=<span) - This is a zero-length lookahead. It means that “<span”
must occur just after what has been matched, but it will not be part
of the match itself.
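To see what the non-greedy quantifier changes, here is a small sketch. The sample string with two span tags is made up for illustration and is not the sentence from the question.

```ruby
# Hypothetical sample with two <span> tags (not from the original question).
s = 'read by <span class="wow">thousands</span> of <span>people</span>.'

# Non-greedy: stops just before the FIRST "<span".
first = s[/^.*?(?=<span)/]

# Greedy: runs on to just before the LAST "<span".
last = s[/^.*(?=<span)/]

puts first.inspect
puts last.inspect
```

In both cases the lookahead keeps “<span” itself out of the result.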
One way to do it:
irb(main):022:0* s='This forum is connected to a mailing list that is read by <span
irb(main):023:0' class="wow">thousands</span> of people.'
=> "This forum is connected to a mailing list that is read by <span\nclass=\"wow\">thousands</span> of people."
irb(main):024:0> s[/\A(.*?)<span/, 1]
=> "This forum is connected to a mailing list that is read by "
So ?= makes the pattern look AHEAD. How do I make a pattern look BEHIND?
For instance, with this example sentence:
This forum is connected to a mailing list that is read by <span class="wow">thousands</span> of people.
Question:
How do I make a Regexp that matches the words following the </span> tag?
A /<\/span>.*/ will include the tag, which isn't what I want.
Zero-width positive and negative lookaheads are supported in Ruby’s
regexp engine in 1.8. Zero-width lookbehind assertions are not
supported by the current regexp engine. (However, they are supported
by Oniguruma, the regexp engine used in 1.9 and future builds of
Ruby.)
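For anyone already on 1.9 or later, a lookbehind version might look like the sketch below. It assumes the example sentence from the question and will not parse on 1.8.

```ruby
# Needs Ruby 1.9 or later; the 1.8 regexp engine rejects (?<=...).
str = 'is read by <span class="wow">thousands</span> of people.'

# (?<=<\/span>) asserts that "</span>" sits just before the match
# without making it part of the match.
after_tag = str[/(?<=<\/span>).*/]

puts after_tag.inspect   # => " of people."
```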
Just because you consume them doesn't mean you have to use them. Use
parentheses to save the parts of the text that your regular
expression extracts.
irb(main):001:0> str = 'is read by <span class="wow">thousands</span> of people.'
=> "is read by <span class=\"wow\">thousands</span> of people."
irb(main):002:0> str[ /<\/span>(.+)/, 1 ]
=> " of people."
irb(main):003:0> %r{</span>(.+)}.match( str ).to_a
=> ["</span> of people.", " of people."]
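Another option that avoids both lookarounds and capture groups, offered only as an alternative sketch: match the whole tag and ask the MatchData for the text on either side. The pattern for the span element here is my own assumption and is as loose as the ones above.

```ruby
str = 'is read by <span class="wow">thousands</span> of people.'

# Match the whole span element, opening tag through closing tag.
m = str.match(/<span\b[^>]*>.*?<\/span>/)

if m
  before = m.pre_match    # text before the opening tag
  after  = m.post_match   # text after the closing tag
  puts before.inspect
  puts after.inspect
end
```

MatchData#pre_match and MatchData#post_match return the parts of the string outside the match, so the tag never ends up in either result.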