Multiple Characters Negate Using Regexp

softwareengineer_99 · February 20, 2006, 12:54am

Dear experts,

I am trying to build a regular expression to filter out anything
between <script … > and </script> tags, where I can use a negated
character class to exclude more than one character in sequence.

I tried:

originalresponse.gsub(/<script([^>]+)>([^<]+)?<\/script>/, '')

but obviously if the script contains the character < anywhere before
the closing </script> tag, then my regexp breaks.

I also tried:
originalresponse.gsub(/<script([^>]+)>([^<\/script]+)?<\/script>/, '')

but it doesn’t work.

Can anyone guide me on how I can specify multiple character negation?
I will greatly appreciate it.

Is this even possible? How else can I remove everything contained
between the <script> and </script> tags?

Thanks
Frank
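
A note on why the second attempt fails, with one possible alternative. A negated class such as [^<\/script] excludes single characters (any of <, /, s, c, r, i, p, t), not the sequence "</script>". A negative lookahead can express "any character, as long as </script> does not start here". The sketch below is only illustrative: the sample string and variable names are invented, and the opening tag is loosened to [^>]* as an assumption.

```ruby
# Hypothetical sample input (not from the original post).
html = 'a <script type="text/javascript">if (x < 1) { run(); }</script> b'

# A negated class excludes single characters, not a sequence:
# [^<\/script] means "one character that is none of < / s c r i p t".
char_class = /<script([^>]+)>([^<\/script]+)?<\/script>/

# A negative lookahead tests the whole sequence at each position:
# (?:(?!<\/script>).)* consumes characters until "</script>" starts.
lookahead = /<script[^>]*>(?:(?!<\/script>).)*<\/script>/m

class_result     = html.gsub(char_class, '')
lookahead_result = html.gsub(lookahead, '')

p class_result == html   # the character-class version matches nothing here
p lookahead_result       # the script element is removed
```

The character-class version fails on this input because the script body starts with "i", which is one of the excluded characters.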

softwareengineer_99 · February 20, 2006, 1:16am

Is it possible to specify a multiple line regular expression?

Thanks for your assistance.
Frank

softwareengineer_99 · February 20, 2006, 4:06am

Can you restate your original problem a bit better? Are you wanting to
delete everything between the tags?

If so, I think this expression should work (famous last words):

/(< *script.*?>).*?(<\/ *script *>)/

I tried it like this:

irb(main):001:0> "hello there".gsub(/(< *script.*?>).*?(<\/ *script *>)/, '')
=> "hello there"

Jeff
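
To see what the optional-space parts of this expression buy you, here is a small sketch. It reads the pattern as /(< *script.*?>).*?(<\/ *script *>)/, which is an assumption about the intended expression, and the sample strings are invented for illustration.

```ruby
# Assumed reading of the expression: " *" allows optional spaces after "<"
# and around the closing tag name; ".*?" is a non-greedy "anything".
pattern = /(< *script.*?>).*?(<\/ *script *>)/

# Hypothetical inputs, invented for illustration.
samples = [
  'hello <script>alert(1)</script> there',
  'hello < script type="x">alert(1)</ script > there',
  'hello there'
]

# Remove every match from each sample.
results = samples.map { |s| s.gsub(pattern, '') }
results.each { |r| p r }
```

Note that the text on either side of a removed element keeps its own space, so the first two results contain two spaces between the words.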

softwareengineer_99 · February 24, 2006, 1:37pm

softwareengineer 99 wrote:

Is it possible to specify a multiple line regular expression?

Yes, just put ‘m’ after it (in Ruby this lets . match newlines too), e.g.

%r{<textarea[^>]*?id="markup"[^>]*>([^<]*)}m

or

/<textarea[^>]*?id="markup"[^>]*>([^<]*)/m

regards

Justin
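
A minimal sketch of the effect of the m flag on the script-removal case from this thread. The input string and variable names are made up for the example. In Ruby, m only changes what . matches; a negated class such as [^<] already matches newlines without it.

```ruby
# Hypothetical multi-line input.
html = "before\n<script type=\"text/javascript\">\nalert(1);\n</script>\nafter"

single_line = /<script[^>]*>.*?<\/script>/   # "." stops at newlines
multi_line  = /<script[^>]*>.*?<\/script>/m  # "." also matches newlines

without_m = html.gsub(single_line, '')
with_m    = html.gsub(multi_line, '')

p without_m == html   # no match without the flag, so nothing is removed
p with_m              # the whole multi-line element is removed
```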

softwareengineer_99 · February 21, 2006, 8:14pm

I recommend /<script([^>]+)>.*?<\/script>/

Adding the ? after the * makes it un-greedy, so it will match only up
to the first following </script> tag, as in the following example.

irb(main):001:0> s = "I have a don'cha know :)"
=> "I have a don'cha know :)"
irb(main):002:0> r = /<script([^>]+)>.*?<\/script>/
=> /<script([^>]+)>.*?<\/script>/
irb(main):003:0> s.gsub(r,'')
=> "I have a don'cha know :)"

Jamie
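
For comparison, a short sketch of greedy versus un-greedy matching on a string with two script elements. The input is invented for illustration; only the patterns follow the recommendation above.

```ruby
# Hypothetical input with two script elements.
s = %q{I have a <script type="a">one()</script> don'cha know <script type="b">two()</script> :)}

greedy = /<script([^>]+)>.*<\/script>/    # ".*" runs to the LAST </script>
lazy   = /<script([^>]+)>.*?<\/script>/   # ".*?" stops at the FIRST </script>

greedy_result = s.gsub(greedy, '')
lazy_result   = s.gsub(lazy, '')

p greedy_result   # text between the two elements is lost as well
p lazy_result     # only the two script elements are removed
```

One caveat with this pattern as written: ([^>]+) requires at least one character after "<script", so a bare <script> tag with no attributes would not match.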