Has anyone ever see a call to String.scan (the “regular” form, not the
block form) just hang when processing a string?
If so, any insight into how that might happen?
Thanks,
Wes
Has anyone ever see a call to String.scan (the “regular” form, not the
block form) just hang when processing a string?
If so, any insight into how that might happen?
Thanks,
Wes
Wes G. wrote:
Has anyone ever see a call to String.scan (the “regular” form, not the
block form) just hang when processing a string?If so, any insight into how that might happen?
Thanks,
Wes
Got an example you can share that demonstrates the problem?
Maybe the string is huge? Maybe there’s a recursive lookup happening of
some kind? Dunno without more detail.
Regards,
Dan
From: “Daniel B.” [email protected]
Got an example you can share that demonstrates the problem?
Maybe the string is huge? Maybe there’s a recursive lookup happening of
some kind? Dunno without more detail.
I’m guessing exponential backtracking when the expression doesn’t match.
For ex:
x = “x” * 1000
x << “ab”
x =~ /xxa/ # matches quickly
x =~ /xxb/ # “hangs” trying exponential combinations with
backtracking
Regards,
Bill
From: “Robert K.” [email protected]
Yes, that’s a property of NFA based regular expression engines. Some
patterns have awful runtime characteristics because of the
backtracking. Without looking too close at your patter I guess in
this case it’s caused by having “x*” twice in your RX - not very
useful anyway.
Thanks We haven’t seen the OP’s regexp yet; I was just
offering the above as an example of what can cause the
hanging he described.
Regards,
Bill
2006/5/24, Bill K. [email protected]:
x << “ab”
x =~ /xxa/ # matches quickly
x =~ /xxb/ # “hangs” trying exponential combinations with backtracking
Yes, that’s a property of NFA based regular expression engines. Some
patterns have awful runtime characteristics because of the
backtracking. Without looking too close at your patter I guess in
this case it’s caused by having “x*” twice in your RX - not very
useful anyway. If you are interested in more detail I recommend
reading “Mastering Regular Expressions”:
Kind regards
robert
Just for kicks, here was the original regex - attempting to match
against a string containing an entire HTML page OR a style sheet.
/(?=<style)?background.?(?=-image)?:.?url((.?)).?(?=</style>)*?/
I was attempting to make the <style and parts optional.
As it turns out, there are so many different ways that a style can be
specified in a HTML page, that I just went with the much simpler:
/background.?:.?url((.*?))/i
Wes
Bill K. wrote:
From: “Robert K.” [email protected]
Yes, that’s a property of NFA based regular expression engines. Some
patterns have awful runtime characteristics because of the
backtracking. Without looking too close at your patter I guess in
this case it’s caused by having “x*” twice in your RX - not very
useful anyway.Thanks We haven’t seen the OP’s regexp yet; I was just
offering the above as an example of what can cause the
hanging he described.Regards,
Bill
All,
Thanks for giving me some idea of what might be wrong. For now, I’ve
gone with a simpler regex that I think will work fine. My original
pattern was attempting to allow several portions of the pattern to be
optional and the more that I think about it, the more it makes sense
that this might occur.
I will keep (?>pattern) in mind as a way to inhibit backtracking.
Thanks,
Wes
This forum is not affiliated to the Ruby language, Ruby on Rails framework, nor any Ruby applications discussed here.
Sponsor our Newsletter | Privacy Policy | Terms of Service | Remote Ruby Jobs