Regexp: Incorrect behaviour of (\d\s*){N,} and (\d\s*?){N,}

Hi all.

I (well, one of my users, actually) have found an interesting one. There
are an enormous number of regular expression issues on JIRA, so I'm not
sure whether this one is already known.

Expected behaviour (this is coming from MRI 1.8.6):
irb(main):001:0> test = "| 12345678 | 123456789 |"
=> "| 12345678 | 123456789 |"
irb(main):002:0> test.gsub(/(\d\s*){9,}/, 'XXX')
=> "| 12345678 | XXX|"
irb(main):003:0> test.gsub(/(\d\s*?){9,}/, 'XXX')
=> "| 12345678 | XXX |"

Behaviour under JRuby 1.5.1:
irb(main):001:0> test = "| 12345678 | 123456789 |"
=> "| 12345678 | 123456789 |"
irb(main):002:0> test.gsub(/(\d\s*){9,}/, 'XXX')
=> "| XXX | XXX|"
irb(main):003:0> test.gsub(/(\d\s*?){9,}/, 'XXX')
=> "| XXX| XXX |"

The workaround I gave them:
irb(main):004:0> test.gsub(/\d(\s*\d){8,}/, 'XXX')
=> "| 12345678 | XXX |"
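
For anyone who wants to reproduce this outside irb, here is a minimal sketch that runs the three patterns above against the same input and compares each result with the MRI 1.8.6 output quoted in this message. The names TEST_INPUT, CASES and check_cases are made up for illustration only; the expected strings are copied from the transcripts above, not independently verified on other interpreters.

```ruby
# Input string copied from the report above.
TEST_INPUT = "| 12345678 | 123456789 |"

# Each pair: [pattern, result reported for MRI 1.8.6 in this message].
# The third pattern is the workaround.
CASES = [
  [/(\d\s*){9,}/,   "| 12345678 | XXX|"],
  [/(\d\s*?){9,}/,  "| 12345678 | XXX |"],
  [/\d(\s*\d){8,}/, "| 12345678 | XXX |"]
]

# Hypothetical helper: applies gsub for every pattern and returns
# [pattern source, actual result, matches-expected flag] triples.
def check_cases(input, cases)
  cases.map do |pattern, expected|
    actual = input.gsub(pattern, "XXX")
    [pattern.source, actual, actual == expected]
  end
end

check_cases(TEST_INPUT, CASES).each do |source, actual, ok|
  puts "#{ok ? 'ok  ' : 'FAIL'} /#{source}/ => #{actual.inspect}"
end
```

Any line marked FAIL means that interpreter disagrees with the MRI 1.8.6 results listed above for that pattern.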

TX


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On Jul 16, 2010, at 12:06 AM, Trejkaz wrote:

> => "| 12345678 | XXX|"
>
> The workaround I gave them:
> irb(main):004:0> test.gsub(/\d(\s*\d){8,}/, 'XXX')
> => "| 12345678 | XXX |"
>
> TX

I looked through test cases in Joni, and there is none exercising this
case.

I opened http://jira.codehaus.org/browse/JRUBY-4942.

