Forum: Ruby bug in ruby regexp

Nick B. (Guest)
on 2007-02-02 17:55
(Received via mailing list)
Hello,

I spotted this problem in ruby's regexp today:

irb(main):001:0> num = "10"
=> "10"
irb(main):002:0> if num =~ /[9-13]/
irb(main):003:1> puts "hello"
irb(main):004:1> end
SyntaxError: compile error
(irb):2: invalid regular expression: /[9-13]/
        from (irb):4
        from :0
irb(main):005:0>

I have tested it in ruby 1.8 and 0.9.

Anyone else spotted this?
James G. (Guest)
on 2007-02-02 18:04
(Received via mailing list)
On Feb 2, 2007, at 9:54 AM, Nick B. wrote:

> (irb):2: invalid regular expression: /[9-13]/
>        from (irb):4
>        from :0
> irb(main):005:0>
>
> I have tested it in ruby 1.8 and 0.9.
>
> Anyone else spotted this?

A character class ([...]) with a range of 9-1 is not valid in a
regular expression because 1 does not come after 9 in your character
encoding.

I believe you were trying to verify that num is between 9 and 13.
Your regex would not do this even if it was legal.  Character classes
give multiple choices for a single character, not a group of characters.
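
As a rough illustration of the point above, here is a minimal sketch
(the helper name valid_pattern? and the constants are made up for this
example, not part of Ruby). It assumes a Ruby recent enough to have
String#match? (2.4 or later) and shows that the reversed range is
rejected, while the legal class [1-39] still only ever matches one
character:

```ruby
# Hypothetical helper: true if the source compiles as a regexp.
def valid_pattern?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

REVERSED = valid_pattern?("[9-13]")  # range 9-1 runs backwards
FORWARD  = valid_pattern?("[1-39]")  # range 1-3 plus a literal 9

# [1-39] is legal, but it matches exactly one character: 1, 2, 3 or 9.
# Two-character strings such as "10" or "13" are not accepted.
SINGLE_CHARS = %w[1 2 3 9 10 13 4].select { |s| s.match?(/\A[1-39]\z/) }

puts REVERSED              # false
puts FORWARD               # true
puts SINGLE_CHARS.inspect  # ["1", "2", "3", "9"]
```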

Here are some ways to perform your check:

 >> num = "10"
=> "10"
 >> num =~ /\A(?:9|1[0123])\Z/
=> 0
 >> num.to_i.between? 9, 13
=> true

Hope that helps.

James Edward G. II
Rob B. (Guest)
on 2007-02-02 18:29
(Received via mailing list)
On Feb 2, 2007, at 11:03 AM, James Edward G. II wrote:

>> irb(main):004:1> end
> A character class ([...]) with a range of 9-1 is not valid in a
> >> num = "10"
> => "10"
> >> num =~ /\A(?:9|1[0123])\Z/
> => 0
> >> num.to_i.between? 9, 13
> => true
>
> Hope that helps.
>
> James Edward G. II

or with a range:

 >> num = "10"
=> "10"
 >> (9..13) === num.to_i
=> true
 >> num = "14"
=> "14"
 >> (9..13) === num.to_i
=> false

You could also have Float values
 >> num = "11.4"
=> "11.4"
 >> (9..13) === num.to_i
=> true
 >> (9..13) === num.to_f
=> true

 >> num = "13.1"
=> "13.1"
 >> (9..13) === num.to_i
=> true
 >> (9..13) === num.to_f
=> false

Since num.to_i truncates "13.1" to 13, the integer check still passes
even though 13.1 lies outside the range.
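
If the truncation is not wanted, one possible approach is to parse the
string strictly before checking the range. The sketch below is only an
illustration: in_range? is a hypothetical helper name, and it assumes
that non-numeric input should simply count as out of range. It relies on
Kernel#Float, which raises on input such as "10abc" instead of silently
returning a partial result the way to_i and to_f do.

```ruby
# Hypothetical helper: strict numeric parse followed by a range check.
# Float() raises ArgumentError for non-numeric strings and TypeError
# for nil, so bad input is reported as "not in range".
def in_range?(text, range = 9..13)
  range.cover?(Float(text))
rescue ArgumentError, TypeError
  false
end

puts in_range?("10")     # true
puts in_range?("11.4")   # true
puts in_range?("13.1")   # false, no truncation to 13
puts in_range?("10abc")  # false, whereas "10abc".to_i would give 10
```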

-Rob

Rob B.    http://agileconsultingllc.com
removed_email_address@domain.invalid
unknown (Guest)
on 2007-02-03 00:30
(Received via mailing list)
On 2/2/07, James Edward G. II <removed_email_address@domain.invalid> wrote:
> encoding.
For comparative reference:

$ perl -e 'print "foo" if 9 =~ /[9-13]/'
Invalid [] range "9-1" in regex; marked by <-- HERE in m/[9-1 <-- HERE
3]/ at -e line 1.

$ python -c "import re; re.match('[9-13]', '9')"
Traceback (most recent call last):
  File "<string>", line 1, in ?
  File "/usr/lib/python2.4/sre.py", line 129, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range
Robert M. (Guest)
on 2007-02-03 00:39
(Received via mailing list)
are there any filterable subscriptions to this board?
Tim X (Guest)
on 2007-02-03 06:55
(Received via mailing list)
"Nick B." <removed_email_address@domain.invalid> writes:

> (irb):2: invalid regular expression: /[9-13]/
>         from (irb):4
>         from :0
> irb(main):005:0>
>
> I have tested it in ruby 1.8 and 0.9.
>
> Anyone else spotted this?
>

It's not a bug. The problem is that you are mixing up characters and
numbers. Regular expressions work on characters: they don't know the
number "13", only the characters 1 and 3. In your regexp, you have a
character range of 9-1 and the character 3. However, 9 is greater than
1, so the range doesn't make sense.

Assuming you want to match on only the numbers 9, 10, 11, 12 and 13, you
have two basic groupings: a single character '9', or two characters in
which the first is 1 and the second is in the range 0-3. A possible
regexp could therefore be

/^(?:9|1[0-3])$/

which says match "9" or "10" or "11" or "12" or "13", but don't put it
into the match variables (that is what the (?:...) group does). Note
that the ^ and $ do matter here: without them the pattern would also
match inside longer strings such as "19" or "130". I always like to get
into the habit of using them where possible anyway, as it anchors the
regexp. If you don't anchor a regexp, you can get really bad performance
due to loads of backtracking. However, this is more applicable when
matching long strings of text; with only a couple of characters, it's
not really an issue.
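
To see what the anchors change for this particular pattern, here is a
small sketch. The constant names, the matches helper and the sample
inputs are invented for illustration, and String#match? assumes Ruby
2.4 or later. One Ruby-specific detail worth knowing: ^ and $ anchor at
line boundaries, while \A and \z anchor at the start and end of the
whole string.

```ruby
LINE_ANCHORED   = /^(?:9|1[0-3])$/    # anchored to any line in the string
STRING_ANCHORED = /\A(?:9|1[0-3])\z/  # anchored to the whole string
UNANCHORED      = /(?:9|1[0-3])/      # may match anywhere

INPUTS = ["9", "13", "19", "130", "foo\n13"].freeze

# Hypothetical helper: the inputs accepted by a pattern.
def matches(pattern, inputs = INPUTS)
  inputs.select { |s| s.match?(pattern) }
end

p matches(UNANCHORED)       # ["9", "13", "19", "130", "foo\n13"]
p matches(LINE_ANCHORED)    # ["9", "13", "foo\n13"]
p matches(STRING_ANCHORED)  # ["9", "13"]
```

The unanchored pattern accepts "19" because it finds the 9, and "130"
because it finds the 13. The line-anchored pattern still accepts a
multi-line string whose last line is "13".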

I remember seeing a post to the perl group some years ago where someone
was saying that using a regexp was causing their computer to hang.
However, it turned out the problem was due to not anchoring the regexp.
The computer wasn't hung, it was just taking a long, long time to
perform the matching. As soon as the expression was anchored, the "hang"
was eliminated.

HTH

Tim