Strange unicode regex behavior with Ruby 2.0

Hi,

the following code behaves strangely in Ruby 2.0, and differently from
1.9:

puts "ä".match(/[\p{Word}]/).inspect
puts "ä".match(/[\p{Word}\s]/).inspect

Result on 1.9.3-p194:
#<MatchData "ä">
#<MatchData "ä">

Result on 2.0.0-p0:
#<MatchData "ä">
nil

Any ideas what’s going on there?

I have attached the ruby code as a file, in case there are any problems
with email charset conversion.

Thanks!
Andreas

If a bug was introduced, it appears to have been fixed already:

$ ruby -v
ruby 2.1.0dev (2013-03-25 trunk 39928) [x86_64-linux]
$ ruby test_regex.rb
#<MatchData "ä">
#<MatchData "ä">

Carlo

Hey Andreas,

Lately, there have been some discussions on ruby-core (the mailing list
dedicated to the core implementers of MRI). It’s possible that this bug
is being addressed at the moment. These are the most recent messages there:

http://blade.nagaokaut.ac.jp/ruby/ruby-core/53601-53800.shtml#latest

I don’t really know what could be causing this difference. :(


Carlos A.
Skype: carlos.agarie

Control engineering
Polytechnic School, University of São Paulo, Brazil
Computer engineering
Embry-Riddle Aeronautical University, USA

2013/3/26 Andreas S.

Thanks! Guess I will use a workaround for now and wait for 2.1.
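
For anyone hitting the same thing, here is a minimal sketch of one possible workaround. It assumes the problem only shows up when \p{Word} and \s are combined inside a single bracket expression, which is all the results above show, and it has not been verified on 2.0.0-p0. The idea is to use an alternation instead of one character class. The names WORD_OR_SPACE and word_or_space? are made up for illustration, and the "\u00E4" escape stands in for "ä" to sidestep any charset conversion issues.

```ruby
# encoding: utf-8

# Hypothetical workaround: match "word character OR whitespace" via
# alternation rather than the combined class /[\p{Word}\s]/.
WORD_OR_SPACE = /(?:\p{Word}|\s)/

# Returns true if the string contains at least one word or whitespace character.
def word_or_space?(str)
  !WORD_OR_SPACE.match(str).nil?
end

puts word_or_space?("\u00E4")  # "ä", a Unicode letter
puts word_or_space?(" ")       # plain space
puts word_or_space?("-")       # hyphen: neither a word character nor whitespace
```

On a Ruby version without the bug, this should behave the same as the original /[\p{Word}\s]/ pattern, so it can be swapped back once 2.1 is available.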