Hi there!
I've run into a curious issue with regular expressions: my locale is
es_ES.utf-8 and I use gsub with a regular expression to turn an
article
title into a permalink, like this:
permalink.gsub!(/\W/, '-')
What's the problem? On my local system I get this:
"¿" =~ /\W/
=> 0
but on other systems with an English locale I get a different result:
"¿" =~ /\W/
=> nil
Is this a bug, or does regular expression matching depend on the system locale?
(1) What exact version(s) of Ruby are you running? (Show
RUBY_DESCRIPTION constant). Behaviour varies between versions.
(2) What does
"¿".encoding
show on the two machines?
AFAIK the actual match should depend only on the encoding of the string,
not the system locale in the environment - but if you find differently
that would be of interest.
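To illustrate what I mean by the match depending on the string's encoding, here is a minimal sketch for Ruby 1.9 or later. The helper name describe_match is made up for this example, and the "\u00BF" escape is used instead of a literal "¿" so the result does not depend on the source file's encoding. The same two bytes are matched once as UTF-8 and once as raw binary:

```ruby
# Hypothetical helper (not part of Ruby): report the string's encoding,
# where /\W/ first matches, and how many bytes the matched text occupies.
def describe_match(str)
  matched = str[/\W/]
  {
    encoding: str.encoding,
    index: str =~ /\W/,
    matched_bytes: matched && matched.bytesize
  }
end

utf8   = "\u00BF"                                   # "¿" as a UTF-8 string
binary = utf8.dup.force_encoding(Encoding::BINARY)  # same bytes, no character semantics

p describe_match(utf8)
p describe_match(binary)
```

On the Ruby versions I would expect this to behave the same regardless of LANG or LC_ALL: with UTF-8 the regexp consumes the whole two-byte character, while with the binary encoding it consumes a single byte. If your two machines print different results for the same script, that would point at the Ruby version rather than the locale.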
(3) It looks like you are doing this in IRB. IRB is not a good predictor
of behaviour in ruby 1.9, since the encoding of string literals in IRB
depends on the system locale - which is not true for ruby source code in
source files.
So writing a small test .rb file and running that is probably better.
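Something along these lines would do as a test file. This is only a sketch: the title string is an invented example, and the magic comment on the first line assumes you save the file as UTF-8. Run it with "ruby" on both machines and compare the output:

```ruby
# encoding: utf-8
# Save as a standalone .rb file (encoded as UTF-8) and run it outside IRB.

title = "¿Qué tal?"   # invented sample title, not from the original post

puts RUBY_DESCRIPTION
puts "literal encoding: #{title.encoding}"
puts "default_external: #{Encoding.default_external}"   # this one follows the locale
puts "match index:      #{("¿" =~ /\W/).inspect}"
puts "permalink:        #{title.gsub(/\W/, '-')}"
```

The default_external line is there because that value is taken from the environment, so it may well differ between the two systems even when the literal encoding and the match result are identical.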
I’ve attempted to document what I’ve found so far about encoding
behaviour in 1.9 here: