Premature end of regular expression with non-ascii character

Hi,

I’m trying to get regular expressions to work with a string that
contains letters with accents. I have the following sentence:

De kiné weet één van hun patiënten te overtuigen om gekke dingen te
doen.

The regexp /patiënten/ matches the word patiënten. However, when I use
the regexp /kiné/, I get the error ‘premature end of regular expression:
/kiné/ (SyntaxError)’. Can anybody tell me what is going on? Another
issue with the same sentence: when I use the regexp /\s/ to highlight
all the spaces, the space between ‘kiné weet’ is not highlighted as a
space. It seems like regular expressions can’t handle non-ASCII
characters at the end of a string.

Kind regards,

Nick

Nick S. wrote:

Another issue with the same sentence is, when I use the regexp /\s/ to
highlight all the spaces, the space between ‘kiné weet’ is not
highlighted as a space. It seems like regular expressions can’t handle
non-ASCII characters at the end of a string.

I believe this is a character encoding problem, which is fixed in 1.9
by the inclusion of a new regular expression engine (which you can
also download and use in 1.8):

http://www.geocities.jp/kosako3/oniguruma/

Best of luck.
matt.

On Jan 29, 2006, at 3:53 PM, Nick S. wrote:

regexp /kiné/, I get the error 'premature end of regular expression:

Are you using $KCODE = "u" at the top of your script?

Thank you both very much for the suggestions. First off, I have
$KCODE = "u" in config/environment.rb (Rails). I have also tried adding
it to the class, but the error remained.

Secondly, I looked at oniguruma and I must say it looks promising.
Unfortunately for me and my Windows (Cygwin) machine, I have to compile
it into Ruby 1.8.2-1.8.4, and I can’t get it to work. I can’t get 1.8.2
to compile: I solve one error, then hit another, and so on.
Hopeless. I managed to compile 1.8.4, but when I open Ruby I get an
error that a file is missing. I’m using the Windows one-click Ruby
installer, if anybody is wondering how on earth I managed to get Ruby
working :). I could use 1.9.0 because it includes oniguruma. The only
problem is that I don’t know if Rails works with it. I have
contacted the author of oniguruma; maybe he can say conclusively
whether or not oniguruma solves my problem. When I get a response I’ll
post it here. In the meantime, if anybody has any other suggestions,
please let me know. Thanks.

Kind regards,

Nick

Nick S. asked:

I’m trying to get regular expressions to work with a string that
contains letters with accents. …

The regexp /patiënten/ matches the word patiënten. However when I do the
regexp /kiné/, I get the error ‘premature end of regular expression:
/kiné/ (SyntaxError)’. Can anybody tell me what is going on?

You might avoid the syntax error by setting $KCODE = "u" at the start of
your program.
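
For reference, a minimal sketch of what that suggestion looks like in a standalone script. It assumes the script file itself is saved as UTF-8, and it adds the /u regexp flag, which is not mentioned in this thread, as a way to mark individual patterns as UTF-8. The variable names are only for illustration.

```ruby
# -*- coding: utf-8 -*-
# Assumption: this file is saved as UTF-8.

# $KCODE only has an effect on Ruby 1.8, so it is set only there.
$KCODE = "u" if RUBY_VERSION < "1.9"

sentence = "De kiné weet één van hun patiënten te overtuigen om gekke dingen te doen."

# The /u flag marks each pattern as UTF-8, independently of $KCODE.
kine_index  = sentence =~ /kiné/u
space_count = sentence.scan(/\s/u).length

puts "kiné found at index #{kine_index}"
puts "whitespace matches: #{space_count}"
```

If the whitespace count comes out lower than the number of spaces in the sentence, that suggests the bytes in the file are not the UTF-8 that $KCODE or /u promises.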

Another issue with the same sentence is, when I use the regexp /\s/ to
highlight all the spaces, the space between ‘kiné weet’ is not
highlighted as a space. It seems like regular expressions can’t handle
non-ASCII characters at the end of a string.

Ruby strings are made up of bytes, not characters. That’s the cause of
the issues you’re having. There are a couple of recent plugins for Ruby
to help improve the situation (see
http://redhanded.hobix.com/inspect/unicodeLibForRuby18.html) but they’re
far from perfect.

I hope $KCODE can clear up most of your problems, though.
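
The bytes-versus-characters point can be seen in a short sketch. It uses String#encode and String#force_encoding, which exist only in Ruby 1.9 and later, not in the 1.8 releases discussed here. The last step rests on an assumption the thread does not confirm: that the text was stored as Latin-1 while Ruby was told to treat it as UTF-8.

```ruby
# -*- coding: utf-8 -*-
# Sketch for Ruby 1.9 or later; these String methods do not exist in 1.8.

word = "kiné"

# Characters versus bytes: é takes two bytes in UTF-8.
utf8_bytes = word.bytes.to_a          # [107, 105, 110, 195, 169]
puts "characters: #{word.length}, bytes: #{word.bytesize}"

# The same word in Latin-1 stores é as the single byte 0xE9 (233).
latin1       = word.encode("ISO-8859-1")
latin1_bytes = latin1.bytes.to_a      # [107, 105, 110, 233]

# Assumption: those Latin-1 bytes get read as if they were UTF-8.
# 0xE9 is a UTF-8 lead byte that announces two more bytes, but the
# string ends right there, so the sequence is incomplete.
misread = latin1.dup.force_encoding("UTF-8")
puts "valid UTF-8? #{misread.valid_encoding?}"
```

Under that assumption, a byte-oriented engine that expects UTF-8 would look for two more bytes after the é. That would fit both symptoms: /kiné/ ends too early, and the space after ‘kiné’ gets consumed as part of the character. In ‘patiënten’ the ë is followed by enough bytes, so no error appears.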

Cheers,
Dave

Nick S. wrote:

Thank you both very much for the suggestions. First off I have
$KCODE = "u" in config/environment.rb (Rails). I have also tried to add it
into the class. But the error remained.

I haven’t had the issues you’re talking about, because I’m only doing
apps in English, but here are a couple of places you might start to
look for solutions:

http://wiki.rubyonrails.com/rails/pages/HowToUseUnicodeStrings

http://redhanded.hobix.com/inspect/unicodeLibForRuby18.html

I could use 1.9.0 because this includes oniguruma. The only
problem here is that I don’t know if Rails works with it.

Don’t. 1.9.0 isn’t for production, really; it’s an experimental version
which is growing some features that may become part of Ruby 2.0.

Cheers,
Dave
