Detecting Characters other than ASCII (Other than English)

Ibrahim_Mokdad · May 25, 2009, 1:55pm

Dear all
I kinda need help on a project I’m working on, and I’m stuck on the
part where I have to detect any Unicode character in the text file.
Will the regular expression “\w” work?
Thanks in advance

Ibrahim_Mokdad · May 25, 2009, 6:27pm

Ibrahim Mokdad wrote:

Dear all
I kinda need help on a project I’m working on, and I’m stuck on the
part where I have to detect any Unicode character in the text file.
Will the regular expression “\w” work?
Thanks in advance

Why not . (period) in a regular expression? That should do what you
want.

Best,

Marnen Laibow-Koser
http://www.marnen.org

Ibrahim_Mokdad · May 25, 2009, 7:35pm

No /w will not work. And . (period) will not work either.
here is asdjflawæ—¥æœ¬erjocd some text
the japanese within this text looks like this \346\227\245\346\234\254
in unicode.
it would not be matched by /w (letter or number set) or . (any
character).
each of the \ddd sets in the unicode character would be matched by a .
(period)
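
To make the byte-versus-character point concrete, here is a minimal sketch, assuming Ruby 2.0 or later (String#b is not available in the Ruby of this thread's era). The variable names are only for illustration. It shows that the six \ddd escapes are the UTF-8 bytes of two characters:

```ruby
# The octal escapes quoted in the post, kept as a raw byte string.
raw = "\346\227\245\346\234\254".b

# Viewed as bytes there are six units, which is what a byte-oriented
# regex engine would step through one at a time.
byte_count = raw.bytesize

# Reinterpreting the same bytes as UTF-8 gives two characters
# (U+65E5 and U+672C).
text = raw.dup.force_encoding("UTF-8")
char_count = text.length

puts "#{byte_count} bytes, #{char_count} characters"
```

Whether a regex sees bytes or characters therefore depends on the encoding the string (and the pattern) is tagged with.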

I am sure someone has a solution for this, but it is not me.

unicode geniuses HEEEEELLLLLP ください。
tim
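
For anyone landing on this thread later, one possible approach is sketched below. It assumes Ruby 1.9 or later, where strings carry an encoding, and it is not something proposed by the posters. The constant and method names (NON_ASCII, non_ascii?, non_ascii_chars) are hypothetical helpers:

```ruby
# Matches any single character outside the 7-bit ASCII range.
NON_ASCII = /[^\x00-\x7F]/

# True when the text contains at least one non-ASCII character.
def non_ascii?(text)
  !text.ascii_only?
end

# Returns every non-ASCII character found in the text, in order.
def non_ascii_chars(text)
  text.scan(NON_ASCII)
end

# The sample from the post, with the Japanese written as \u escapes.
sample = "asdjflaw\u65E5\u672Cerjocd"

puts non_ascii?(sample)
puts non_ascii_chars(sample).length
```

For a text file, the contents would need to be read with the right encoding first, for example File.read(path, encoding: "UTF-8"), where path is whatever file is being checked. Note that this only separates ASCII from everything else, which may or may not be what "other than English" needs.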

On May 25, 9:27 am, Marnen Laibow-Koser <rails-mailing-l…@andreas-