UTF-8 encoding problem

Hi,
I am retrieving a string from a txt file.
The file contains some UTF-8 characters.

I am comparing these characters against a default string.

The problem is that some of the characters are not stored in a default
format.

For example:
A is stored as Ａ

Naturally, when I compare the characters it fails.
Strangely, when I unpacked the character it appears as 65313, which is the
correct Unicode code point for Ａ.

Any way around this?

thanks.

On Jun 25, 2009, at 14:29, Ad Ad wrote:

A is stored as Ａ

Naturally, when I compare the characters it fails.
Strangely, when I unpacked the character it appears as 65313, which is the
correct Unicode code point for Ａ.

Any way around this?

Well, Ａ is “Fullwidth Latin Capital Letter A” from the “Halfwidth and
Fullwidth Forms” block (U+FF21), whereas A is “Latin Capital Letter A”
from the “Basic Latin” block (U+0041).

I don’t know of a way to translate between the two blocks, but
maybe that will help.
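
For what it's worth, here is a minimal sketch of why the comparison fails, plus one possible way to translate between the two. It assumes Ruby 2.2 or later, where String#unicode_normalize is available (it did not exist when this thread was written), and the variable names are made up for illustration. NFKC normalization folds the fullwidth compatibility forms into their Basic Latin equivalents, but it also rewrites other compatibility characters, so it may change more than you want.

```ruby
# Assumes Ruby 2.2+ for String#unicode_normalize.
fullwidth = "\uFF21"  # Fullwidth Latin Capital Letter A
plain     = "A"       # Latin Capital Letter A (U+0041)

# The two strings hold different code points, so == is false.
fullwidth_codes = fullwidth.unpack("U*")  # [65313], which is 0xFF21
plain_codes     = plain.unpack("U*")      # [65]
equal_as_is     = (fullwidth == plain)    # false

# NFKC maps the fullwidth form to its Basic Latin counterpart.
normalized       = fullwidth.unicode_normalize(:nfkc)
equal_normalized = (normalized == plain)  # true

puts fullwidth_codes.inspect, plain_codes.inspect
puts equal_as_is, equal_normalized
```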

Although I haven’t tried it myself, I did a search for
全角半角変換 (full-width/half-width conversion) and
found these pages.
It appears people use jcode and tr to solve this problem.

http://www.eml.ele.cst.nihon-u.ac.jp/~momma/wiki/wiki.cgi/Ruby/全角半角変換.html
http://blog.grayproductions.net/articles/the_kcode_variable_and_jcode_library

2009/6/25 Eric H.:

James Rubingh wrote:

Although I haven’t tried it myself, I did a search for
全角半角変換 (full-width/half-width conversion) and
found these pages.
It appears people use jcode and tr to solve this problem.

http://www.eml.ele.cst.nihon-u.ac.jp/~momma/wiki/wiki.cgi/Ruby/全角半角変換.html
http://blog.grayproductions.net/articles/the_kcode_variable_and_jcode_library

Brilliant!
str.tr!('ａ-ｚＡ-Ｚ', 'a-zA-Z') worked like a charm. :)
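
In case it is useful to someone, here is a sketch of the same tr approach wrapped in a helper. The method name to_halfwidth and the two constants are hypothetical, and this assumes Ruby 1.9 or later, where tr handles multibyte ranges without jcode. It covers the fullwidth letters from the post plus the digits (the digits are my addition); punctuation and the ideographic space are left alone.

```ruby
# Hypothetical helper; assumes Ruby 1.9+ and UTF-8 strings.
# Escapes are used so the ranges survive any editor encoding:
#   \uFF41-\uFF5A  fullwidth a-z
#   \uFF21-\uFF3A  fullwidth A-Z
#   \uFF10-\uFF19  fullwidth 0-9
FULLWIDTH = "\uFF41-\uFF5A\uFF21-\uFF3A\uFF10-\uFF19"
HALFWIDTH = "a-zA-Z0-9"

def to_halfwidth(str)
  # tr returns a new string; tr! would modify str in place
  # and return nil when nothing changed.
  str.tr(FULLWIDTH, HALFWIDTH)
end

sample = "\uFF32\uFF55\uFF42\uFF59\uFF11"  # fullwidth "Ruby1"
puts to_halfwidth(sample)                   # prints "Ruby1"
puts to_halfwidth(sample) == "Ruby1"        # prints "true"
```

Using tr instead of tr! keeps the original string intact, which is handy if you still need the text as it was read from the file.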