UTF8 strings in models, views and controllers

Hi,

I have set up MySQL to use Unicode and everything works fine as far as
data is concerned. Whenever I try to change the scaffold view for a
model, and replace English titles or field names for example with
Greek ones, I get question marks instead of the Greek letters. If I
replace a string inside a model or a controller with Greek letters,
the code doesn’t run and I get strange errors from the Ruby interpreter.

What should I do? Is there a special configuration option or a way to
tell Rails that I want my files in Unicode?

Thanks,
Petros

Hello Petros,

2007/3/20, Petros A. [email protected]:

What should I do? Is there a special configuration option or a way to
tell Rails that I want my files in Unicode?

Hmmm, that’s strange. I have used French accented characters in my
views, and they worked fine.

For working with strings, you should look at the
ActiveSupport::Multibyte::Chars module.

Hope that helps !

François Beausoleil
http://blog.teksol.info/
http://piston.rubyforge.org/

On 3/20/07, François Beausoleil [email protected] wrote:

Hello Petros,

2007/3/20, Petros A. [email protected]:

What should I do? Is there a special configuration option or a way to
tell Rails that I want my files in Unicode?

Hmmm, that’s strange. I have used French accented characters in my
views, and they worked fine.

That’s because French fits in Latin-1, which is basically the default
character set anyway. Cyrillic/Greek is a different character set.

For working with strings, you should look at the
ActiveSupport::Multibyte::Chars module.

Yes, Unicode support is one of the major features added in Rails 1.2.
If you are not using Rails 1.2, there are still ways to achieve
effective UTF-8 support. There are multiple articles about this on the
main Ruby on Rails site.
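
To make the byte-versus-character issue concrete, here is a minimal sketch in plain Ruby. It assumes Ruby 1.9 or later, where String is encoding-aware; the Ruby 1.8 strings used in this thread are plain byte sequences, which is the gap ActiveSupport::Multibyte::Chars is meant to fill. The sample word and variable names are only illustrative.

```ruby
# Assumes Ruby 1.9.3 or later (encoding-aware String, String#byteslice).
# Escapes are used so the example does not depend on this file's own encoding.
word = "\u0395\u03BB\u03BB\u03B7\u03BD\u03B9\u03BA\u03AC" # the Greek word for "Greek"

byte_count  = word.bytesize         # counts UTF-8 bytes
char_count  = word.length           # counts characters
first_three = word[0, 3]            # slices by character
raw_slice   = word.byteslice(0, 3)  # slices by byte and cuts a character in half

puts byte_count
puts char_count
puts first_three
puts raw_slice.valid_encoding?
```

Each Greek letter takes two bytes in UTF-8, so any code that treats the string as an array of bytes will count, slice, or truncate it in the wrong place.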

I would check those out, Petros. Some of the things you were experiencing
were quite unusual (like the interpreter whining about it).

regards,
Richard.

On 3/20/07, Petros A. [email protected] wrote:

This is the first time I am dealing with these issues, so forgive me
for asking stupid questions.

There are no stupid questions when it comes to unicode.

Could the file format that Rails generates be the problem? If I try to
convert the file to Unicode using ConText, everything is displayed as
question marks, even the English text. Should I ever have to mess with
the file format?

You should be using UTF-8, not "Unicode" (which in Windows editors
generally means 16-bit Unicode, i.e. UTF-16).
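
As a rough illustration of why the two are not interchangeable, the sketch below encodes the same text both ways and compares the bytes. It assumes Ruby 1.9 or later and that the editor's "Unicode" option means little-endian UTF-16, which is an assumption about ConText rather than something confirmed in this thread.

```ruby
# Assumes Ruby 1.9+; "UTF-16LE" stands in for what Windows editors call "Unicode".
greek = "\u03B1\u03B2\u03B3" # alpha, beta, gamma
ascii = "abc"

greek_utf8  = greek.encode("UTF-8").bytes
greek_utf16 = greek.encode("UTF-16LE").bytes
ascii_utf8  = ascii.encode("UTF-8").bytes
ascii_utf16 = ascii.encode("UTF-16LE").bytes

puts greek_utf8.inspect   # [206, 177, 206, 178, 206, 179]
puts greek_utf16.inspect  # [177, 3, 178, 3, 179, 3]
puts ascii_utf8.inspect   # [97, 98, 99]
puts ascii_utf16.inspect  # [97, 0, 98, 0, 99, 0]
```

The last line is the interesting one: in UTF-16 even plain English text has a zero byte after every letter, so a program that expects UTF-8 or ASCII will garble it too.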

It seems to me that you have issues with your scaffold-generated code,
i.e. it is not generating UTF-8 source files. The Windows command shell
may be messing about here. I am not saying it’s the problem, but
don’t discount it.

There are quite a few things you have to do to get UTF-8 working right.
I don’t have the full list, but the one thing that is practically a
requirement for success is that you have round-trip UTF-8. Basically,
you do no charset conversion.
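
One way to check that a file is still UTF-8 after an editor has touched it is to test whether its raw bytes form valid UTF-8. This is a minimal sketch, assuming Ruby 1.9 or later; valid_utf8? is a hypothetical helper name, not part of Rails.

```ruby
# Assumes Ruby 2.0+ for String#b; valid_utf8? is a hypothetical helper.
def valid_utf8?(bytes)
  # Work on a copy so the caller's string keeps its own encoding tag.
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end

puts valid_utf8?("\xCE\xB1".b)  # alpha encoded as UTF-8
puts valid_utf8?("\xB1\x03".b)  # alpha encoded as UTF-16LE
puts valid_utf8?("\xE1".b)      # alpha in a single-byte Greek code page
```

For a real file, read it in binary mode first, for example with File.binread(path), and pass the result to the helper. Note that pure ASCII text passes as well, since ASCII is a subset of UTF-8.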

I would check out the full list on the Rails website.

regards,
Richard.

Make sure ConText isn’t doing weird stuff to your files. Try opening
the file in Notepad; the Windows XP version does UTF-8 correctly. If
you still see question marks then ConText is mucking up your files.

If you see it showing up fine in Notepad, then double check your
browser settings. Make sure the encoding you’re using is really
UTF-8. You can check this in FF2 under View -> Character Encoding.

My best guess at the culprit here is ConText…

Thank you for the links and suggestions. I have read various articles
about this issue, but unfortunately I have a more basic problem. Let
me give some information about my environment. My machine runs Windows
XP. I have installed Ruby 1.8.5 and Rails 1.2. When I use
script/generate to create a scaffold, the files are created in Unix
format. I use the ConText editor to change an rhtml view, for example
the title, which is straightforward:

Spareparts

to

Some Greek word that means Spareparts

When I go to the browser and bring up that view, the Greek is not
displayed and instead I see question marks. If I check the response
headers, I can see that the charset is UTF-8, and the browser detects
that too. This is because I used Application.rb to set up a filter that
sets the charset to UTF-8 for each view of the application.

This is the first time I am dealing with these issues, so forgive me
for asking stupid questions. Could the file format that Rails generates
be the problem? If I try to convert the file to Unicode using ConText,
everything is displayed as question marks, even the English text.
Should I ever have to mess with the file format?

Petros

Thank you all for the suggestions and help!!!

eden li and the rest, you were right. The problem was with ConText. The
generated files are in neither UTF-8 nor Unicode, so I tried to convert
them, but I didn’t know that "Unicode" in the editor is not UTF-8.
ConText converts only to Unicode. I switched to SciTE, which converts
to UTF-8, and everything displays OK! It would be nice if the generator
itself generated UTF-8 files. Maybe when my level goes up from newbie
to hacker I can take a shot at creating a generator that takes care of
this.
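
For anyone who ends up with files already saved as "Unicode", the conversion SciTE does can be sketched in a few lines of Ruby. This assumes Ruby 2.0 or later and that the editor wrote UTF-16 with a byte order mark; to_utf8 is a hypothetical helper name, and anything without a BOM is simply assumed to be UTF-8 already.

```ruby
# Assumes Ruby 2.0+; to_utf8 is a hypothetical helper, not a Rails API.
def to_utf8(raw)
  bytes = raw.dup.force_encoding("BINARY")
  if bytes.start_with?("\xFF\xFE".b)
    # Little-endian UTF-16 BOM: drop the two BOM bytes, then transcode.
    bytes[2..-1].force_encoding("UTF-16LE").encode("UTF-8")
  elsif bytes.start_with?("\xFE\xFF".b)
    # Big-endian UTF-16 BOM.
    bytes[2..-1].force_encoding("UTF-16BE").encode("UTF-8")
  else
    # No BOM: assume the bytes are already UTF-8 and only retag them.
    bytes.force_encoding("UTF-8")
  end
end

converted = to_utf8("\xFF\xFE\xB1\x03\xB2\x03".b)
puts converted.bytes.inspect
```

To apply it to a view, something like File.binwrite(path, to_utf8(File.binread(path))) would do, ideally on a copy of the file first.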

I can now see Greek.

Thanks again,
Petros