Forum: Ruby on Rails working with rails and unicode

martin aatmaa (Guest)
on 2006-02-05 05:03
(Received via mailing list)
I'm trying to get basic unicode support working using the "Iteration
A1" sample application from the "Agile Web Development With Rails"
book.

Following the "HowToUseUnicodeStrings" wiki document, I have made the
following changes:

config/environment.rb:
# Include your application configuration below
$KCODE = 'u'
require 'jcode'


admin.rhtml:
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />


database.yml:
development:
  adapter: mysql
  encoding: utf8


application.rb:
class ApplicationController < ActionController::Base
  before_filter :set_charset
  after_filter :fix_unicode_for_safari

  def set_charset
    @headers["Content-Type"] = "text/html; charset=utf-8"
  end

  # automatically and transparently fixes utf-8 bug
  # with Safari when using xmlhttp
  def fix_unicode_for_safari
    if @headers["Content-Type"] == "text/html; charset=utf-8" and
       @request.env['HTTP_USER_AGENT'].to_s.include? 'AppleWebKit' and
       request.xhr?
      @response.body = @response.body.gsub(/([^\x00-\xa0])/u) { |s|
        "&#x%x;" % $1.unpack('U')[0]
      }
    end
  end
end


And finally create.sql:
) ENGINE=MyISAM DEFAULT CHARSET=utf8


The last step was not mentioned in the wiki guide, but was
nevertheless required in my testing.
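
For anyone who wants to see what the substitution in
fix_unicode_for_safari does on its own, here is a minimal sketch outside
of Rails. It assumes a modern Ruby (1.9 or later) with UTF-8 strings, so
the character class uses \u escapes instead of the \x bytes and /u flag
shown above. The method name escape_non_ascii is made up for
illustration and is not part of Rails.

```ruby
# Sketch only: replace every character above U+00A0 with a hexadecimal
# numeric character reference, mirroring the gsub in fix_unicode_for_safari.
# escape_non_ascii is a hypothetical name, not part of Rails.
def escape_non_ascii(html)
  html.gsub(/[^\u0000-\u00a0]/) { |ch| "&#x%x;" % ch.unpack("U")[0] }
end

puts escape_non_ascii("caf\u00e9")        # caf&#xe9;
puts escape_non_ascii("<p>\u65e5</p>")    # <p>&#x65e5;</p>
```

Plain ASCII passes through untouched, so markup is left alone and only
the characters that Safari mishandled are rewritten.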

So, having performed the above steps, I can now successfully view,
add, and edit entries with Unicode text (try, for example, inserting
Chinese characters).

I run into problems, however, when I try to import existing unicode
data into the mysql table. I have a utf8 encoded sql query file that
inserts some unicode records into the table. When I try to view these
new records with the admin interface of the sample application, I get
garbage instead of the correct unicode characters. Editing a record
and pasting the correct unicode strings and updating again (all
through the interface) works correctly.

At this point I'm not sure what is causing the problem. It seems
that any data handled outside of the Rails application is treated
incorrectly once inside Rails.

Has anyone had any experience with this?

Cheers, M
Paul Butcher (paulbutcher)
on 2006-02-05 12:04
martin aatmaa wrote:
> And finally create.sql:
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8

This step isn't necessary if you set the default character set for your
schema to utf8:

CREATE DATABASE mydatabase DEFAULT CHARSET utf8

Now every table you create will use utf8 unless you specify otherwise.

> I run into problems, however, when I try to import existing unicode
> data into the mysql table. I have a utf8 encoded sql query file that
> inserts some unicode records into the table. When I try to view these
> new records with the admin interface of the sample application, I get
> garbage instead of the correct unicode characters. Editing a record
> and pasting the correct unicode strings and updating again (all
> through the interface) works correctly.

My guess is that your "utf8 encoded sql" is not inserting the data
correctly. If you view the data within the MySQL Query Browser, does it
appear as you would expect? (I suspect you'll see the same garbled data
that you see in the admin interface.)

The character set handling within MySQL is complicated to say the least.
It took me quite a lot of reading the documentation and experimenting to
work it out. If your problem is what I think it is, then MySQL thinks
that your utf8 encoded SQL is actually ISO-8859-1 (latin1) encoded. You
can tell it that it's utf8 by adding this line to the top of your SQL:

SET NAMES utf8;

It's worth reading the MySQL documentation about character sets
carefully. The way that it works isn't obvious (to me, at least!).
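
To make the failure mode concrete, here is a small sketch of what
happens when UTF-8 bytes are taken to be ISO-8859-1 and then converted
to UTF-8 for storage. It is written for a modern Ruby (1.9 or later) and
uses Ruby's own ISO-8859-1 transcoder as a stand-in for MySQL's latin1,
which is an assumption (the two are not byte-for-byte identical). The
sample string is invented.

```ruby
original = "caf\u00e9"                     # "café", 5 bytes in UTF-8

# Pretend the UTF-8 bytes are ISO-8859-1, then transcode them to UTF-8,
# roughly what happens when the client character set is declared wrongly.
misread = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts misread            # cafÃ©
puts misread.length     # 5 (the original has 4 characters)

# The damage is reversible as long as no bytes were dropped on the way.
repaired = misread.encode("ISO-8859-1").force_encoding("UTF-8")
puts repaired == original   # true
```

Each non-ASCII character turning into two or three odd-looking ones is
the usual sign of this kind of mix-up.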

Hope this is some help,

paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?
martin (Guest)
on 2006-02-19 19:15
(Received via mailing list)
Paul, thank you for the response.

Your suggestion that it is a MySQL problem turned out to be true. It
seems that I am using the command-line mysql client incorrectly: its
"source" command somehow misreads my SQL script (located in a file).

For the time being I run the script from mySQLFront, which handles the
data correctly.
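
If the script ever has to go through the command-line client again, one
option is to make the file declare its own encoding, as Paul suggested,
so that the import does not rely on client defaults. A rough Ruby sketch
follows. The helper name, the table, and the sample statement are all
made up, and it assumes the client character set really was the cause.

```ruby
# Sketch only: prepend "SET NAMES <charset>;" to a SQL script unless the
# script already starts with one. with_set_names is a hypothetical helper.
def with_set_names(sql, charset = "utf8")
  return sql if sql =~ /\A\s*SET\s+NAMES\b/i
  "SET NAMES #{charset};\n#{sql}"
end

# Invented sample statement, for illustration only.
script = "INSERT INTO products (title) VALUES ('caf\u00e9');\n"
puts with_set_names(script)
```

Running the helper twice gives the same result as running it once, so
it is safe to apply to a script that may already have the line.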

Cheers,
Martin