Find_by_ferret and UTF-8


#1

Hi all,

I have scoured the Internet, and it’s possible I read the answer without
realizing it (I am a beginning programmer), so if someone could explain
this it would be much appreciated. The problem is that I am having trouble
finding a UTF-8 term with the find_with_ferret method.

I am using
ruby 1.8.6 (2008-03-03 patchlevel 114) [x86_64-linux]
Rails 2.2.2
ferret (0.11.6)
acts_as_ferret (0.4.3)

In my controller I have:
@results = Product.find_with_ferret(params[:query])
I have a single product called “sony 数”.
In my model I have:
acts_as_ferret :fields => { :name => { :store => :yes } }

I am able to use my Rails app to obtain a successful hit on the index
when :query => “sony” but no luck when :query => “数”.

Debug steps:

  1. Using ferret-browser, I have confirmed that this document is in
    the index and so are the terms “sony” and “数”. So I know it’s not a
    ferret issue. This leaves Rails and acts_as_ferret.

  2. When I switch my controller to
    @results = [Product.find_by_name(params[:query])]
    I am able to find the product, so it seems it is not a Rails issue.

  3. A bunch of blogs say to use $KCODE = 'u' and require 'jcode' in
    environment.rb, but they seem to be outdated: it appears that
    $KCODE = 'u' is already hardcoded into initializer.rb. Also, isn’t the
    Chars class in Rails supposed to wrap String nicely, so that jcode
    isn’t needed anymore?

  4. If the answer lies at http://rm.jkraemer.net/wiki/aaf#13
    I can’t tell what I am supposed to do. The entries there seem like
    solutions to problems that are not similar to mine.

  5. Lastly, judging by the examples here
    http://blog.grayproductions.net/articles/the_kcode_variable_and_jcode_library
    Ruby 1.8.6 does seem to recognize UTF-8, so I am not sure I
    understand why I would have to use 1.9.
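
For anyone wanting to repeat the check in step 5, here is a minimal sketch of how one might confirm that Ruby decodes the search term as a single UTF-8 character. It is only an illustration: the variable names are made up, the term is written as byte escapes so the snippet does not depend on the source file's encoding, and it says nothing about what acts_as_ferret does with the string.

```ruby
# The three UTF-8 bytes of the search term (E6 95 B0), written as escapes.
term = "\xE6\x95\xB0"

raw_bytes   = term.unpack('C*')   # one integer per byte
code_points = term.unpack('U*')   # decode the bytes as UTF-8 code points
characters  = term.scan(/./mu)    # split into characters with a UTF-8 regexp

puts raw_bytes.inspect     # [230, 149, 176]
puts code_points.inspect   # [25968]
puts characters.length     # 1
```

Three bytes decoding to one code point (U+6570) suggests the string itself is well-formed UTF-8 at this point, so the mismatch would have to come from somewhere later in the query path.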

FYI, 数 is %E6%95%B0 when its UTF-8 bytes are percent-encoded, and I tried
passing :query => “%E6%95%B0” with no luck either.
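
As a hedged illustration of that percent-encoding, the sketch below uses CGI.escape and CGI.unescape from Ruby's standard library to go from the raw bytes to %E6%95%B0 and back. It assumes, without having verified it for this particular setup, that the query string has normally been percent-decoded before params[:query] is read, in which case the literal text “%E6%95%B0” would be searched for as-is rather than as the character.

```ruby
require 'cgi'

term = "\xE6\x95\xB0"   # UTF-8 bytes of the search term

escaped   = CGI.escape(term)            # percent-encode each byte
unescaped = CGI.unescape("%E6%95%B0")   # decode back to the raw bytes

puts escaped                                        # %E6%95%B0
puts unescaped.unpack('C*') == term.unpack('C*')    # true
```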

Thank you,
Richard