I’ve just installed acts_as_ferret, and am trying to build my index, but
I’m getting the following error:
r = Topic.find_by_contents('testing')
StandardError: : Error occured at <analysis.c>:704
Error: exception 2 not handled: Error decoding input string. Check that
you have the locale set correctly
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:227:in `<<'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:227:in `rebuild_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:227:in `rebuild_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:247:in `create_index_instance'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:240:in `ferret_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:325:in `find_id_by_contents'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:262:in `find_by_contents'
from (irb):1
I’m using the current version of ferret (gem install ferret) and
acts_as_ferret (script/plugin install
svn://projects.jkraemer.net/acts_as_ferret/tags/plugin/stable/acts_as_ferret)
as of today, 7/6/06.
I have tried setting my locale in environment.rb as mentioned here:
http://projects.jkraemer.net/acts_as_ferret/wiki/TypoWithFerret (note:
I’m not using Typo, but the locale note at the bottom seems to apply).
So in the Rails::Initializer.run block, I’ve put this line:
ENV['LANG'] = 'en_US.UTF-8'
It didn’t make a difference. Any other ideas?
Thanks,
Ian.
Hi Ian,
On Thu, Jul 06, 2006 at 11:58:57PM +0200, Ian Z. wrote:
> I’ve just installed acts_as_ferret, and am trying to build my index, but
> I’m getting the following error:
> r = Topic.find_by_contents('testing')
> StandardError: : Error occured at <analysis.c>:704
> Error: exception 2 not handled: Error decoding input string. Check that
> you have the locale set correctly
> […]
> I have tried setting my locale in environment.rb as mentioned here
> http://projects.jkraemer.net/acts_as_ferret/wiki/TypoWithFerret (note:
> i’m not using typo, but the locale note at the bottom seems to apply).
> So in the Rails::Initializer.run block, I’ve put this line:
> ENV['LANG'] = 'en_US.UTF-8'
> Didn’t make a difference. Any other ideas?
I put this statement at the very top of the file, outside of the block.
Maybe that will do the trick.
You also should make sure the locale exists on your system. On
a Debian-based system, you could do
dpkg-reconfigure locales
and make sure the box before “en_US.UTF-8” is ticked.
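As a rough sketch of the placement described above: the assignment goes on the first lines of config/environment.rb, outside the Rails::Initializer.run block, presumably so that it is already set when the plugin and Ferret are loaded. Everything other than the ENV line is assumed here and not taken from a real application.

```ruby
# config/environment.rb (sketch; only the ENV['LANG'] line comes from the thread)

# Set the locale first, before the Rails boot code and before
# the Rails::Initializer.run block further down in this file.
ENV['LANG'] = 'en_US.UTF-8'

# ... the usual Rails boot code and Rails::Initializer.run block follow here
```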
Hope this helps,
Jens
–
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
Thanks for the response!
> I put this statement at the very top of the file, outside of the block.
> Maybe that will do the trick.
> You also should make sure the locale exists on your system. On
> a Debian-based system, you could do
> dpkg-reconfigure locales
> and make sure the box before “en_US.UTF-8” is ticked.
I determined with locale -a that the locale on the box is called
“en_US.utf8”, so I added ENV['LANG'] = 'en_US.utf8' at the top of my
environment.rb (right after ENV['RAILS_ENV'] ||= 'production').
I’m still getting the same error: “Error decoding input string. Check that
you have the locale set correctly”
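Since the same locale can be spelled “en_US.UTF-8” or “en_US.utf8” depending on the system, here is a small sketch for finding the installed spelling of a wanted locale. The helper names and the sample list are made up for illustration; on a real box the list could come from the output of locale -a.

```ruby
# Normalize a locale name so that "en_US.UTF-8" and "en_US.utf8" compare equal.
def normalize_locale(name)
  lang, codeset = name.split('.', 2)
  return lang if codeset.nil?
  "#{lang}.#{codeset.downcase.delete('-')}"
end

# Return the installed spelling of the wanted locale, or nil if it is missing.
def find_installed_locale(wanted, installed)
  installed.find { |name| normalize_locale(name) == normalize_locale(wanted) }
end

# Sample data; a real list could be read with %x(locale -a).split("\n").
installed = ['C', 'POSIX', 'en_US', 'en_US.iso88591', 'en_US.utf8']

puts find_installed_locale('en_US.UTF-8', installed)  # en_US.utf8
```

A nil result would mean the locale still has to be generated on the system before it can be used in ENV['LANG'].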

It may be worth noting that it seems to only be a problem with this
particular model. I am able to index a different model without any
issues. So it’s gotta be something with the data.
I noticed that my InnoDB topics table was set to latin1 charset, so I
changed it to utf8. I still get the same error.
Not sure where to go next.
Ian.
On Fri, Jul 07, 2006 at 05:29:50AM +0200, Ian Z. wrote:
> I determined with locale -a that the locale on the box is called
> […]
> issues. So it’s gotta be something with the data.
> I noticed that my InnoDB topics table was set to latin1 charset, so I
> changed it to utf8. I still get the same error.
IMHO, changing the default charset of a table doesn’t change the encoding
of the data already stored in it. So what you get from your DB is still
latin1.
> Not sure where to go next.
The ENV['LANG'] value has to correspond to the encoding of the data you
want to index, so if your data is latin1, Ferret needs to run with a
matching locale, i.e. an ISO-8859-1 one.
In such cases I dump the data as text, convert it to UTF-8 (usually
with vim, :set fileencoding=utf8), re-create the table with DEFAULT
CHARSET=utf8 and re-import the data.
With large data sets other solutions might be more efficient, though.
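To find out which encoding the stored data is really in, one option is to test whether the raw bytes are well-formed UTF-8. The sketch below assumes Ruby 2.0 or later (String#b, String#force_encoding and String#valid_encoding? do not exist in the Ruby 1.8 of this thread), and the helper name valid_utf8? is made up for illustration.

```ruby
# Assumes Ruby 2.0 or later; Ruby 1.8 strings carry no encoding information.

# True if the raw bytes of str are well-formed UTF-8.
def valid_utf8?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

latin1_bytes = "Kr\xE4mer".b      # "Krämer" as ISO-8859-1 bytes
utf8_bytes   = "Kr\xC3\xA4mer".b  # the same name as UTF-8 bytes

puts valid_utf8?(latin1_bytes)  # false
puts valid_utf8?(utf8_bytes)    # true
```

Pure ASCII text passes this check under either encoding, so only values containing non-ASCII characters say anything about how the table is encoded.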
Jens
On 7/7/06, Jens K. wrote:
> The ENV['LANG'] value has to correspond to the encoding of the data you
> want to index, so if your data is latin1, Ferret needs to run with such
> a locale, i.e. ISO-8859-1.
> In such cases I dump the data as text, convert to utf8 (usually
> with vim :set fileencoding=utf8), re-create the table with DEFAULT
> CHARSET=utf8 and re-import the data.
> With large data sets other solutions might be more efficient, though.
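Here’s one way you can convert ISO-8859-1 to UTF-8:
  # Map each ISO-8859-1 byte to its UTF-8 byte sequence
  # (assumes byte-oriented strings, as in Ruby 1.8).
  str = str.unpack("C*").map {|c|
    if c < 0x80
      next c.chr                  # ASCII: unchanged
    elsif c < 0xC0
      next "\xC2" + c.chr         # 0x80..0xBF: lead byte 0xC2, same trail byte
    else
      next "\xC3" + (c - 64).chr  # 0xC0..0xFF: lead byte 0xC3, trail byte c - 0x40
    end
  }.join("")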
That may help.
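On Ruby 1.9 or later the same conversion can be done with the built-in transcoding support rather than a byte loop. This is only a sketch under that assumption (it will not run on the Ruby 1.8 used in this thread), and the helper name latin1_to_utf8 is made up for illustration.

```ruby
# Assumes Ruby 2.0 or later (String#encode is built in, String#b for the sample).

# Reinterpret the bytes of str as ISO-8859-1 and transcode them to UTF-8.
def latin1_to_utf8(str)
  str.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

latin1 = "Schnorrstra\xDFe".b  # "Schnorrstraße" as ISO-8859-1 bytes (13 bytes)
utf8   = latin1_to_utf8(latin1)

puts utf8.encoding      # UTF-8
puts utf8.bytes.length  # 14
```

The result has one byte more than the input because the single byte 0xDF becomes the two-byte sequence 0xC3 0x9F, which is what the third branch of the loop above produces as well.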
Cheers,
Dave
> The ENV['LANG'] value has to correspond to the encoding of the data you
> want to index, so if your data is latin1, Ferret needs to run with such
> a locale, i.e. ISO-8859-1.
> In such cases I dump the data as text, convert to utf8 (usually
> with vim :set fileencoding=utf8), re-create the table with DEFAULT
> CHARSET=utf8 and re-import the data.
So I tried to get this to work in MANY different ways. I converted the
encoding with vim, iconv, and mysqldump/import. I changed the table
types. I tried a snippet from textsnippets.com, and I tried the article
“Converting MySQL Database Contents to UTF-8” on Climb to the Stars.
No matter what I try, I get the same error when I set ENV['LANG'] in my
environment.rb to en_US.utf8. If I set it to en_US.iso88591, everything
works fine.
If I could successfully convert my database to utf8 AND get it to work
with ferret, I would love to. But I just can’t get it. So… I think
I’m going to stick with latin1 for now. 
Thanks for your help, guys!
Ian.