Varying case sensitivity

Hi all,

I’m using ferret 11.4 together with acts_as_ferret and I’ve indexed the
geonames.org country files. These files contain locations worldwide in
UTF-8, each with all of its different spellings.

Model definition is like this:

class Location < ActiveRecord::Base
  acts_as_ferret :fields => {:location_names => {}}, :single_index => true
end

The instance method location_names returns a string containing all the
different UTF-8 encoded spellings for this location.

Problem:

Sometimes the search is case sensitive and sometimes not. E.g. it finds
“stuttgart” and “Stuttgart”. It finds “München” but does NOT find
“münchen”. It only finds “Überlingen” and not “überlingen”.

My feeling is that for locations with “special characters” the search
is case sensitive…

My goal is not to be case sensitive.

Thanks for your help,

Starburger

BTW my locale settings in environment.rb are

ENV['LANG'] = '[email protected]'
ENV['LC_TIME'] = 'C'
require 'acts_as_ferret'

Reply is below the quote.

Star B. wrote:

BTW my locale settings in environment.rb are

ENV['LANG'] = '[email protected]'
ENV['LC_TIME'] = 'C'
require 'acts_as_ferret'

Ferret’s LowerCaseFilter (which converts tokens and queries to lower
case) uses the C function towlower() to convert multi-byte
characters (e.g. UTF-8 characters with accents) to lower case. Maybe
the Ferret code does not inherit the correct locale from environment.rb?
I’m not sure how to fix this, perhaps someone else does.
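
As a possible workaround rather than a real fix, the text could be lowercased in Ruby before it ever reaches Ferret, both when building the indexed string and when building the query. The sketch below assumes Ruby 2.4 or later, where String#downcase handles non-ASCII characters; normalize_for_search is a made-up helper name, not part of Ferret or acts_as_ferret, and I have not tested it against a real index.

```ruby
# Hypothetical helper (not part of Ferret or acts_as_ferret).
# Assumes Ruby 2.4+, where String#downcase applies Unicode case mapping
# to UTF-8 strings instead of touching only ASCII letters.
def normalize_for_search(text)
  text.to_s.downcase
end

# Example: the string that location_names would hand to the indexer.
names = "Stuttgart München Überlingen"
puts normalize_for_search(names)   # prints "stuttgart münchen überlingen"

# The same helper would be applied to the user's query before searching,
# so that index and query agree regardless of the process locale.
puts normalize_for_search("ÜBERLINGEN")   # prints "überlingen"
```

If this is used, the index has to be rebuilt so that the stored tokens are already lower case.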

-Stuart