Hello,
I want to clean up accented characters in my index, using acts_as_ferret
in a Rails project. I searched this forum, and found the best solution
is to use a custom analyzer.
I created something like this:
class PortugueseAnalyzer
  include Ferret::Analysis

  MAPPING = {
    ['à','á','â','ã','ä','å','ā','ă']         => 'a',
    'æ'                                       => 'ae',
    ['ď','đ']                                 => 'd',
    ['ç','ć','č','ĉ','ċ']                     => 'c',
    ['è','é','ê','ë','ē','ę','ě','ĕ','ė',]    => 'e',
    ['ƒ']                                     => 'f',
    ['ĝ','ğ','ġ','ģ']                         => 'g',
    ['ĥ','ħ']                                 => 'h',
    ['ì','ì','í','î','ï','ī','ĩ','ĭ']         => 'i',
    ['į','ı','ĳ','ĵ']                         => 'j',
    ['ķ','ĸ']                                 => 'k',
    ['ł','ľ','ĺ','ļ','ŀ']                     => 'l',
    ['ñ','ń','ň','ņ','ŉ','ŋ']                 => 'n',
    ['ò','ó','ô','õ','ö','ø','ō','ő','ŏ','ŏ'] => 'o',
    ['œ']                                     => 'oek',
    ['ą']                                     => 'q',
    ['ŕ','ř','ŗ']                             => 'r',
    ['ś','š','ş','ŝ','ș']                     => 's',
    ['ť','ţ','ŧ','ț']                         => 't',
    ['ù','ú','û','ü','ū','ů','ű','ŭ','ũ','ų'] => 'u',
    ['ŵ']                                     => 'w',
    ['ý','ÿ','ŷ']                             => 'y',
    ['ž','ż','ź']                             => 'z'
  }

  def token_stream(field, string)
    return MappingFilter.new(StandardTokenizer.new(string), MAPPING)
  end
end
I inserted this code at the end of environment.rb.
In my model:
acts_as_ferret({ :fields => [ 'name' ] }, :analyzer => PortugueseAnalyzer.new)
But this did not work…
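To make clear what result I expect from the mapping, here is a rough plain-Ruby sketch with no Ferret involved. The fold_accents helper and the small sample mapping are made up for illustration only (they are not part of Ferret or acts_as_ferret), and the sketch assumes Ruby 1.9 or later with UTF-8 source files:

```ruby
# Illustration only: fold accented characters using a mapping whose keys
# are either a single character or an array of characters.
# fold_accents is a hypothetical helper, not a Ferret API.
def fold_accents(text, mapping)
  # Flatten { ['à','á'] => 'a' } into { 'à' => 'a', 'á' => 'a' }.
  table = {}
  mapping.each do |from, to|
    Array(from).each { |ch| table[ch] = to }
  end
  # Replace each character that has an entry; leave the rest untouched.
  text.each_char.map { |ch| table.fetch(ch, ch) }.join
end

sample = {
  ['à', 'á', 'â', 'ã'] => 'a',
  'ç'                  => 'c',
  ['é', 'ê']           => 'e'
}

puts fold_accents('coração', sample)   # prints "coracao"
```

So a search for "coracao" should match a record whose name is "coração".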
Can someone tell me what I did wrong?
Thanks
Marcello