How to deal with accented chars in Ferret 0.10.8?

I’m starting to use Ferret and acts_as_ferret.

I need to use something like EuropeanAnalyzer
(http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars).

For example, if the user searches for “gonzalez”, the search should find
documents that contain the term “gonzález”.
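
For illustration, the desired matching can be sketched in plain Ruby without Ferret. This assumes Ruby 2.4 or later (for `String#unicode_normalize` and a Unicode-aware `downcase`); `fold_accents` is a hypothetical helper name, not a Ferret or acts_as_ferret API:

```ruby
# Hypothetical helper, not part of Ferret: strips diacritics so that
# accented and unaccented spellings compare equal.
def fold_accents(text)
  # NFD splits "á" into "a" plus a combining accent (category Mn),
  # which the gsub then removes.
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').downcase
end

query    = fold_accents('gonzalez')
document = fold_accents('González')
puts query == document
```

If both the indexed text and the query go through the same folding, the two spellings end up as the same term.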

The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter, but it
seems that this class is not available in 0.10.x.

What is the way to do this?

On 10/20/06, Edgar wrote:

What is the way to do this?

Try this. Make sure you run Ruby with the -KU flag.

require 'rubygems'
require 'ferret'
require 'jcode'

ACCENTUATED_CHARS = 'ÅÄÀAÂåäàâaÖÔôöÉÈÊËéèêëÜüùç'
REPLACEMENT_CHARS = 'aaaaaaaaaaooooeeeeeeeeuuuc'

module Ferret::Analysis
  class TokenFilter < TokenStream
    # Construct a token stream filtering the given input.
    def initialize(input)
      @input = input
    end
  end

  # Replace accentuated chars with their ASCII equivalents.
  class ToASCIIFilter < TokenFilter
    def next()
      token = @input.next()
      unless token.nil?
        token.text = token.text.downcase.tr(ACCENTUATED_CHARS, REPLACEMENT_CHARS)
      end
      token
    end
  end

  class EuropeanAnalyzer
    def token_stream(field, string)
      return ToASCIIFilter.new(StandardTokenizer.new(string))
    end
  end
end

analyzer = Ferret::Analysis::EuropeanAnalyzer.new
ts = analyzer.token_stream('xxx', "Let's see what " +
  "happens to ÅÄÀAÂåäàâaÖÔôöÉÈÊËéèêëÜüùç")

while t = ts.next
  puts t
end
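
The filter above wraps one token stream in another. The same pattern can be sketched without Ferret so that it runs on a current Ruby, where `String#tr` works on characters for UTF-8 strings and `jcode` no longer exists (it was removed in 1.9). `SimpleTokenStream`, `AsciiFoldingFilter` and `fold_tokens` are hypothetical names for illustration, not Ferret classes; the character tables are copied from the post:

```ruby
# Same character tables as in the post; position N in the first string
# is replaced by position N in the second.
ACCENTED_TABLE    = 'ÅÄÀAÂåäàâaÖÔôöÉÈÊËéèêëÜüùç'
REPLACEMENT_TABLE = 'aaaaaaaaaaooooeeeeeeeeuuuc'

# Hypothetical stand-in for a Ferret tokenizer: hands out
# whitespace-separated words one at a time, then nil when exhausted.
class SimpleTokenStream
  def initialize(string)
    @words = string.split
  end

  def next
    @words.shift
  end
end

# Wraps any object that responds to #next and folds each token.
class AsciiFoldingFilter
  def initialize(input)
    @input = input
  end

  def next
    word = @input.next
    word && word.downcase.tr(ACCENTED_TABLE, REPLACEMENT_TABLE)
  end
end

# Drains the filtered stream into an array of folded tokens.
def fold_tokens(string)
  stream = AsciiFoldingFilter.new(SimpleTokenStream.new(string))
  tokens = []
  while (token = stream.next)
    tokens << token
  end
  tokens
end

p fold_tokens('ÉCOLE Über façade')
```

Characters missing from the table pass through unchanged. As printed here the table has no entry for “á”, so “gonzález” would need that character added at the same position in both strings.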

David,

Thanks for the tip, but I’ll try your latest release (0.10.13) :)
