or maybe not a bug :S
so back to zero 
require 'rubygems'
require 'ferret'
Ferret.locale = '' # "de_DE.iso88591"
i = Ferret::I.new
i << 'Übersicht'
i << 'übersicht'
for q in [ 'Übersicht', 'übersicht', 'Über*', 'über*', '*bersicht' ]
  puts "#{q} : #{i.search(q).total_hits} hit(s)"
end
with an empty locale in the test script it’ll work in the new version as
well.
but in my rails app the aaf generated index will have broken umlauts
with an empty Ferret.locale.
e.g. the word “Übersicht” in the index shows this behavior when queried:
“Übersicht” = hit
“übersicht” = hit
“Übers*” = no hit
“übers*” = no hit
“bersicht” = hit (?!?!)
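one guess at why “bersicht” hits: if the tokenizer only treats ASCII letters as word characters when no locale is set, the single ISO-8859-1 byte for Ü would act as a separator and the indexed term would be “bersicht”. an analyzed query would be cut the same way, while a wildcard query would not. this is only a sketch of that assumption in plain ruby, not ferret's actual tokenizer:

```ruby
# Assumption: a letter tokenizer that only knows ASCII letters,
# working on the ISO-8859-1 bytes that the title method produces.
latin1 = "Übersicht".encode("ISO-8859-1")

# .b gives a binary copy, so the regexp sees raw bytes;
# the byte for Ü (0xDC) is not in A-Z/a-z and ends up as a separator.
tokens = latin1.b.scan(/[A-Za-z]+/)

p tokens
```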
with the locale set to “de_DE.iso88591” the umlauts seem correct, but
matching on them is case sensitive.
Query
“Übersicht” = hit
“übersicht” = no hit
“Übers*” = hit
“ÜBERSICHT” = hit
“üBERSICHT” = no hit
“ÜBERsi*” = hit
i simplified my model a bit to speed up the 200 index rebuilds i’ve done
over the last few days:
acts_as_ferret( { :fields => [ :title ], :remote => true },
                { :analyzer => GermanStemmingAnalyzer.new } )
def title
  Iconv.new('ISO-8859-1', 'UTF-8').iconv(self.xstrtitle.to_s)
end
here are a couple of terms from the index:
["massnahm",2],
["medi",1],
["medikament",1],
["patientenwert",1],
["patientinn",1],
["prufprotokoll",1],
["regionalanasthesi",2],
["reisekostenabrechn",1],
["reparaturanzeig",2],
["schwachelt",1],
["sonderw",1],
["ssnahmenkurz",1],
["stundenabrechn",2],
["sturzereignisprotokoll",1],
["urlaubsubertrag",1],
["verwalt",1],
["zuschlagsformular",1],
["zytostatica",2],
["Äquivalenzdos",1],
["Übergabeprotokoll",1],
["Übersicht",1],
["Überstundendokumentation",1]]
the lowercase umlauts seem to be processed properly by the lowercase
filter in the stemming analyzer; only the four terms at the end, which
start with uppercase umlauts, are left unprocessed.
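the pattern in the term list is what an ASCII-only lowercase step would give: A-Z gets folded, Ä and Ü stay as they are. a small plain-ruby sketch of that difference (needs ruby 2.4 or later; it does not use ferret's filter, and that the filter behaves this way is my assumption):

```ruby
# Hypothetical sample terms, written in upper case on purpose.
terms = ["MEDIKAMENT", "ÜBERSICHT", "ÄQUIVALENZDOSIS"]

# ASCII-only lowercasing: only A-Z are folded, umlauts are left alone.
ascii_lowered = terms.map { |t| t.downcase(:ascii) }

# Full Unicode lowercasing: umlauts are folded too.
full_lowered = terms.map { |t| t.downcase }

terms.each_index do |n|
  puts "#{terms[n]} : ascii=#{ascii_lowered[n]} full=#{full_lowered[n]}"
end
```

with the ASCII-only variant the terms keep their leading Ü and Ä, which would also explain why “übersicht” finds nothing when the locale is set.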
any idea? i can’t think of anything else i could try (except solr) 