Strange thing. Minimal example: index a text of three accented words such as "aprèl après aprèt", then search for each of the three words in turn. Two cases appear:
- plain indexing: all three words give a hit,
- indexing with FULL_FRENCH_STOP_WORDS: only one ("après") gives a hit.

I ran extensive checks, and no clear pattern emerges for which accented words work and which do not: e.g. "Hélène" does not work, "Jérôme" works...

By the way, the list of French stop words in stopwords.c is strange, as some of them do not exist in the French language (inflected participles...).
on 2009-03-16 14:09