Possible bug with stop words in a French UTF-8 locale

Something strange.
Minimal example:
I index a text of three accented words, such as “aprèl après aprèt”, and
then search for each of the three words. Two cases appear:

  • plain indexing: all three words give a hit,
  • indexing with FULL_FRENCH_STOP_WORDS: only one (“après”) gives a hit.
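
For comparison, here is a minimal sketch in plain Ruby of what I would expect a stop-word filter to do. It does not use Ferret at all: the tokenizer, the index_terms helper and the tiny stop list are hypothetical and only serve to illustrate the expectation that a word absent from the stop list stays searchable, accented or not.

```ruby
# Hypothetical stop list for illustration only; this is NOT the list
# from stopwords.c.
STOP_WORDS = %w[le la les et de].freeze

# Lowercase the text and split it on whitespace. Ruby strings are
# UTF-8 aware, so accented letters are kept intact.
def tokenize(text)
  text.downcase.split
end

# Keep every token except those that appear in the stop list.
def index_terms(text, stop_words = [])
  tokenize(text).reject { |term| stop_words.include?(term) }
end

text     = "aprèl après aprèt"
plain    = index_terms(text)              # no stop list
filtered = index_terms(text, STOP_WORDS)  # with the hypothetical stop list

%w[aprèl après aprèt].each do |word|
  puts "#{word}: plain=#{plain.include?(word)} filtered=#{filtered.include?(word)}"
end
```

In this sketch none of the three words is in the stop list, so all three remain in both term lists. That is the behaviour I expected from the stop-word analyzer for any word that is not in its list.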

I ran extensive checks and found no clear pattern in which accented
words work and which do not. For instance, “Hélène” does not work, while
“Jérôme” does…

By the way, the list of French stop words in stopwords.c is strange:
some of the entries do not exist in the French language (inflected
participles…).
