For the last couple of days I've been trying to index some txt files.
Once they are indexed, I usually check the contents of the Ferret index
with Luke. But every time I tried to open the index I got a ‘read past
EOF’ error. I traced it to the way Ferret handles non-ASCII characters. I
have one txt file with the following content ‘a o b c’ and one with ‘é è
ç à’. If I index the first one I can read the index perfectly, however
when I index the second one I get the EOF error. The error occurs with
both the standard and whitespace analyzers. The stop analyzer just ignores
these characters. How can I solve this, so that Ferret handles these
‘special’ characters correctly? Thanks.