Hi-
I was previously using Ferret 0.11.4 with my own custom analyzer, and
everything worked fine.
When I took the system to production, 0.11.4 started failing to update
the index, complaining that files were missing. The failure always
happened on the same model document, and was completely reproducible.
This failure looked a lot like the one described at
http://www.ruby-forum.com/topic/104145.
I reverted to 0.11.3, and all my model documents index fine (over 3M
documents). However, as I later found out, my custom analyzer was
returning bogus data, so the index as currently built is useless.
What I observe is that, if I specify a custom analyzer using the
:analyzer option to acts_as_ferret, the calls to my custom analyzer are
fine when using Ferret 0.11.4. However, after I reverted to 0.11.3,
calls to my analyzer’s token_stream method always receive a blank string.
That is, the “input” parameter to
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html#M000324
is always a blank string. The field_name parameter is correct for both
0.11.4 and 0.11.3.
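To make the difference easier to reproduce, here is a rough sketch of a wrapper that records what token_stream receives on each call. LoggingAnalyzer and StubAnalyzer are hypothetical names made up for illustration; neither is part of Ferret or acts_as_ferret. The stub stands in for a real analyzer so the sketch runs without Ferret installed. Whether Ferret accepts a plain Ruby object like this through the :analyzer option, rather than a subclass of Ferret::Analysis::Analyzer, is an assumption I have not verified.

```ruby
# Hypothetical diagnostic wrapper; not part of Ferret or acts_as_ferret.
# It forwards to any object that responds to token_stream(field_name, input)
# and records whether the input looked blank.
class LoggingAnalyzer
  attr_reader :calls

  def initialize(inner)
    @inner = inner
    @calls = []
  end

  # Same two-argument shape as the token_stream method discussed above.
  def token_stream(field_name, input)
    blank = input.to_s.strip.empty?
    @calls << { field: field_name, input_class: input.class, blank: blank }
    warn "token_stream(#{field_name.inspect}): blank input (#{input.class})" if blank
    @inner.token_stream(field_name, input)
  end

  # Calls where the input was empty or whitespace only.
  def blank_calls
    @calls.select { |c| c[:blank] }
  end
end

# Stand-in analyzer (an assumption) so the sketch runs without Ferret.
class StubAnalyzer
  def token_stream(field_name, input)
    input.to_s.split
  end
end

wrapped = LoggingAnalyzer.new(StubAnalyzer.new)
wrapped.token_stream(:title, "hello world")
wrapped.token_stream(:title, "")
puts "blank calls: #{wrapped.blank_calls.length}"
```

Wrapping the real analyzer the same way under both Ferret versions and comparing the recorded input_class values should show whether 0.11.3 passes a different kind of object or really passes an empty string.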
So, now, I’m in a bad situation. My custom analyzer works with 0.11.4,
but 0.11.4 fails to index my corpus. 0.11.3 will index my entire
corpus, but my custom analyzer fails, apparently due to some calling
convention differences between 0.11.3 and 0.11.4.
Does this ring a bell with anyone? I’m stuck and I would appreciate any
help I can get.
Best Regards,
Danny