Custom analyzer weirdness with 0.11.3

dburkes · May 4, 2007, 12:34am

Hi-

I was previously using 0.11.4, and I wrote my own analyzer. Everything
worked fine.

When I took the system to production, 0.11.4 started failing to update
the index, complaining that files were missing. The failure always
happened on the same model document and was completely reproducible.
This failure looked a lot like the one described at
Constant 0.11.4 Errors - Ferret - Ruby-Forum.

I reverted to 0.11.3, and all my model documents index fine (over 3M
documents). However, as I later found out, my custom analyzer was
returning bogus data, so the index as currently built is useless.

What I observe is that, if I specify a custom analyzer using the
:analyzer option to acts_as_ferret, the calls to my custom analyzer are
fine with Ferret 0.11.4. However, after I reverted to 0.11.3, calls to
my analyzer’s token_stream method always receive a blank string.
That is, the “input” parameter to
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html#M000324
is always a blank string. The field_name parameter is correct for both
0.11.4 and 0.11.3.
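
For anyone who wants to check for the same symptom, here is a minimal sketch of a wrapper that records what token_stream receives. It assumes only that an analyzer is an object responding to token_stream(field_name, input), as in the API docs linked above. LoggingAnalyzer and WhitespaceSplitter are hypothetical names made up for this illustration; they are not part of Ferret or acts_as_ferret, and the stand-in analyzer exists only so the sketch runs without Ferret installed.

```ruby
# Hypothetical diagnostic wrapper (not part of Ferret or acts_as_ferret).
# It records every token_stream call and flags blank input, then
# delegates to the wrapped analyzer unchanged.
class LoggingAnalyzer
  attr_reader :calls

  # inner: any object that responds to token_stream(field_name, input)
  def initialize(inner)
    @inner = inner
    @calls = []
  end

  def token_stream(field_name, input)
    @calls << [field_name, input]
    if input.to_s.strip.empty?
      warn "token_stream(#{field_name.inspect}) received blank input"
    end
    @inner.token_stream(field_name, input)
  end

  # All recorded calls whose input was nil, empty, or whitespace only.
  def blank_calls
    @calls.select { |_field, input| input.to_s.strip.empty? }
  end
end

# Stand-in for a real analyzer (an assumption for this sketch):
# it simply splits the input on whitespace.
class WhitespaceSplitter
  def token_stream(_field_name, input)
    input.to_s.split
  end
end

analyzer = LoggingAnalyzer.new(WhitespaceSplitter.new)
analyzer.token_stream(:title, "custom analyzer weirdness")
analyzer.token_stream(:title, "")
```

In a real setup the wrapped object would be the custom analyzer, and the wrapper would be what gets passed via the :analyzer option; if blank_calls fills up under one Ferret version but not the other, that would match the behaviour described above.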

So, now, I’m in a bad situation. My custom analyzer works with 0.11.4,
but 0.11.4 fails to index my corpus. 0.11.3 will index my entire
corpus, but my custom analyzer fails, apparently due to some calling
convention differences between 0.11.3 and 0.11.4.

Does this ring a bell with anyone? I’m stuck and I would appreciate any
help I can get.

Best Regards,

Danny

dburkes · May 4, 2007, 10:19am

On Fri, May 04, 2007 at 12:34:38AM +0200, Danny B. wrote:

> Hi-
>
> I was previously using 0.11.4, and I wrote my own analyzer. Everything
> worked fine.
>
> When I took the system to production, 0.11.4 started failing to update
> the index, complaining that files were missing. The failure always
> happened on the same model document and was completely reproducible.
> This failure looked a lot like the one described at
> Constant 0.11.4 Errors - Ferret - Ruby-Forum.

Too bad you still have this problem. Did you try running Ferret’s unit
tests on that Mac?

> is always a blank string. The field_name parameter is correct for both
> 0.11.4 and 0.11.3.

There was a conversation about this issue here right before 0.11.4 was
released, where Dave explains what is happening:
http://www.ruby-forum.com/topic/103004#231032

I’m not sure but maybe with the help of that posting you could change
your analyzer to work with 0.11.3…

jens

–
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

dburkes · May 4, 2007, 3:27pm

>> When I took the system to production, 0.11.4 started failing to update
>> the index, complaining that files were missing. The failure always
>> happened on the same model document and was completely reproducible.
>> This failure looked a lot like the one described at
>> Constant 0.11.4 Errors - Ferret - Ruby-Forum.
>
> Too bad you still have this problem. Did you try running Ferret’s unit
> tests on that Mac?

I didn’t, but what I am describing here is a different problem than the
one I previously described on OS X
(Mongrel segfaults - Ferret - Ruby-Forum). This new bug occurs in our
production environment, running on Ubuntu 6.10.

>> is always a blank string. The field_name parameter is correct for both
>> 0.11.4 and 0.11.3.
>
> There was a conversation about this issue here right before 0.11.4 was
> released, where Dave explains what is happening:
> Trouble with PerFieldAnalyzer - Ferret - Ruby-Forum
>
> I’m not sure but maybe with the help of that posting you could change
> your analyzer to work with 0.11.3…

Thanks, I’ve read that thread and I think I understand what I need to do
to get my custom analyzer working with 0.11.3. I’ll go that route for
now.

Thanks for your help!

Best Regards,

Danny