Ferret locks up when adding items to an index

I’m running Ferret 0.9.5 on a MacBook Pro (OS X 10.4.7) under
Locomotive 2.0.7.

I have a problem where Ferret is hanging when I try to add items to
the index. It doesn’t happen with every object that’s being indexed,
and I’m not sure what the objects in question have in common (they
are not all instances of the same ActiveRecord class). The process
always locks up on the same line (tokenizers.rb line 49).

I have other colleagues running the same version of Ferret on the
same platform (Locomotive under OS X) and they’re not seeing the same
behavior. Anyone have any ideas?

I have tried reinstalling Ferret by itself, and also reinstalling both
Ferret and Locomotive.

Here’s the stack trace:

from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/analysis/tokenizers.rb:49:in `next'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/analysis/token_filters.rb:21:in `next'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/analysis/token_filters.rb:52:in `next'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:122:in `invert_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:88:in `invert_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:58:in `add_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/index_writer.rb:158:in `add_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/index.rb:298:in `<<'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/index.rb:258:in `<<'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:251:in `rebuild_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:250:in `rebuild_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:249:in `rebuild_index'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/connection_adapters/abstract/database_statements.rb:51:in `transaction'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/activerecord-1.14.4/lib/active_record/transactions.rb:91:in `transaction'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:248:in `rebuild_index'
from ./script/../config/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:246:in `rebuild_index'

On 8/18/06, Rafe C. wrote:

Hi Rafe,

This may be due to a problem with the StandardAnalyzer regular
expression. Its matching time degrades exponentially on long tokens,
which can look like a hang. If you must use the pure Ruby version of
Ferret, try using a WhiteSpaceAnalyzer or a LetterAnalyzer. I’d
recommend using Ferret with the C extensions whenever possible, though.
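
Since the slowdown depends on long tokens, one way to narrow down which
records trigger it is to look for unusually long runs of non-whitespace
text before indexing. The following is only a rough sketch: the helper
names and the 100-character limit are assumptions made for illustration,
they are not part of Ferret, and splitting on whitespace only
approximates what the analyzer treats as a token.

```ruby
# Hypothetical helpers, not part of Ferret.

# Return the longest whitespace-delimited token in a text ("" if none).
def longest_token(text)
  text.to_s.split(/\s+/).max_by { |token| token.length } || ""
end

# Return [label, token] pairs whose longest token reaches the limit.
# The default limit of 100 characters is an arbitrary assumption.
def suspicious_records(records, limit = 100)
  records.map { |label, text| [label, longest_token(text)] }
         .select { |_label, token| token.length >= limit }
end

# Made-up sample data standing in for the text of indexed records.
records = {
  "short" => "a few ordinary words",
  "long"  => "prefix " + ("x" * 500) + " suffix"
}

suspicious_records(records).each do |label, token|
  puts "#{label}: longest token is #{token.length} characters"
end
```

Any records this flags would be the first ones to try indexing on their
own to see whether they reproduce the hang.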

Cheers,
Dave

Hi Rafe,

The C extensions should be loaded automatically. If there is an error,
Ferret 0.9.5 and earlier versions will fall back and use the pure Ruby
version. There won’t be any error message printed to the screen, since
usually this will just be a result of the extension not compiling
correctly.

If you aren’t using acts_as_ferret you could try updating to the
0.10.0 gem. This will probably require a few changes as the API is not
backwards compatible. Otherwise I’d investigate why the extension
isn’t getting loaded. Check that you definitely have a ferret_ext.so
(or whatever the equivalent is on the Mac, I think ferret_ext.bundle)
in your ferret gem directory in the ext folder. On my system (Ubuntu)
it is /usr/lib/ruby/gems/1.8/gems/ferret-0.9.5/ext/ferret_ext.so. If
this file exists, try:

require "ferret_ext"
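
Both checks can be combined into a small script. This is only a sketch:
the helper names are made up for illustration and are not Ferret API,
and the default gem directory is simply the Ubuntu path mentioned above,
so pass your own gem directory as the first argument.

```ruby
# Hypothetical helpers for illustration; none of this is Ferret API.

# List compiled extension files under a gem's ext directory.
# The file names are the ones mentioned above (.so and .bundle).
def extension_files(gem_dir)
  Dir.glob(File.join(gem_dir, "ext", "ferret_ext.{so,bundle}")).sort
end

# Try to require a library and report the outcome instead of raising.
def try_require(name)
  require name
  [true, nil]
rescue LoadError => e
  [false, e.message]
end

# Assumed default; replace with the ferret gem directory on your system.
gem_dir = ARGV[0] || "/usr/lib/ruby/gems/1.8/gems/ferret-0.9.5"

files = extension_files(gem_dir)
if files.empty?
  puts "no compiled extension found in #{File.join(gem_dir, 'ext')}"
else
  puts files.join("\n")
end

loaded, error = try_require("ferret_ext")
puts(loaded ? "ferret_ext loaded" : "ferret_ext failed to load: #{error}")
```

If the file is listed but the require fails, the LoadError message is
the thing to post back here.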

Let us know how that goes.

Cheers,
Dave

Thanks for the response. How do I get it to use the C extensions?
They were compiled when I installed the gem, so I assumed it was
using them.

–Rafe