Trouble with custom Analyzer

Hi!

I wanted to build my own custom Analyzer like so:

class Analyzer < Ferret::Analysis::Analyzer

 include Ferret::Analysis

 def initialize(stop_words = ENGLISH_STOP_WORDS)
   @stop_words = stop_words
 end

 def token_stream(field, string)
   StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
 end

end

As one can easily spot, I essentially want a LetterAnalyzer with stop
word filtering. However, using that analyzer (for indexing) results
in a segmentation fault.

/opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:281: [BUG] Segmentation fault
ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]

This is admittedly a rather naive implementation which is
extrapolated from those I found in the docs. So what am I missing here?

Cheers,
Andy

On 10/23/06, Andreas K. wrote:

Hi Andy,

This works for me so I’ll need a little more info to solve the
problem. First, try running this:

require 'rubygems'
require 'ferret'

class Analyzer < Ferret::Analysis::Analyzer

  include Ferret::Analysis

  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end

  def token_stream(field, string)
    StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
  end

end

i = Ferret::I.new(:analyzer => Analyzer.new)

i << "A sentence to analyze"

puts i.search("analyze")

If that works, try and track down where in your code ferret is
seg-faulting.

Cheers,
Dave

On 23.10.2006, at 06:32, David B. wrote:

Dave,

thanks for the hint. I was using the add_document method instead of
<< to add documents to the index. Changing the above code to

i = Ferret::I.new()

i.add_document("A sentence to analyze", Analyzer.new)

still works fine.

However, changing my original code to use the << method (and
specifying the Analyzer with Index.new) solves the problem. I didn’t
manage to distill a concise test case from my code to reproduce the
segfault. And hey, why bother, it works just fine now :)

Thanks again,
Andy
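
For anyone reading along later: the token_stream method discussed above chains a lowercasing letter tokenizer into a stop-word filter. The following is only a rough plain-Ruby sketch of that idea, not Ferret's implementation and not a fix for the segfault. It assumes ASCII letters only, and SAMPLE_STOP_WORDS, letter_tokens, stop_filter and analyze are hypothetical names made up for illustration (SAMPLE_STOP_WORDS is a stand-in for Ferret's ENGLISH_STOP_WORDS, not a copy of it).

```ruby
# Plain-Ruby sketch of "letter tokenizer + stop filter"; does not use Ferret.
# SAMPLE_STOP_WORDS is a hypothetical stand-in for Ferret's ENGLISH_STOP_WORDS.
SAMPLE_STOP_WORDS = %w[a an and the to of in is].freeze

# Split the input into runs of ASCII letters, optionally lowercasing each run.
def letter_tokens(string, lowercase = true)
  tokens = string.scan(/[A-Za-z]+/)
  lowercase ? tokens.map(&:downcase) : tokens
end

# Drop every token that appears in the stop-word list.
def stop_filter(tokens, stop_words = SAMPLE_STOP_WORDS)
  tokens.reject { |token| stop_words.include?(token) }
end

# Chain the two steps in the same order as the token_stream method above.
def analyze(string, stop_words = SAMPLE_STOP_WORDS)
  stop_filter(letter_tokens(string, true), stop_words)
end

p analyze("A sentence to analyze")  # prints ["sentence", "analyze"]
```

With this sample list, "A" and "to" are dropped, so a search for "analyze" can still match the sentence while the stop words never reach the index.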