Searches for ‘you’ are not returning anything for me. It appears that
‘you’ is treated as a stop word by the standard analyzer:
require 'rubygems'
require 'ferret'
index = Ferret::I.new(:or_default => false)
index << 'you'
puts index.search('you')
This returns no hits.
I assumed from the docs that StandardAnalyzer was using stop words
as defined by:
Ferret::Analysis::ENGLISH_STOP_WORDS
but when I print that to the console I get:
["a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
"into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t", "that",
"the", "their", "then", "there", "these", "they", "this", "to", "was",
"will", "with"]
I don’t see ‘you’ in there.
Supplying my own stop words seems to fix the problem:
STOP_WORDS = ["a", "the", "and", "or"]
index = Ferret::I.new(:or_default => false,
  :analyzer => Ferret::Analysis::StandardAnalyzer.new(STOP_WORDS))
index << 'you'
puts index.search('you')
This returns a hit.
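To illustrate why the stop list alone would explain both results, here is a minimal plain-Ruby sketch. It does not use Ferret at all: the `analyze` and `hit?` helpers and both word lists are hypothetical, and the first list contains 'you' only as an assumption about what the default analyzer might be doing.

```ruby
# Hypothetical stand-in for an analyzer: lowercase the text, split it into
# words, and drop anything in the stop list. Not Ferret's implementation.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z]+/).reject { |term| stop_words.include?(term) }
end

# A document matches only if the query still has terms after analysis
# and every one of those terms was also indexed.
def hit?(document, query, stop_words)
  indexed = analyze(document, stop_words)
  wanted  = analyze(query, stop_words)
  !wanted.empty? && wanted.all? { |term| indexed.include?(term) }
end

# Assumed lists for illustration only.
LIST_WITH_YOU = %w[a the and or you]
CUSTOM_LIST   = %w[a the and or]

puts hit?('you', 'you', LIST_WITH_YOU) # false: 'you' is stripped on both sides
puts hit?('you', 'you', CUSTOM_LIST)   # true: 'you' survives analysis
```

If the default analyzer really does filter 'you', the term never reaches the index and is also removed from the query, so there is nothing left to match.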
I am running the latest Windows build, but I’ve seen the same behavior
on Linux with the latest builds. I am happy with my solution, but it
seems odd that ‘you’ should be a standard stop word.