On Wed, Aug 23, 2006 at 03:30:46AM +0900, David B. wrote:
On 8/22/06, Benjamin K. wrote:
Let's suppose I index a User on the phrase “Ruby on Rails.” If I then
search using User.find_by_contents("Ruby on Rails") I get no results,
since “on” is a common term and does not get indexed. Of course,
User.find_by_contents("Ruby Rails") works just fine.
[…]
This shouldn’t be necessary. What Jens said is correct. If you use the
same analyzer in your indexer as you use in your query parser then a
search for “Ruby on Rails” should work. If you use the Index::Index
class this will be handled for you.
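To illustrate the point about matching analyzers, here is a minimal sketch in plain Ruby. It is a toy, not Ferret's API: `ToyIndex`, `analyze`, the `drop_stop_words` flag and the stop-word list are all hypothetical names made up for this example. It only assumes that every query term is required (as with Occur::MUST) and that stop words are dropped at index time.

```ruby
# Hypothetical toy index, NOT Ferret's API: shows why index-time and
# query-time analysis have to agree when every query term is required.
require 'set'

# Assumed stop-word list, for illustration only.
STOP_WORDS = Set.new(%w[a an and on or the]).freeze

# Lowercase the text, split it into words, optionally drop stop words.
def analyze(text, drop_stop_words: true)
  tokens = text.downcase.scan(/[a-z]+/)
  drop_stop_words ? tokens.reject { |t| STOP_WORDS.include?(t) } : tokens
end

class ToyIndex
  def initialize
    @docs = [] # one Set of indexed terms per document
  end

  # Index a document; stop words are always dropped at index time.
  def <<(text)
    @docs << Set.new(analyze(text))
    self
  end

  # Return the positions of documents that contain every query term.
  def search(query, drop_stop_words: true)
    terms = analyze(query, drop_stop_words: drop_stop_words)
    @docs.each_index.select { |i| terms.all? { |t| @docs[i].include?(t) } }
  end
end

index = ToyIndex.new
index << 'Ruby on Rails'

# Same analysis on both sides: "on" is dropped from the query as well.
same_analyzer = index.search('Ruby on Rails')

# Query analyzed without the stop filter: "on" becomes a required term
# that was never indexed, so nothing can match.
mismatched = index.search('Ruby on Rails', drop_stop_words: false)
```

With the same analysis on both sides the document is found. When the query keeps “on” as a required term, the search comes back empty, which is the symptom described above.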
As this problem seems to have come up fairly often recently, I did some
tests, and I think I found a common pattern that leads to incorrect query
analysis when using the Index::Index class:
def test_stopwords
  i = Ferret::Index::Index.new(
    :occur_default => Ferret::Search::BooleanClause::Occur::MUST,
    :default_search_field => '*')
  d = Ferret::Document::Document.new
  # adding this additional field to the document leads to the failure below;
  # comment out this statement and all tests pass:
  d << Ferret::Document::Field.new('id', '1',
         Ferret::Document::Field::Store::YES,
         Ferret::Document::Field::Index::UNTOKENIZED)
  d << Ferret::Document::Field.new('content', 'Move or shake',
         Ferret::Document::Field::Store::NO,
         Ferret::Document::Field::Index::TOKENIZED,
         Ferret::Document::Field::TermVector::NO,
         false, 1.0)
  i << d
  hits = i.search 'move nothere shake'
  assert_equal 0, hits.size
  hits = i.search 'move shake'
  assert_equal 1, hits.size
  hits = i.search 'move or shake'
  assert_equal 1, hits.size # fails when id field is present
end
The id field is constructed just as we do it in aaf (acts_as_ferret). I
tried some variations of how the field is constructed (another name, other
flags), but as soon as the document has more than one field, the test no
longer passes.
Setting the default_search_field to 'content' makes the tests pass, btw.
Dave, any suggestions?
Jens
--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66