Score for wildcard searches

Hello All,
I have a Rails app that maintains an index of movie data and uses
“acts_as_ferret” for search. I ran into an issue with the scoring of
wildcard searches. When I search for the word “super*”, the record
containing the word “superman” is ranked above the one containing just
“super”.

Is this normal or am I missing something? Any ideas on how scoring can
be controlled so that the shorter word is ranked higher? Thanks.

On Sun, Nov 19, 2006 at 10:50:44PM +0100, Sreechand Boppudi wrote:

> [...] When I search for the word “super*”, the record containing the
> word “superman” is ranked above the one having just “super”. [...]

There’s a method named ‘explain’ in Ferret::Index::Index which shows
how the score of a result for a given query is calculated.

This might help you find out why your scores are the way they are, but
it requires a deep understanding of how the index works (I myself only
understand parts of it ;-))

I think Dave once explained the output of this method in a post some
time ago. Reading Lucene in Action will definitely help in understanding
what happens, too ;-)

As a quick uneducated guess - the document with ‘superman’ might score
better because its overall amount of text is smaller than the text of
the document containing the word ‘super’.

In general, a hit in a field with fewer words is considered more
relevant than a hit in a field with many words. But that’s only one
part of the equation, so I might well be wrong here…
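
To make the length effect concrete, here is a minimal plain-Ruby sketch
of the length normalization factor used by Lucene's classic similarity,
1 / sqrt(number of terms in the field). That Ferret's default similarity
uses the same formula is an assumption on my part, and the titles below
are made up for illustration. It does not call Ferret at all and leaves
out the other factors (tf, idf, boosts) that go into the final score.

```ruby
# Length normalization as in Lucene's classic similarity:
# norm = 1 / sqrt(number of terms in the field).
# Assumption: Ferret's default similarity behaves the same way.
def length_norm(num_terms)
  1.0 / Math.sqrt(num_terms)
end

# Hypothetical field contents, for illustration only.
docs = {
  "short title" => "superman",
  "long title"  => "super troopers and the very long subtitle"
}

# Count whitespace-separated words as a stand-in for analyzed terms.
norms = docs.transform_values { |text| length_norm(text.split.size) }

norms.each { |name, norm| puts format("%-12s %.3f", name, norm) }
```

With everything else equal, the one-word field gets the larger factor,
so a match there would be ranked above a match in the seven-word field.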

Jens Krämer