Feature question

Hi,

Can Ferret search for a combination of words and return the distance
between them in a text? If it exists, is there a way you can improve on
this by looking if they are separated by a certain character (like . for
different sentences)?

Thanks,
Radu

Hi!

On 19.05.2008, at 22:46, Radu S. wrote:

> Hi,
>
> Can Ferret search for a combination of words and return the distance
> between them in a text?

It won’t return the distance directly, but since Ferret stores term
positions it should be possible to determine the distance between
different terms manually. You can also issue phrase queries that only
return hits where the terms are separated by at most n other terms.
The QueryParser API docs or the Ferret book have examples of this.
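
To illustrate what "determining the distance manually" could look like, here is a minimal sketch in plain Ruby. It does not use Ferret's API at all: the helper names (term_positions, min_term_distance) are made up for this example, and the naive regex tokenizer only stands in for whatever analyzer the index really uses.

```ruby
# Map each term to the list of positions where it occurs.
# Assumption: lowercasing plus /\w+/ is a stand-in for real analysis
# (no stop words, no stemming).
def term_positions(text)
  positions = Hash.new { |hash, term| hash[term] = [] }
  text.downcase.scan(/\w+/).each_with_index do |term, index|
    positions[term] << index
  end
  positions
end

# Smallest difference in position between any occurrence of term_a and
# any occurrence of term_b; nil if either term does not occur.
def min_term_distance(text, term_a, term_b)
  positions = term_positions(text)
  a = positions.fetch(term_a.downcase, [])
  b = positions.fetch(term_b.downcase, [])
  return nil if a.empty? || b.empty?

  a.product(b).map { |i, j| (i - j).abs }.min
end

puts min_term_distance("The red and very quick fox", "red", "fox") # prints 4
```

A position difference of 4 means three other terms sit between the two words; adjacent words have a difference of 1.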

> If it exists, is there a way you can improve on this by looking if
> they are separated by a certain character (like . for different
> sentences)?

Usually you don’t index characters like ‘.’ at all (they are removed
during analysis, when the text is split up into tokens), but if you
changed that so sentence endings end up in the index as a kind of
special term, this might be possible, too.
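
As a rough sketch of the idea, again in plain Ruby and not tied to Ferret's analyzer classes: the tokenizer below keeps sentence-ending punctuation as a special marker token, which makes it possible to check whether two terms fall into the same sentence. The marker and both helper names are hypothetical, chosen only for this example.

```ruby
# Hypothetical marker for a sentence boundary. A symbol is used here so
# it cannot collide with a real word; in an index it would have to be a
# reserved term.
SENTENCE_END = :sentence_end

# Split text into lowercase word tokens, but keep runs of . ! ? as a
# single SENTENCE_END marker instead of dropping them.
def tokens_with_sentence_markers(text)
  text.downcase.scan(/\w+|[.!?]+/).map do |token|
    token.match?(/\A[.!?]+\z/) ? SENTENCE_END : token
  end
end

# True if term_a and term_b occur together in at least one sentence.
def same_sentence?(text, term_a, term_b)
  sentence = 0
  sentences_for = Hash.new { |hash, term| hash[term] = [] }
  tokens_with_sentence_markers(text).each do |token|
    if token == SENTENCE_END
      sentence += 1
    else
      sentences_for[token] << sentence
    end
  end
  a = sentences_for.fetch(term_a.downcase, [])
  b = sentences_for.fetch(term_b.downcase, [])
  !(a & b).empty?
end

puts same_sentence?("The red car stopped. A fox ran away.", "red", "fox") # prints false
puts same_sentence?("The red fox ran. It was quick.", "red", "fox")       # prints true
```

Note that this treats every period as a sentence end, so abbreviations like "e.g." would split a sentence.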

I don’t know your use case, but keep in mind that you can rank terms
that are closer together higher by chaining phrase queries with
different slop values and assigning them different boosts:

("red fox")^15 OR ("red fox"~4)^10 OR ("red fox"~10)^5 OR ("red fox"~100)

This will boost the exact match the most, and assign lower boosts to
matches where the terms are farther apart.
Maybe something like this will already be a ‘good enough’ solution to
your problem?
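
If you need such a query for arbitrary phrases, the string can be put together programmatically. The following is only a sketch in plain Ruby: tiered_phrase_query is a made-up helper, the slop and boost values are simply the ones from the query above, and the phrase is assumed not to contain double quotes.

```ruby
# Build a query string that ORs together phrase clauses with growing
# slop and shrinking boost. `tiers` is a list of [slop, boost] pairs;
# a slop of 0 means an exact phrase, a nil boost means no boost suffix.
def tiered_phrase_query(phrase, tiers)
  clauses = tiers.map do |slop, boost|
    clause = slop.zero? ? %("#{phrase}") : %("#{phrase}"~#{slop})
    boost ? "(#{clause})^#{boost}" : "(#{clause})"
  end
  clauses.join(" OR ")
end

puts tiered_phrase_query("red fox", [[0, 15], [4, 10], [10, 5], [100, nil]])
# prints ("red fox")^15 OR ("red fox"~4)^10 OR ("red fox"~10)^5 OR ("red fox"~100)
```

The resulting string would then be handed to the search as usual.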

cheers,
Jens


Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database