Search Ferret Index for Use With Autocomplete / Options List

I’ve got Ferret up and running for a Rails application and I’d like to
be able to use autocomplete in my text search field so that as the user
is typing each character of the search term, the index is queried for
matching terms starting with those characters, and they are displayed in
a list under the search box, like Google Suggest.

I have searched the Ferret API but I can’t find a way to do, for example, this:

Show all words in index that start with ‘s’, then ‘sp’, then ‘spa’, etc.

Thanks for any assistance.

Is that what Google does, though? I thought the autocomplete was being
filled with previous matching searches. If that is the case, then you
might have to save all search queries to a db. Have you seen the jQuery
autocomplete plugin? It’s pretty good:

http://www.pengoworks.com/workshop/jquery/autocomplete.htm

Is that what google does, though?
Yes it is.

This problem is related to the one I asked about the other day
regarding Levenshtein distance.

Is this stuff exposed in Ferret?

-Rob

On 16.06.2008, at 13:08, Max W. wrote:

I have searched the Ferret API but I can’t find a way to do for
example:

Show all words in index that start with ‘s’, then ‘sp’, then ‘spa’,
etc.

This might be accomplished by using a TermEnum
(http://ferret.davebalmain.com/api/classes/Ferret/Index/TermEnum.html),
which is basically a list of all terms present in the index for a
given field. Calling term_enum.skip_to(‘s’) should position the enum on
the first term starting with the letter ‘s’; you can then get the other
terms starting with ‘s’ by calling term_enum.next? as often as
necessary, stopping once a term no longer starts with the prefix.

Never tried this, but it should work.
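
A minimal sketch of the TermEnum walk described above, not tested against a real index. It assumes the enum comes from something like index.reader.terms(:title) and responds to skip_to, term and next? as listed in the Ferret API docs. The field name :title, the helper prefix_terms and the in-memory SortedTermEnum stand-in (there only so the snippet runs without Ferret) are hypothetical.

```ruby
# Collect up to `limit` terms starting with `prefix` from a term enumerator.
# `term_enum` is duck-typed: anything with skip_to/term/next? will do.
def prefix_terms(term_enum, prefix, limit = 10)
  results = []
  # Assumption: skip_to returns the first term >= prefix, or nil if none.
  term = term_enum.skip_to(prefix)
  while term && term.start_with?(prefix) && results.size < limit
    results << term
    # Assumption: next? advances the enum and reports whether a term exists.
    term = term_enum.next? ? term_enum.term : nil
  end
  results
end

# Hypothetical stand-in for a Ferret TermEnum, backed by a sorted array.
class SortedTermEnum
  def initialize(terms)
    @terms = terms.uniq.sort
    @pos = -1
  end

  def term
    @pos >= 0 && @pos < @terms.size ? @terms[@pos] : nil
  end

  def skip_to(target)
    @pos = @terms.index { |t| t >= target } || @terms.size
    term
  end

  def next?
    @pos += 1
    @pos < @terms.size
  end
end

enum = SortedTermEnum.new(%w[moon space spa spam star tree])
puts prefix_terms(enum, "sp").inspect
```

With a real index you would pass the TermEnum instead of the stand-in and call the helper again on each keystroke (‘s’, ‘sp’, ‘spa’, ...).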

However, the ‘common’ way for autocomplete is indeed to base the
completion on past searches, i.e. index users’ successful queries and
suggest matching past queries while the user is typing.

If that’s really not what you want, you could also build up a second
index containing all the terms that occur in your data, each as a
document on its own (like your own dictionary), and get suggestions
from there, with fuzzy queries if you like. This can also be used for
‘did you mean’ stuff in case the user has a typo in their query and gets
no results because of that.
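
To illustrate the ‘did you mean’ idea in plain Ruby, here is a rough sketch that ranks dictionary terms by Levenshtein edit distance to the query. It does not use Ferret at all: the dictionary array stands in for the second index, and the names levenshtein, did_you_mean and the max_distance cutoff are assumptions made for the example.

```ruby
# Dynamic-programming Levenshtein edit distance between two strings.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Return dictionary terms within max_distance edits, closest first.
def did_you_mean(dictionary, query, max_distance = 2)
  dictionary
    .map { |term| [term, levenshtein(term, query)] }
    .select { |_, distance| distance <= max_distance }
    .sort_by { |term, distance| [distance, term] }
    .map(&:first)
end

puts did_you_mean(%w[moon landing mooning], "mooon").inspect
```

Scanning the whole dictionary on every query is only reasonable for small term lists; a real setup would more likely let the index do the fuzzy matching.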

Cheers,
Jens


Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

Thanks! I will try this. I’m actually interested in the index entries,
not past searches.

Is it possible to search for fields that start with a word or phrase?

Let’s say I have the following fields:

1 mooning
2 moon landing
3 landing on the moon

The results I would like to get are:

1 and 2 if I search for moon
only 2 if I search for moon landing or moon land
only 3 if I search for landing

Is it possible with Ferret or would a simple SQL query do this better?
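
To pin down the behaviour being asked for, here is a plain-Ruby sketch (no Ferret involved) that treats each field value as one string and keeps only those that start with the query. Case-insensitive matching, the FIELDS hash and the helper name starts_with_matches are assumptions for the example.

```ruby
FIELDS = {
  1 => "mooning",
  2 => "moon landing",
  3 => "landing on the moon"
}

# Return the ids of fields whose whole value starts with the query.
def starts_with_matches(fields, query)
  q = query.downcase
  fields.select { |_, value| value.downcase.start_with?(q) }.keys
end

["moon", "moon landing", "moon land", "landing"].each do |query|
  puts "#{query}: #{starts_with_matches(FIELDS, query).inspect}"
end
```

The point is that the match is anchored to the start of the whole field value, not to the start of any word in it.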

Cheers
Mattias

Doesn’t anyone have some thoughts on this?

On Jun 16, 2008, at 5:12 AM, Jens K. wrote:

This might be accomplished by using a TermEnum

That’s how I would do it. However, you have to be careful with
Analyzers: if the text is stemmed, the suggestions will be stemmed.
The solution would be to have an unstemmed field dedicated to this
purpose.

Marvin H.
Rectangular Research
http://www.rectangular.com/

Yes, I looked at that, but the fields I have are often short. The
constraint I’m looking for is that the field has to start with the query.

Like SELECT * FROM table WHERE title LIKE 'term%'

mattias

On Fri, Jun 27, 2008 at 02:23:39PM +0200, Mattias B. wrote:

Doesn’t anyone have some thoughts on this?

Did you have a look at
http://ferret.davebalmain.com/api/classes/Ferret/Search/Spans/SpanNearQuery.html ?
Not sure, but it might solve some if not all of your issues.

cheers,
Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

On 27 Jun 2008, at 15:52, Mattias B. wrote:

Yes, I looked at that, but the fields I have are often short. The
constraint I’m looking for is that the field has to start with the query.

Like SELECT * FROM table WHERE title LIKE 'term%'

You want to use the WildcardQuery alternative. That way you can use term*.

Cheers,

Henke

But that doesn’t restrict the search to the start of the field. dog*
will match both “Wild dog” and “Dog bone”

mattias

Ahh true. Interesting situation. Need to research that a bit :)

//Henke

On 28 Jun 2008, at 10:20, Mattias B. wrote:

It’s the same. Will match any word in the field that starts with the
query. Same as putting a * after the query.

I was looking at a solution where I have a special index that only
contains the first word of the original field and then do the query like

firstword:dog* AND thewholefield:dog*

Should only match “dog bone” and not “wild dog”.

Just feels a bit strange to have two fields here.
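
A small plain-Ruby sketch of the difference between the two clauses, without Ferret. It emulates a tokenized field as a list of lowercased words and a separate first-word field; the Doc struct and both helper names are hypothetical.

```ruby
Doc = Struct.new(:title) do
  # Lowercased words, roughly what a simple analyzer would index.
  def tokens
    title.downcase.split
  end

  def first_word
    tokens.first
  end
end

# Emulates thewholefield:dog* on a tokenized field: any word may match.
def any_word_prefix(docs, query)
  q = query.downcase
  docs.select { |d| d.tokens.any? { |t| t.start_with?(q) } }.map(&:title)
end

# Emulates firstword:dog* AND thewholefield:dog*: first word must match too.
def first_word_prefix(docs, query)
  q = query.downcase
  docs.select do |d|
    d.first_word.to_s.start_with?(q) && d.tokens.any? { |t| t.start_with?(q) }
  end.map(&:title)
end

docs = [Doc.new("Wild dog"), Doc.new("Dog bone")]
puts any_word_prefix(docs, "dog").inspect
puts first_word_prefix(docs, "dog").inspect
```

In this emulation a multi-word query such as ‘moon land’ finds nothing through the first-word field, because a single word can never start with a string containing a space, so the approach only covers one-word prefixes.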

mattias

Found it!

Ferret::Search::PrefixQuery

On 29 Jun 2008, at 14:44, Henrik wrote:

I think so. I have tried almost everything and the common misbehavior
I get is that I keep getting hits where the query isn’t at the start
of the field but somewhere in the middle or at the end.

mattias

Ahh, so what you need is a whitespace stemmer?
Or am I still missing something :)

//Henke

On 30 Jun 2008, at 10:52, Mattias B. wrote:

Hi!

On Mon, Jun 30, 2008 at 02:32:27PM +0200, Mattias B. wrote:

I think so. I have tried almost everything and the common misbehavior I
get is that I keep getting hits where the query isn’t at the start of the
field but somewhere in the middle or at the end.

I don’t think any of Ferret’s default queries will solve your problem,
but given that something like SpanFirstQuery exists it should also be
possible to implement a SpanFirstTermQuery…

cheers,
Jens
