Plurals and synonym lists

I want to correct spelling errors automatically. In the past I have used
search engines where I could pass an argument to a standard search to
correct a word with up to 2 spelling errors, for example, or offer the
more Google-like “Did ya mean?”. In this case I just want to correct the
word automatically and then search. I am not too interested in
specifying how many characters it is out by.

What is the easiest way of doing something similar in Ferret? Would I
use fuzzy search to correct misspellings? I am guessing so, but would
this also handle plurals and perhaps stemming? For example, tax would
also search for taxes and taxing, or ball would search for balls…

Also, has anyone done any work with synonym lists? When I do a search on
chair I want to be able to also search on stool, seat, bench etc. I saw
something about this in a Lucene book, using WordNet, but have no idea
how to implement it in Ferret.

As you can probably tell I am pretty new to both Lucene and Ferret.

Thanks again in advance.

On 7/11/06, BlueJay wrote:

for taxes and taxing or ball would search for balls…

You can use Fuzzy query to do SpellChecking but it’s not ideal. I’m
thinking of extracting the vim spell check and making it available in
Ruby. It is awesome. Lightning fast and there are heaps of
dictionaries available in multiple languages.

This you can do with a Fuzzy Query. But still it’s a bit difficult.
What I’m thinking of doing is adding a did_you_mean class method to
FuzzyQuery.

suggestions = FuzzyQuery.did_you_mean(term, index_reader)

What this would do is return an ordered list of term/frequency pairs of
all terms that are similar to, but more common in the index than, the
original term.
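
To make the idea concrete, here is a rough sketch in plain Ruby of what such a method might do. It is not Ferret code: the method takes a plain Hash of term => document frequency as a stand-in for whatever the index reader would report, and the helper name edit_distance, the max_distance parameter and its default of 2 are all assumptions made for illustration.

```ruby
# Hypothetical sketch only: none of these names are part of Ferret.

# Classic dynamic-programming Levenshtein edit distance between two strings.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution (or match)
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# term_freqs is a Hash of term => document frequency (an assumed stand-in
# for the statistics an index reader could provide).
# Returns [term, frequency] pairs for terms within max_distance edits of
# `term` that are more common than it, most frequent first.
def did_you_mean(term, term_freqs, max_distance = 2)
  own_freq = term_freqs.fetch(term, 0)
  term_freqs
    .select { |t, f| t != term && f > own_freq && edit_distance(term, t) <= max_distance }
    .sort_by { |t, f| [-f, t] }
end

freqs = { "ferret" => 40, "ferrit" => 1, "ferrets" => 12, "parrot" => 30 }
suggestions = did_you_mean("ferrit", freqs)
```

To correct automatically rather than ask “Did ya mean?”, one option would be to search on the first suggestion whenever the list is not empty.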

Also - anyone done any work with synonym lists. I want to be able when I
do a search on chair to also do a search on stool, seat, bench etc. I
saw something around this in a Lucene book around WordNet but have no
idea how to implement in Ferret.

You can do the same thing in Ferret. You basically need to write your
own analyzer. I’ll have a book out on how to do all of this
eventually.
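
As a rough illustration of the analyzer idea, the sketch below shows the core step in plain Ruby: each token is followed by its synonyms, which are given a position increment of 0 so that they sit at the same position as the original word. It deliberately does not use Ferret's own analysis classes; the method names tokenize and add_synonyms, the hash-based token shape and the hand-written synonym table are all assumptions for illustration, and a real list might come from WordNet.

```ruby
# Hypothetical sketch only: plain Ruby, not Ferret's analyzer classes.

# Splits text into lowercase word tokens. Each token advances the
# position by one (pos_inc: 1).
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/).map { |word| { text: word, pos_inc: 1 } }
end

# Emits every token followed by its synonyms. Synonyms get pos_inc: 0,
# meaning "same position as the previous token".
def add_synonyms(tokens, synonyms)
  tokens.flat_map do |token|
    extras = synonyms.fetch(token[:text], []).map { |syn| { text: syn, pos_inc: 0 } }
    [token] + extras
  end
end

# Assumed, hand-written synonym table.
synonyms = { "chair" => %w[stool seat bench] }

tokens = add_synonyms(tokenize("Wooden chair"), synonyms)
tokens.each { |t| puts "#{t[:text]} (+#{t[:pos_inc]})" }
```

Applying the expansion at index time means a search on any of the synonyms can match the document; applying it only to the query keeps the index smaller but makes each query broader.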

As you can probably tell I am pretty new to both Lucene and Ferret.

A warm welcome to you then.

David B. wrote:

You can do the same thing in Ferret. You basically need to write your
own analyzer. I’ll have a book out on how to do all of this
eventually.

As you can probably tell I am pretty new to both Lucene and Ferret.

A warm welcome to you then.

Great - a book would be of great help and I am more than willing to
proofread it for you… especially the bits around category search,
spell checking and synonym lists. Perhaps you could even use my site as
an example site in the book, because I am sure there are hundreds or
thousands of people wanting to do the same things as me.

On 7/11/06, BlueJay wrote:

Great - a book would be of great help and I am more than willing to
proof read it for you… especially the bits around category search,
spell checking and synonym lists. Perhaps you could even use my site as
an example site in the book! because I am sure there are hundreds or
thousands of people wanting to do the same things as me.

I’m looking forward to seeing it in action. As for examples in my
book, we’ll have to wait and see. But you can definitely mention it
here or on the Ferret Wiki, though hold back on the wiki for the moment
because it’s being spammed like crazy.