Fuzzy search

This may be off-topic for Rails, but what are people doing to find records
based on fuzzy string matches? For example, suppose you wanted to find a
Person with the name “David Heinemeier H.” but searched using the
string “Dave Hansson”.

Currently I am using find_by_sql to call the PostgreSQL function
“levenshtein(string1, string2)”, which returns results with a score
indicating how close the matches are. It is OK, but nowhere near as good
as I had hoped. Any better suggestions?
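
For anyone unfamiliar with the score, here is a minimal pure-Ruby sketch of
the edit distance that a Levenshtein function computes. The method name
levenshtein_distance is made up for this illustration and is not the
PostgreSQL implementation; it only shows what the number means (0 is an
exact match, larger is further apart).

```ruby
# Illustrative only: classic dynamic-programming Levenshtein distance.
# `levenshtein_distance` is a hypothetical helper, not part of Rails or PostgreSQL.
def levenshtein_distance(a, b)
  a_chars = a.chars
  b_chars = b.chars

  # previous[j] = distance between the processed prefix of a and the first j chars of b
  previous = (0..b_chars.length).to_a

  a_chars.each_with_index do |a_char, i|
    current = [i + 1] # distance from the first i+1 chars of a to the empty string
    b_chars.each_with_index do |b_char, j|
      cost = a_char == b_char ? 0 : 1
      current << [
        previous[j + 1] + 1, # delete a_char
        current[j] + 1,      # insert b_char
        previous[j] + cost   # substitute (or keep, if the chars are equal)
      ].min
    end
    previous = current
  end

  previous.last
end

# Rank some candidate names against a search string (lowest distance first).
candidates = ["David Heinemeier H.", "Dave Thomas"]
ranked = candidates.sort_by { |name| levenshtein_distance("dave hansson", name.downcase) }
```

Because the distance counts every inserted or deleted character across the
whole string, a long middle name or an abbreviated surname inflates the
score, which is probably part of why the results feel weaker than hoped.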

thanks,
Jeff

On 20.06.2006 at 00:02, Jeff C. wrote:

Currently I am using find_by_sql to call the PostgreSQL function
“levenshtein(string1, string2)”, which returns results with a score
indicating how close the matches are. It is OK, but nowhere near as good
as I had hoped. Any better suggestions?

Try the full-text index of your database of choice. Another
thing you might want to try is to split the search string into words:

“Search Word”

=> "text ILIKE '%Search%' OR text ILIKE '%Word%'"
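
A rough sketch of how that clause could be built from the search string.
The helper name ilike_conditions and the default column name "text" are
assumptions made for this example; it returns the conditions in the
array form (SQL with ? placeholders, followed by the values) so that the
words are bound as parameters instead of being pasted into the SQL.

```ruby
# Illustrative only: `ilike_conditions` is a hypothetical helper.
# `column` is interpolated into the SQL, so it must be a trusted column name,
# never user input.
def ilike_conditions(query, column = "text")
  words = query.split
  return nil if words.empty?

  # One "column ILIKE ?" per word, joined with OR.
  clause = words.map { "#{column} ILIKE ?" }.join(" OR ")

  # Escape LIKE wildcards in the user's words (backslash is PostgreSQL's
  # default LIKE escape character), then wrap each word in % ... %.
  patterns = words.map { |w| "%" + w.gsub(/[\\%_]/) { |m| "\\" + m } + "%" }

  [clause, *patterns]
end

conditions = ilike_conditions("Search Word")
# conditions is ["text ILIKE ? OR text ILIKE ?", "%Search%", "%Word%"]
```

The resulting array can be passed as :conditions to a finder. Note that a
leading % prevents an ordinary index from being used, which is why this
approach is simple but slow on large tables.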

Beyond those two options (sophisticated and prewritten, or simple and
slow) it becomes a bit … harder. “Good text searching” (“Information
Retrieval” is the academic term, which is a bit more general) is one
of the not-yet-completely-solved problems in computer science today.

Regards,

Manuel


I have found an elegant and short solution for Fermat’s Theorem
but sadly there is not enough space for it in this signature.

Hi Jeff,

you might want to look at Ferret as well: http://ferret.davebalmain.com
There is a link to the API there, and in the QueryParser documentation
http://ferret.davebalmain.com/api/classes/Ferret/QueryParser.html you’ll
find an explanation of how to build a FuzzyQuery. When you need superfast
search on loads of records, you’ll find that sooner or later you’ll need
a ‘search specialist’ in your application anyway…

Regards
Jan