Re: Detecting similar strings

This should point you in the right direction, I think:

http://raa.ruby-lang.org/project/levenshtein/
http://raa.ruby-lang.org/project/soundex/
http://raa.ruby-lang.org/project/metaphone/

The Levenshtein algorithm gives you the “edit distance” between two
strings, i.e. the minimum number of insertions, replacements and
deletions needed to make the strings identical. It gives you a pretty
good indication of how similar the strings are.
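
To make the idea concrete, here is a minimal sketch of the edit-distance calculation in plain Ruby. Note that edit_distance is a name I made up for illustration; it is not the API of the levenshtein library linked above, which will be faster and is what you should probably use in practice.

```ruby
# Hypothetical helper for illustration only (not the linked library's API).
# Classic dynamic-programming edit distance, keeping one row at a time.
def edit_distance(a, b)
  a = a.chars
  b = b.chars
  # prev[j] = distance between the part of a processed so far and the
  # first j characters of b. Initially: "" versus b[0, j] costs j insertions.
  prev = (0..b.length).to_a
  a.each_with_index do |ca, i|
    curr = [i + 1] # a[0..i] versus "" costs i + 1 deletions
    b.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j + 1] + 1, # delete ca
        curr[j] + 1,     # insert cb
        prev[j] + cost   # replace ca with cb (free if they are equal)
      ].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance("kitten", "sitting")
```

A small distance relative to the string lengths suggests the strings are similar; where you draw the line depends on your data.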

Soundex transforms strings that sound similar into the same 4-character
code (a letter followed by three digits, which looks something like
“E246”).
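
As a rough sketch of how such a code is built, the following follows the usual American Soundex rules as I understand them. The names soundex_digit and simple_soundex are made up for this example and are not the API of the soundex library linked above, which may differ in edge cases.

```ruby
# Hypothetical helpers for illustration only (not the linked library's API).

# Maps a consonant to its Soundex digit; returns nil for vowels, H, W, Y.
def soundex_digit(ch)
  case ch
  when "B", "F", "P", "V" then "1"
  when "C", "G", "J", "K", "Q", "S", "X", "Z" then "2"
  when "D", "T" then "3"
  when "L" then "4"
  when "M", "N" then "5"
  when "R" then "6"
  end
end

def simple_soundex(word)
  letters = word.upcase.gsub(/[^A-Z]/, "")
  return "" if letters.empty?

  code = letters[0].dup             # the first letter is kept as-is
  last = soundex_digit(letters[0])  # its digit still suppresses a repeat
  letters[1..-1].each_char do |ch|
    digit = soundex_digit(ch)
    if digit
      code << digit unless digit == last # collapse runs of the same digit
      last = digit
    elsif ch != "H" && ch != "W"
      last = nil # a vowel separates repeats; H and W do not
    end
    break if code.length == 4
  end
  code.ljust(4, "0") # pad short codes with zeros
end

puts simple_soundex("Robert")
puts simple_soundex("Rupert")
```

Two names match when their codes are equal, so comparing strings becomes a simple equality check on the codes.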

Metaphone is generally preferred over Soundex, I believe. It also
transforms similar-sounding strings into the same character sequence,
but doesn’t limit itself to just 4 characters, so it works a bit better
with longer strings.

There are probably other algorithms around as well, but I’ve had pretty
good luck with these three.

Regards,
Helge E.
