Forum: Ferret Finding related items (like latent semantic indexing)

Chris Roos (chrisjroos)
on 2006-02-09 12:13
I've been trying to use Classifier::LSI to provide a means of finding
'related items', where each item is a one line description of a product.

Although on small samples the Classifier works great, it completely
baulks on my current dataset of 3000 items.

I've started to look at Ferret this morning, following a post on the
ruby mailing list.  I'd guess that the Fuzzy Query would be the thing
that I need, although it doesn't appear to be as comprehensive as the
LSI stuff in classifier (I realise they are doing different things).

I'm really just after any thoughts anyone might have.

Thanks in advance,

Chris
David Balmain (Guest)
on 2006-02-09 13:42
(Received via mailing list)
Hi Chris,

I plan on adding a "More Like This" function to Ferret, but I'm really
swamped (doing other stuff on Ferret) at the moment. If you want to
have a go at implementing it yourself, you could have a look at the way
it's done in Lucene. It's not too much work, but it could take you a
while to get your head around the Ferret internals, and the current
Ferret codebase is soon to be obsolete. Sorry I can't be of more help.

Cheers,
Dave
David Balmain (Guest)
on 2006-02-09 16:10
(Received via mailing list)
Hi Chris,

I just noticed that you are indexing one-line product descriptions.
What I'd suggest doing (I believe this is how the Lucene MoreLikeThis
query works) is just taking the description of your start product and
using that as the query. So if the description is:

    "apple ipod nano 4Gb black"

then your query will be:

    "description:(apple ipod nano 4Gb black)"
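
As a rough sketch of that idea in plain Ruby, something like the following could build the query string from a description. The helper name related_items_query is made up for illustration and is not part of Ferret. Stripping punctuation is an assumption too, made so that stray characters in a description are not read as query syntax. How you hand the resulting string to your index depends on the Ferret version you have, so that part is left out.

```ruby
# Hypothetical helper (not part of Ferret): turn a one-line product
# description into a field-scoped query string.
def related_items_query(description, field = "description")
  # Assumption: keep only runs of letters and digits, so punctuation in the
  # description cannot be mistaken for query syntax.
  terms = description.scan(/[[:alnum:]]+/)

  # Nothing usable to search on.
  return nil if terms.empty?

  "#{field}:(#{terms.join(' ')})"
end

puts related_items_query("apple ipod nano 4Gb black")
```

You would then run that string as an ordinary search and skip the start product itself when it comes back in the results.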

Hope that helps,
Dave