Ferret questions

Hi,

I am just getting started with Ferret, but I have a couple of
questions. Any help is appreciated.

  1. Will the current Ferret implementation only work on one server due
    to the index file? If it will work on multiple servers, could you
    point me towards some documentation? If not, are there plans to add
    support in the future?

  2. This may not be Ferret-specific, but if I were implementing tags
    similar to del.icio.us, can Ferret help in determining associations?
    In other words, if I added a bunch of space-separated tags for each
    index entry, how could I find similar tags given one tag?

  3. This is also a tag-related question… if I add a created_date
    field in the index for the tags, and I wanted to display the number of
    tags, plus the number of recently added tags, should I just hit the
    index twice? As in, once with no conditions, and once with a > some
    date condition?

Thanks,
Tom

On 12/17/05, Tom D. wrote:

  1. Will the current Ferret implementation only work on one server due
    to the index file? If it will work on multiple servers, could you
    point me towards some documentation? If not, are there plans to add
    support in the future?

Currently there is no support for this, but I don’t think it would be
too hard to add. You’d have to look at how Lucene handles this. Check out
the ParallelMultiSearcher and RemoteSearchable classes. Or you could
wait until I get around to it. I can’t make any promises.

  2. This may not be Ferret-specific, but if I were implementing tags
    similar to del.icio.us, can Ferret help in determining associations?
    In other words, if I added a bunch of space-separated tags for each
    index entry, how could I find similar tags given one tag?

Again, Ferret doesn’t have support for this directly, but it wouldn’t
be too hard to add. Look at Lucene’s MoreLikeThis class in the contrib
section:

http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/similarity/src/java/org/apache/lucene/search/similar/

Or again, you could wait until I get around to it.
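
In the meantime, one simple stand-in is to count tag co-occurrence
outside the index. The following is a minimal sketch in plain Ruby. It
is not Ferret or Lucene code, and it is not how MoreLikeThis works
internally. ENTRIES and related_tags are made-up names for
illustration, and the sample data is hypothetical:

```ruby
# Hypothetical sample data: each entry holds space-separated tags,
# as described in the question.
ENTRIES = [
  "ruby search ferret",
  "ruby rails web",
  "search lucene java",
  "ferret lucene search",
].freeze

# Count how often every other tag appears alongside +tag+ and
# return the most frequent co-occurring tags first.
def related_tags(entries, tag, limit = 5)
  counts = Hash.new(0)
  entries.each do |entry|
    tags = entry.split.uniq
    next unless tags.include?(tag)
    tags.each { |other| counts[other] += 1 unless other == tag }
  end
  # Sort by descending count, then alphabetically for stable output.
  counts.sort_by { |other, n| [-n, other] }.first(limit)
end

p related_tags(ENTRIES, "search")
# => [["ferret", 2], ["lucene", 2], ["java", 1], ["ruby", 1]]
```

Raw counts favour very common tags, so for a real site you would
probably want to normalise by how often each tag occurs overall.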

  3. This is also a tag-related question… if I add a created_date
    field in the index for the tags, and I wanted to display the number of
    tags, plus the number of recently added tags, should I just hit the
    index twice? As in, once with no conditions, and once with a > some
    date condition?

Yes, that sounds like the right way to go about it. If every document
in the index represents one tag, you could just use the
Index::Index#size method for the total, but otherwise you’ll want to
do two queries.
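
To make the two counts concrete, here is a rough sketch using a plain
Ruby array as a stand-in for the index. It does not use the Ferret
API. TAG_DOCS, total_tags and recent_tags are hypothetical names. The
one assumption worth noting is that created_date is stored as a
zero-padded YYYYMMDD string, so that plain string comparison matches
date order, which is what a Lucene-style term range relies on:

```ruby
# Hypothetical stand-in for index documents: one hash per tag, with
# created_date stored as a zero-padded YYYYMMDD string so that string
# comparison matches chronological order.
TAG_DOCS = [
  { tag: "ruby",   created_date: "20051101" },
  { tag: "search", created_date: "20051210" },
  { tag: "ferret", created_date: "20051216" },
].freeze

# "Query" 1: no conditions, every tag counts.
def total_tags(docs)
  docs.size
end

# "Query" 2: only tags created strictly after the cutoff date.
def recent_tags(docs, cutoff)
  docs.count { |doc| doc[:created_date] > cutoff }
end

puts "#{total_tags(TAG_DOCS)} tags, " \
     "#{recent_tags(TAG_DOCS, '20051201')} added since 20051201"
```

With a real index, each of the two methods would become one search,
and you would read the hit count from each result.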

HTH,
Dave

Thanks Dave. I currently don’t need a multi-server index, but in the
future I hope to if my site is successful :)

I will take a look at Lucene’s MoreLikeThis once I have a bit of time,
and if I come up with something useful before you do, I will pass it
your way.

Thanks,
Tom