Scores in ferret

Hi,

Are scores an absolute calculation, or are they relative to what is in a
given index? I ask because I want to look into distributing my index over
a few servers. The idea is that I could get 10 results from each of a
couple of servers, do an in-memory merge, and return the results faster
than would be possible with just the one index server.
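
To make the merge step concrete, here is a rough sketch of what I have in mind. It is plain Ruby and does not call Ferret at all: the Hit struct, the server names, and the scores are made up for illustration, and it only gives a sensible ranking if scores from different servers are comparable.

```ruby
# Hypothetical result type: each server returns its own top hits.
# Nothing here is part of the Ferret API.
Hit = Struct.new(:server, :doc_id, :score)

# Merge the per-server hit lists and keep the best `limit` overall.
# Assumes that scores from different servers are comparable.
def merge_top_hits(per_server_hits, limit = 10)
  per_server_hits.flatten(1).sort_by { |hit| -hit.score }.first(limit)
end

# Made-up results from two servers.
server_a = [Hit.new("a", 1, 0.9), Hit.new("a", 7, 0.4)]
server_b = [Hit.new("b", 3, 0.7), Hit.new("b", 5, 0.2)]

merged = merge_top_hits([server_a, server_b], 3)
merged.each { |hit| puts "#{hit.server}:#{hit.doc_id} #{hit.score}" }
```

With 10 hits per server the sort is trivial, so the cost would be dominated by the slowest server's response.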

Would this work? Has anyone tried this type of ghetto map-reduce-like
deployment with Ferret?


Regards,

Ian C.

Hi,

The scores are relative to the contents of the index, so this won’t be
that easy.
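
A small illustration of why, assuming the classic Lucene-style inverse document frequency, 1 + ln(numDocs / (docFreq + 1)). I am using that formula only as an example and have not checked that Ferret computes exactly this; the shard sizes and term counts below are made up.

```ruby
# Classic Lucene-style inverse document frequency.
# Assumed here for illustration; Ferret's exact formula may differ.
def idf(doc_freq, num_docs)
  1.0 + Math.log(num_docs.to_f / (doc_freq + 1))
end

# The same term on two made-up shards of 1000 documents each:
# rare on shard A (5 documents), common on shard B (500 documents).
shard_a = idf(5, 1000)
shard_b = idf(500, 1000)

puts format("shard A idf: %.3f", shard_a)
puts format("shard B idf: %.3f", shard_b)
```

The term gets a much higher weight on the shard where it is rare, so a hit scored on one server is not directly comparable with a hit scored on another.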

However, it is possible to have a distributed index in terms of multiple
physical indexes on the same machine (this is done by having one
IndexReader instance using several underlying IndexReader instances),
with consistent scores.
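
Roughly, the scores can stay consistent in that setup because term statistics can be pooled over all the underlying indexes before scoring. A sketch of the idea in plain Ruby, assuming a Lucene-style idf and made-up numbers; IndexStats and pooled_idf are hypothetical names, not Ferret classes or methods, and I have not checked how Ferret does this internally.

```ruby
# Hypothetical per-index statistics for one term (not a Ferret class).
IndexStats = Struct.new(:num_docs, :doc_freq)

# Lucene-style idf, assumed for illustration only.
def idf(doc_freq, num_docs)
  1.0 + Math.log(num_docs.to_f / (doc_freq + 1))
end

# Pool the counts over every sub-index, then compute a single idf
# that is used for all hits regardless of which index they came from.
def pooled_idf(stats)
  idf(stats.sum(&:doc_freq), stats.sum(&:num_docs))
end

stats = [IndexStats.new(1000, 5), IndexStats.new(1000, 500)]
global = pooled_idf(stats)

puts format("pooled idf: %.3f", global)
```

Doing the same across machines would mean exchanging these counts between servers before scoring.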

What’s missing is the possibility to access remote indexes this way
(Lucene has this feature, as far as I remember).

Cheers,
Jens

On Fri, Jun 27, 2008 at 05:59:45AM -0400, Ian C. wrote:

Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

I am hoping to index many GB of data. It is cheaper hardware-wise to
have a few machines with 8GB of RAM each instead of one large machine.

Has anyone had success with large data sets? In my case the full
MEDLINE data (pubmed.gov).

My initial performance test is to index 100k articles, and it seems
about 10x faster when RAM is used compared with disks. I am still trying
to figure out the bottlenecks in terms of CPU, IO, etc. Once the index
is built, I am impressed with the read speeds.

On Fri, Jun 27, 2008 at 8:16 AM, Jens K. [email protected] wrote:


Regards,

Ian C.
82 Fellsway W #2
Somerville, MA 02145
Direct Line: +1 (978) 6333372
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Mobile Phone: +1 (312) 218 3209
Fax: +1(770) 818 5697
Suisse Phone: +41 (0) 22 548 1664
Skype: ian.connor