Forum: Ferret Scores in ferret

Ian Connor (Guest)
on 2008-06-27 12:00
(Received via mailing list)
Hi,

Are scores an absolute calculation, or are they relative to what is in a
given index? I ask because I want to look into distributing my index over
a few servers. The idea is that I could get 10 results from each of a
couple of servers, do an in-memory merge, and return the results faster
than would be possible with a single index server.

Would this work? Has anyone tried this kind of makeshift
map-reduce-style deployment with ferret?
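
The in-memory merge step could look roughly like the sketch below. It is plain Ruby and does not touch Ferret at all: the Hit struct, the shard names, and the sample scores are all made up for illustration, and it only gives a sensible ranking if scores from different servers are directly comparable.

```ruby
# Hypothetical hit record for illustration; not part of Ferret's API.
Hit = Struct.new(:shard, :doc_id, :score)

# Merge per-shard result lists into one global top-k list, highest score first.
# Assumption: scores coming from different shards are directly comparable.
def merge_top_k(shard_results, k)
  shard_results.flatten.sort_by { |hit| -hit.score }.first(k)
end

# Made-up results from two shards.
shard_a = [Hit.new(:a, 1, 0.92), Hit.new(:a, 7, 0.40)]
shard_b = [Hit.new(:b, 3, 0.75), Hit.new(:b, 9, 0.10)]

merged = merge_top_k([shard_a, shard_b], 3)
merged.each { |hit| puts "#{hit.shard}:#{hit.doc_id} #{hit.score}" }
```

Each server only needs to return its own top k hits, since no hit outside a shard's top k can appear in the global top k.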

--
Regards,

Ian Connor
Jens Kraemer (Guest)
on 2008-06-27 14:49
(Received via mailing list)
Hi,

the scores are relative to the contents of the index, so this won't be
*that* easy.

However, it is possible to have a distributed index in terms of multiple
physical indexes on the same machine, with consistent scores (this is
done by having one IndexReader instance using several underlying
IndexReader instances).

What's missing is the possibility to access remote indexes this way
(Lucene has this feature, as far as I remember).
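
To illustrate why index-relative scores are hard to merge, here is a small plain-Ruby sketch. It assumes a simplified Lucene-style inverse document frequency formula, which is not necessarily the exact formula Ferret uses, and the shard statistics are invented. The point is only that the same term gets a different weight in each index unless the statistics are computed over all indexes together.

```ruby
# Simplified Lucene-style inverse document frequency.
# Assumption: this is an illustration, not necessarily Ferret's exact formula.
def idf(doc_freq, num_docs)
  1.0 + Math.log(num_docs.to_f / (doc_freq + 1))
end

# Hypothetical per-shard statistics for a single query term.
shards = {
  a: { num_docs: 1000, doc_freq: 500 }, # term is common in this shard
  b: { num_docs: 1000, doc_freq: 5 }    # term is rare in this shard
}

# Each shard on its own weights the term differently.
local_idf = shards.transform_values { |s| idf(s[:doc_freq], s[:num_docs]) }

# Combining the statistics first gives one weight that applies to every shard.
total_docs = shards.values.sum { |s| s[:num_docs] }
total_freq = shards.values.sum { |s| s[:doc_freq] }
global_idf = idf(total_freq, total_docs)

local_idf.each { |name, value| puts "shard #{name}: idf #{value.round(3)}" }
puts "combined: idf #{global_idf.round(3)}"
```

With separate servers, a match in the shard where the term is rare would be boosted relative to an equally good match in the other shard, so a naive merge by score would be skewed.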


Cheers,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer@webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold
Ian Connor (Guest)
on 2008-06-27 16:10
(Received via mailing list)
I am hoping to index many GB of data. It is cheaper hardware-wise to
have a few machines with 8GB of RAM each instead of one large machine.

Has anyone had success with large data sets? In my case it is the full
MEDLINE data (pubmed.gov).

My initial performance test is to index 100k articles, and it seems
10x faster when RAM is used compared with disks. I am still trying to
figure out the bottlenecks in terms of CPU, I/O, etc. Once the index is
built, I am impressed with the read speeds.

--
Regards,

Ian Connor
82 Fellsway W #2
Somerville, MA 02145
Direct Line: +1 (978) 6333372
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Mobile Phone: +1 (312) 218 3209
Fax: +1(770) 818 5697
Suisse Phone: +41 (0) 22 548 1664
Skype: ian.connor