Multiple-index searching with merged results

Hey…

I am browsing through the Lucene features and I’m wondering whether this
one is available in Ferret as well:

multiple-index searching with merged results

This would be nice because I’m thinking about using several indexes. I
use a lot of wildcard queries for live searches like Google Suggest, and
I think performance would improve if I split my rather big index into
smaller ones.

The main search should still be able to search all indexes, though.

This is just an option I was thinking about, not a must-have. I’m just
wondering whether it is already possible.

Ben

On Mon, Aug 21, 2006 at 10:59:48AM +0200, Benjamin K. wrote:

> Hey…
>
> i am just browsing through the lucene features and i’m wondering if this
> feature is available in ferret as well …
>
> multiple-index searching with merged results

Yes, it is. Check out the MultiSearcher class. It behaves like
IndexSearcher, but searches multiple indexes:

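require 'ferret'
# Assumes the Ferret 0.9-era namespaces for the classes used below.
include Ferret::Search  # MultiSearcher, IndexSearcher, TermQuery
include Ferret::Index   # Term
# Wrap one IndexSearcher per index directory in a MultiSearcher.
s = MultiSearcher.new([
  IndexSearcher.new(index_dir_one),
  IndexSearcher.new(another_index_dir)
])
hits = s.search(TermQuery.new(Term.new("title", "title")))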

Jens


webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
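
For readers wondering what "merged results" means in practice, here is a rough, self-contained sketch of the idea: run the same query against every index, pool the hits, and sort them by score. It is plain Ruby and does not use Ferret at all. Hit, TinyIndex and merged_search are hypothetical names made up for this illustration, and the term-count "score" is a stand-in, not Ferret's scoring.

```ruby
# Conceptual sketch only: this is NOT Ferret's MultiSearcher implementation.
# Hit, TinyIndex and merged_search are hypothetical names for illustration.
Hit = Struct.new(:index_name, :doc_id, :score)

# A toy "index": documents are strings, the score is how often the term occurs.
class TinyIndex
  attr_reader :name

  def initialize(name, docs)
    @name = name
    @docs = docs
  end

  # Returns one Hit per document that contains the term.
  def search(term)
    hits = []
    @docs.each_with_index do |text, doc_id|
      count = text.downcase.split.count(term.downcase)
      hits << Hit.new(@name, doc_id, count.to_f) if count > 0
    end
    hits
  end
end

# Query every index, then merge all hits into one list ordered by score.
def merged_search(indexes, term, limit = 10)
  all_hits = indexes.flat_map { |index| index.search(term) }
  # Sort by descending score; break ties by index name and doc id so the
  # order is deterministic.
  all_hits.sort_by { |h| [-h.score, h.index_name, h.doc_id] }.first(limit)
end

books  = TinyIndex.new("books",  ["ferret in action", "lucene basics"])
movies = TinyIndex.new("movies", ["the ferret and the other ferret", "ferret"])

results = merged_search([books, movies], "ferret")
results.each { |h| puts "#{h.index_name}##{h.doc_id} score=#{h.score}" }
```

Note that a document id is only meaningful together with the index it came from, which is why each hit carries the index name. A real multi-index searcher also has to make scores from different indexes comparable, which this toy scoring sidesteps.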

On Tue, Aug 22, 2006 at 09:38:57AM +0200, Benjamin K. wrote:

> > Checkout the MultiSearcher class. it behaves like
>
> thanks! … i guess you mean the Ferret::Index::MultiReader? can’t find a
> multisearcher in the api …

I don’t know why it isn’t in the API docs, but the MultiSearcher class
does exist. Besides the constructor, the interface is the same as in
IndexSearcher.

MultiReader could probably be used to search across multiple indexes,
too, but that would be more low-level.

Jens



> Checkout the MultiSearcher class. it behaves like

Thanks! I guess you mean Ferret::Index::MultiReader? I can’t find a
MultiSearcher in the API.

Ben

On 8/22/06, Jens K. wrote:

> On Tue, Aug 22, 2006 at 09:38:57AM +0200, Benjamin K. wrote:
>
> > > Checkout the MultiSearcher class. it behaves like
> >
> > thanks! … i guess you mean the Ferret::Index::MultiReader? can’t find a
> > multisearcher in the api …
>
> don’t know why it isn’t in the api, but the MultiSearcher class does exist.
> besides the constructor, the interface is the same as in IndexSearcher.

It wasn’t in the API because I’ve been too busy/lazy to update the
documentation for a while. The documentation for Ferret 0.10.0 is
up-to-date and I will try and keep it that way.

> Probably MultiReader could be used to search across multiple indexes, too,
> but this would be more low-level then.

MultiReader is now preferable to using MultiSearcher. There should be
a large speed difference, and sorting and scoring in the MultiSearcher
are a bit flaky. I’d like to remove MultiSearcher, as its
implementation is very difficult, but it’ll come in handy if I ever get
around to implementing a remote searcher so multiple machines can be
searched at once. Not a priority in the foreseeable future though.

Cheers,
Dave
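
To illustrate the difference in approach, here is a minimal plain-Ruby sketch of what a multi-reader does in Lucene-style designs: it presents several indexes as one logical index by offsetting document ids and summing term statistics, so a single searcher can sort and score over everything at once. That Ferret's MultiReader works this way internally is an assumption. ToyReader and ToyMultiReader are hypothetical names and none of this is Ferret's API.

```ruby
# Conceptual sketch only: NOT Ferret's MultiReader. ToyReader and
# ToyMultiReader are hypothetical names used for illustration.
class ToyReader
  def initialize(docs)
    @docs = docs
  end

  # Number of documents in this index.
  def max_doc
    @docs.size
  end

  def doc(id)
    @docs[id]
  end

  # Number of documents in this index that contain the term.
  def doc_freq(term)
    @docs.count { |text| text.split.include?(term) }
  end
end

class ToyMultiReader
  attr_reader :max_doc

  def initialize(readers)
    @readers = readers
    # @starts[i] is the global id of the first document of reader i.
    @starts = []
    total = 0
    readers.each do |r|
      @starts << total
      total += r.max_doc
    end
    @max_doc = total
  end

  # Map a global document id back to the sub-reader that owns it.
  def doc(global_id)
    unless (0...@max_doc).cover?(global_id)
      raise IndexError, "no such document: #{global_id}"
    end
    i = @starts.rindex { |start| start <= global_id }
    @readers[i].doc(global_id - @starts[i])
  end

  # Term statistics are summed, so scoring sees one combined index.
  def doc_freq(term)
    @readers.sum { |r| r.doc_freq(term) }
  end
end

reader = ToyMultiReader.new([
  ToyReader.new(["ferret search", "ruby port"]),
  ToyReader.new(["ferret index"])
])
puts reader.max_doc
puts reader.doc(2)
puts reader.doc_freq("ferret")
```

Because the combined reader looks like a single index, one ordinary searcher can run on top of it and no per-index result lists need to be merged afterwards.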

Hey…

> It wasn’t in the API because I’ve been too busy/lazy to update the
> documentation for a while. The documentation for Ferret 0.10.0 is
> up-to-date and I will try and keep it that way.

I’ve installed an svn script that automatically builds and deploys the
API docs as soon as someone checks in. It is based on rdoc; I’m not sure
whether it will work for your C extension as well, but I can send you
the 2 or 3 lines :)

Ben

On 8/23/06, Benjamin K. wrote:

> Hey…
>
> > It wasn’t in the API because I’ve been too busy/lazy to update the
> > documentation for a while. The documentation for Ferret 0.10.0 is
> > up-to-date and I will try and keep it that way.
>
> i’ve installed a svn-script to automatically build and deploy the API as
> soon as someone post a check-in … this is based on rdoc, not sure if
> this will work for your c-extension as well, but i can send you the 2 or 3
> lines :)

Hey Ben,

The C-extensions use rdoc too. The easiest way to build the
documentation from an svn checkout is calling “rake doc”. Also, I’m
not sure if you already knew this, but I still need to merge Ferret
0.10.0 into the ferret/trunk repository. It is currently in
svn://www.davebalmain.com/exp.

Cheers
Dave