Re: Help with Multiple Readers, 1 Writer scenario

Neville_B · September 7, 2006, 5:59am

Thanks for your email Dave,

I’ve thought about this overnight, and I’ve got a few questions please.

When you open an IndexReader on the index it is opened up on
that particular version (or state) of the index

Would you elaborate on how Ferret manages versions, please? For example,
can I have two readers open, one which accesses the old version of the
index, and the second which accesses the latest version?

So to keep searches up to date you need to close and reopen
your IndexReader every time you commit changes to the index.

I guess by reopen you mean IndexReader.new?

I proceeded to replace my Index usage with an IndexReader and Searcher
which are closed and recreated after each IndexWriter pass, and the
result seems to be that searches are still serialised, i.e. a
long-running query on thread t1 “blocks” the normally very fast query on
thread t2.

Might I be seeing another point of synchronisation, or am I just
observing a characteristic of Ruby threads?

Kind Regards,

Neville

Neville_B · September 7, 2006, 8:09am

On 9/7/06, Neville B. wrote:

Thanks for your email Dave,

I’ve thought about this overnight, and I’ve got a few questions please.

When you open an IndexReader on the index it is opened up on
that particular version (or state) of the index

Would you elaborate on how Ferret manages versions, please? For example,
can I have two readers open, one which accesses the old version of the
index, and the second which accesses the latest version?

When you open an IndexReader it opens all the files that it needs to
read the index and it keeps all of the file handles. Even after the
index is updated and those files are deleted they are not actually
freed by the operating system. If you then open an IndexReader on a
later version it holds file handles to all the files needed for that
version. So the answer is yes, you can have multiple IndexReaders open
on an index at the same time, all reading different versions. Each
version of the index has an internal version number and there is an
IndexReader#latest? method to determine if the version of the index
that you are reading is the current version.
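
The file-handle behaviour described above can be seen without Ferret at all. The following is a minimal sketch of the operating-system side only, assuming a POSIX filesystem (Windows normally refuses to delete a file that is still open). The method name and the file name are made up for illustration.

```ruby
require "tmpdir"

# Illustrates the operating-system behaviour only; no Ferret calls are used.
# Assumes a POSIX filesystem (Linux, macOS, BSD).
def read_after_delete
  Dir.mktmpdir do |dir|
    path = File.join(dir, "segment.dat")   # hypothetical file name
    File.write(path, "old version")

    handle = File.open(path, "r")          # the "reader" keeps this handle
    File.delete(path)                      # the "writer" removes the file

    still_listed = File.exist?(path)       # the directory entry is gone
    contents = handle.read                 # but the data is still readable
    handle.close                           # the space can be reclaimed only now
    [still_listed, contents]
  end
end

still_listed, contents = read_after_delete
puts "listed: #{still_listed}, contents: #{contents.inspect}"
```

This is also why a long-lived reader on an old version keeps disk space in use until it is closed.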

So to keep searches up to date you need to close and reopen
your IndexReader every time you commit changes to the index.

I guess by reopen you mean IndexReader.new?

That’s correct. Don’t forget to close the old IndexReader. The
garbage collector will do this for you eventually, but IndexReaders
hold a lot of resources so it’s best to close them as soon as you no
longer need them.
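
As a rough sketch of that close-and-reopen pattern: the RefreshingReader class below is hypothetical and not part of Ferret. It assumes only that the object returned by the block responds to latest? and close, as described above; in real use the block would presumably wrap IndexReader.new.

```ruby
# Hypothetical helper, not part of Ferret. It assumes only that the object
# returned by the block responds to #latest? and #close.
class RefreshingReader
  def initialize(&opener)
    @opener = opener          # e.g. a block that calls IndexReader.new
    @reader = @opener.call
  end

  # Returns a reader on the current index version, reopening if needed.
  def current
    unless @reader.latest?
      stale = @reader
      @reader = @opener.call  # open the new version first
      stale.close             # then release the old file handles promptly
    end
    @reader
  end

  def close
    @reader.close
  end
end
```

This sketch closes the stale reader immediately, so it is only safe when no other thread is still searching on it; with concurrent searches the close would have to wait until they finish. Any Searcher built on the old reader would also need to be recreated.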

I proceeded to replace my Index usage with an IndexReader and Searcher
which are closed and recreated after each IndexWriter pass, and the
result seems to be that searches are still serialised, i.e. a
long-running query on thread t1 “blocks” the normally very fast query on
thread t2.

Might I be seeing another point of synchronisation, or am I just
observing a characteristic of Ruby threads?

I think it’s probably a symptom of using Ruby threads. I don’t think
the interpreter can switch threads in the middle of a call to a C
function. It’s unusual, however, for a search to take long enough to be
a problem. What kind of search is it? If it’s a PrefixQuery, FuzzyQuery
or WildcardQuery you’ll get much better performance on an optimized
index. If you are making heavy use of any of these queries it is the
one time I’d recommend always keeping the index in an optimized state.

cheers,
Dave