Index.update is 10 times slower than index.add_doc. Normal?


#1

Hi,

I am seeing that

doc = index['mykey']
index.update 'mykey', doc

is about 10 times slower than
doc = Document.new
doc['id'] = 'mykey'
index << doc

It looks like #update is much slower than #<<. Is this expected?

Sergei.


#2

On 5/20/06, Sergei S. removed_email_address@domain.invalid wrote:

index << doc

It looks like #update is much slower than #<<. Is this expected?

Hi Sergei,

Yes, it is expected. When you use update, it has to look up the
document with the same id. It then checks each field in the document
to see which fields have changed and updates them. It deletes the
old document and adds the new one. This also means that it has to
open an IndexReader, close it, and then open an IndexWriter. That
is a lot of processing. If you want fast updates, you need to do them
yourself. Just adding a document doesn't even open an IndexReader, so
it is going to be faster than updating a document no matter how you do
it. The fastest way to update documents is in a batch. So if you want
to update 10 documents, delete all 10 together, then add the 10
updated documents together.
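
Here is a rough sketch of the batch idea, not a definitive
implementation. The helper name batch_replace and the RecordingIndex
class are made up for illustration and are not part of Ferret.
RecordingIndex is an in-memory stand-in that only logs the order of
calls. The helper assumes the index object responds to delete(id) and
<<, and that each document keeps its key in the 'id' field.

```ruby
# Hypothetical helper, not part of Ferret: replaces many documents in two
# passes so the index switches between deleting and adding only once.
# Assumes `index` responds to delete(id) and <<(doc), and that each doc
# stores its key under 'id'.
def batch_replace(index, docs)
  # Pass 1: remove every old version together.
  docs.each { |doc| index.delete(doc['id']) }
  # Pass 2: add every new version together.
  docs.each { |doc| index << doc }
  index
end

# In-memory stand-in for illustration only. It logs the order of calls
# so the two passes are visible; a real index would be passed instead.
class RecordingIndex
  attr_reader :log, :docs

  def initialize
    @log = []
    @docs = {}
  end

  def delete(id)
    @log << [:delete, id]
    @docs.delete(id)
  end

  def <<(doc)
    @log << [:add, doc['id']]
    @docs[doc['id']] = doc
    self
  end
end

index = RecordingIndex.new
index << { 'id' => 'a', 'title' => 'old a' }
index << { 'id' => 'b', 'title' => 'old b' }
index.log.clear # forget the initial adds, keep only the update calls

updated = [
  { 'id' => 'a', 'title' => 'new a' },
  { 'id' => 'b', 'title' => 'new b' }
]
batch_replace(index, updated)
p index.log
```

The log shows both deletes happening before either add, which is the
ordering that avoids switching between reading and writing for every
single document.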

Hope that helps,
Dave