Hello, I have a couple of questions and hope someone here can help answer
them.
I am using acts_as_ferret on a model Item with around 10 million rows.
I use Item.rebuild_index at the ruby console to build the index. It
seems to run for at least 48 hours when building.
My questions are:
How do you know when the indexing is over and complete?
How can you confirm that ALL records in the table were indexed?
(especially since the table runs into millions of records)
On Sun, Feb 25, 2007 at 06:20:55AM +0100, Jen wrote:
How do you know when the indexing is over and complete?
Indexing is done when rebuild_index returns. At the moment there is no
logging of the progress a running rebuild has made. However, I’m thinking
about adding some kind of logging now.
How can you confirm that ALL records in the table were indexed?
(especially since the table runs into millions of records)
If rebuild_index returns normally and no error is raised, I’d say it was
successful and indexed all your records. To make sure you have all 10
million documents in the index, you can inspect the index with a small
script like this:
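A minimal sketch of such a check (not the original script from this reply), assuming the ferret gem is installed and that the index lives under index/development/item, which is a guess at the acts_as_ferret layout and may differ in your setup. It reads the document count with Ferret's IndexReader and compares it with the row count; INDEX_PATH, index_doc_count and index_report are hypothetical names used only for illustration.

```ruby
# Sketch only: compare the table's row count with the number of documents
# in the Ferret index. INDEX_PATH is an assumption, adjust it to your setup.
INDEX_PATH = 'index/development/item'

# Number of live (non-deleted) documents in the Ferret index at +path+.
def index_doc_count(path)
  require 'ferret' # loaded lazily so the helper below works without the gem
  reader = Ferret::Index::IndexReader.new(path)
  reader.num_docs
ensure
  reader.close if reader
end

# Summarise how the index compares to the table.
def index_report(db_count, index_count)
  { :db       => db_count,
    :index    => index_count,
    :missing  => db_count - index_count,
    :complete => db_count == index_count }
end

# From script/console, something like:
#   p index_report(Item.count, index_doc_count(INDEX_PATH))
```

If :missing comes back non-zero, the rebuild probably stopped early or records changed while it was running, so it is worth running the check right after rebuild_index returns.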
Thanks, Jens! I will try your suggestion. It would be nice to have the
logging if you plan to add it, especially for builds that take a
loooong time.
Btw, is there any way to speed up the build process?
Thanks again…
-Jen
On Tue, Feb 27, 2007 at 06:19:00PM +0100, Jen wrote:
Btw is there any way to speed up the build process?
If you have enough RAM, you can increase the batch size used during
rebuilding (declared in class_methods.rb, look for batch_size); that
should result in fewer database calls.
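To get a feel for the effect, here is a rough back-of-the-envelope sketch, assuming the rebuild issues one query per batch. The batch sizes below are illustrative values only, not the plugin's actual default, and batch_queries is a hypothetical helper.

```ruby
# Rough number of database round trips for a full rebuild,
# assuming one SELECT per batch of rows.
def batch_queries(total_rows, batch_size)
  (total_rows + batch_size - 1) / batch_size # integer ceiling division
end

TOTAL_ROWS = 10_000_000 # table size mentioned in this thread

[1_000, 10_000, 50_000].each do |size|
  puts "batch_size #{size}: #{batch_queries(TOTAL_ROWS, size)} queries"
end
```

Larger batches mean fewer queries but more records held in memory at once, which is why this only helps if the RAM is there.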
You can also limit the number of fields you index by explicitly naming
the fields you need to search in your call to acts_as_ferret, if you
don’t do this already.
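As a model fragment, that would look roughly like the following. The field names title and description are made up for illustration; use whichever columns you actually search. It only works inside a Rails app with the acts_as_ferret plugin installed.

```ruby
# app/models/item.rb (fragment, needs Rails and acts_as_ferret)
class Item < ActiveRecord::Base
  # Index only the fields that are searched; :title and :description
  # are hypothetical column names.
  acts_as_ferret :fields => [:title, :description]
end
```

The index has to be rebuilt after changing the field list before the change takes effect.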
cheers,
Jens
–
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66 | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa