My rebuild is still ending prematurely with only half of the data
indexed.
ok, so what exactly do you do to rebuild your index, and what searches
do you run to check for completeness of your indexes ?
ruby script/console production
Person.rebuild_index
Organisation.rebuild_index
Document.rebuild_index
Then I try a find_by_contents on some of the people (using any of the
fields, e.g. surname). I can find people up to about two thirds of the
way through the data (in id order) but the final third aren’t found.
If I edit, say, the last record (which previously could not be found
using a search), I can see in the log that the edited data is added
to the index, and when I then search for it, it is found.
So for some reason it looks as though a large part of the data isn’t
being added to the index.
On Wed, Nov 22, 2006 at 12:10:45AM +0100, Matthew Planchant wrote:
Then I try a find_by_contents on some of the people (using any of the
fields i.e. surname). I can find people up to about two thirds of the
way through the data (in id order) but the final third aren’t found.
ok, to keep things simple please keep trying with only the Person class
for now. what does the log look like if you do Person.rebuild_index ?
What happens if you do the same (with the same data) in development
mode (maybe on another machine) ?
cheers,
Jens
--
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
OK. Here is what is happening. When the indexing starts it selects the
first 1000 records to add to the index. These seem to be added to the
index. When it has added the 1000th record another select appears in the
log file to get the rest of records (There are 1561 records in the
table):
SELECT * FROM (SELECT TOP 561 * FROM (SELECT TOP 1561 * FROM persons) AS
tmp1 ) AS tmp2e
However this select doesn’t get the records from 1001 to 1561. It get 1
to 1000. So these first 500 or so records are added to the index twice
but the final 500 are never added.
On Wed, Nov 22, 2006 at 12:10:45AM +0100, Matthew Planchant wrote:
Then I try a find_by_contents on some of the people (using any of the
fields i.e. surname). I can find people up to about two thirds of the
way through the data (in id order) but the final third aren’t found.
ok, to keep things simple please keep trying with only the Person class
for now. what does the log look like if you do Person.rebuild_index ?
Good idea. I’ll give this a go. I’ll try:
class Person < ActiveRecord::Base
  acts_as_ferret
end
What happens if you do the same (with the same data) in development
mode (maybe on another machine) ?
However this select doesn’t get the records from 1001 to 1561. It get 1
to 1000. So these first 500 or so records are added to the index twice
but the final 500 are never added.
Small mistake here. It should read:
However this select doesn’t get the records from 1001 to 1561. It gets 1
to 561. So these first 500 or so records are added to the index twice
but the final 500 are never added.
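To illustrate why the logged query returns the wrong rows, here is a minimal sketch in plain Ruby that mimics the two nested TOP clauses on an in-memory array. It assumes the rows come back in id order, and the "intended" variant (reversing the order between the two TOPs) is only my guess at what the emulation is meant to do, not something taken from the adapter source.

```ruby
# Stand-in for the persons table: 1561 rows, ids 1..1561, in id order.
rows = (1..1561).to_a

limit  = 561   # size of the second batch
offset = 1000  # rows already handled by the first batch

# What the logged SQL does: TOP (offset + limit), then TOP limit,
# with nothing in between to flip the order.
logged = rows.first(offset + limit).first(limit)

# Assumed intent: reverse the inner result so that TOP limit takes the tail,
# then reverse again to restore id order.
intended = rows.first(offset + limit).reverse.first(limit).reverse

puts "logged:   #{logged.first}..#{logged.last}"
puts "intended: #{intended.first}..#{intended.last}"
```

The first line printed covers ids 1 to 561, which matches what I see being re-indexed. The second covers 1001 to 1561, which is what the second batch should have contained.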
I have a model with only ~350 records, and this seems to have been added as
it should have been. I can find all the records from searching. The
problem seems to occur when there are more than 1000 records (the
batch_size in rebuild_index).
On Thu, Nov 23, 2006 at 11:55:12AM +0100, Matthew Planchant wrote:
I have a model with only ~350 records, and this seems to have been added as
it should have been. I can find all the records from searching. The
problem seems to occur when there are more than 1000 records (the
batch_size in rebuild_index).
so does setting the batch size to a higher value, say 10000, work for
you ?
Jens
On Thu, Nov 23, 2006 at 11:28:45AM +0100, Matthew Planchant wrote:
OK. Here is what is happening. When the indexing starts it selects the
first 1000 records to add to the index. These seem to be added to the
index. When it has added the 1000th record another select appears in the
log file to get the rest of records (There are 1561 records in the
table):
SELECT * FROM (SELECT TOP 561 * FROM (SELECT TOP 1561 * FROM persons) AS
tmp1 ) AS tmp2e
ehm, what kind of database is this ? looks really strange
However this select doesn’t get the records from 1001 to 1561. It get 1
to 1000. So these first 500 or so records are added to the index twice
but the final 500 are never added.
is it possible the :limit and :offset options of ActiveRecord are not
supported or buggy for your kind of database ?
what do you get when calling
Person.find(:all, :limit => 1000, :offset => 1000)
on the console ?
Jens
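One way to read the result of that console call is to compare the ids of the first page with the ids of the second. The helper below is a hypothetical sketch in plain Ruby (pages_overlap? is not part of ActiveRecord or acts_as_ferret); the id lists would come from the two find calls shown in the comments.

```ruby
# Hypothetical helper: true if the second page repeats any id from the first.
def pages_overlap?(first_page_ids, second_page_ids)
  !(first_page_ids & second_page_ids).empty?
end

# On the console the id lists could be collected like this (not run here):
#   first  = Person.find(:all, :limit => 1000).map { |p| p.id }
#   second = Person.find(:all, :limit => 1000, :offset => 1000).map { |p| p.id }

# Stand-in data: a second page that wrongly starts again at id 1.
puts pages_overlap?((1..1000).to_a, (1..561).to_a)      # => true
# Stand-in data: a second page that continues where the first ended.
puts pages_overlap?((1..1000).to_a, (1001..1561).to_a)  # => false
```

If the check returns true for the real data, :offset is not being honoured for this database.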
You mean if I reverse the patch and go back to not using batches? I
don’t know, I haven’t tried that yet, but I assume that it will, as it’s
the SQL generated to create the batches which doesn’t work with
MS SQL Server.
For the moment I’ve set the batch size to 5000 (i.e. greater than the
number of records I have in any of my models) and it works.
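For anyone wondering why the larger batch size helps, here is a rough sketch of the arithmetic. It assumes rebuild_index walks the table in (offset, limit) steps of batch_size; the batches helper is hypothetical and not the plugin's actual code.

```ruby
# Hypothetical helper: the (offset, limit) pairs a batched rebuild would
# request for a table with `total` rows.
def batches(total, batch_size)
  (0...total).step(batch_size).map do |offset|
    [offset, [batch_size, total - offset].min]
  end
end

p batches(1561, 1000)  # => [[0, 1000], [1000, 561]]
p batches(1561, 5000)  # => [[0, 1561]]
```

With a batch size of 1000 the second request needs an offset of 1000, which is where the generated SQL goes wrong. With 5000 there is only one request, at offset 0, so the offset handling is never exercised.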