Testing w/ Xapian and other off-line indexing search engines


#1

I am running into problems testing with Xapian and Sphinx, both search
engines that update their indexes off-line (i.e., adding or changing an
indexed record puts the request into a queue, with the actual work done
in a cron job). To test, I can load the fixtures, build the index, and
successfully run tests that don’t add new records or modify the fixtures.
But I can’t figure out how to get the indexes updated between the DB
modification and the search. For Xapian I am doing the following from
within the test program:

rake xapian:update_index flush=true RAILS_ENV=test

The new record doesn’t make it into the index, though it does get added
to the acts_as_xapian table.
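
One way to run that task from inside a test is sketched below. The helper names `xapian_update_command` and `update_xapian_index!` are hypothetical; only the rake task and its arguments come from the post. The sketch only makes a failed rake run visible. It does not by itself make the new record reachable by the indexer.

```ruby
# Builds the argument list for the index update task shown above.
# Hypothetical helper: only the task name and arguments come from the post.
def xapian_update_command(rails_env = "test", flush = true)
  ["rake", "xapian:update_index", "flush=#{flush}", "RAILS_ENV=#{rails_env}"]
end

# Runs the task without a shell and raises if rake exits with a non-zero
# status, so a failed index update fails the test instead of passing silently.
def update_xapian_index!(rails_env = "test")
  command = xapian_update_command(rails_env)
  system(*command) or raise "index update failed: #{command.join(' ')}"
end
```

A test would call `update_xapian_index!` after modifying the database and before searching.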

TIA,
Jeffrey


#2

On 9 Dec 2008, at 21:33, Jeffrey L. Taylor wrote:

[snip]

but there is no such record actually in the database, just in the
ActiveRecord cache. This problem exists for all search engines that read
the database directly instead of going thru Rails.

Although the record isn’t committed, other connections to the database
can see it if they ask.
With Sphinx I get it to execute “SET SESSION TRANSACTION ISOLATION
LEVEL READ UNCOMMITTED”, and it can then see the uncommitted changes
from my tests.
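
The visibility rule described here can be illustrated with a toy model. This is not a real database and not how MySQL is implemented. `ToyDatabase` and its methods are invented for illustration, and the two-level `isolation` flag is a simplification of the real isolation levels.

```ruby
# Toy model: one shared committed table plus per-connection pending writes.
class ToyDatabase
  def initialize
    @committed = {} # id => row, visible to every connection
    @pending = {}   # connection => { id => row }, not yet committed
  end

  # Records a write inside the connection's open transaction.
  def insert(connection, id, row)
    (@pending[connection] ||= {})[id] = row
  end

  # Moves the connection's pending writes into the committed table.
  def commit(connection)
    @committed.merge!(@pending.delete(connection) || {})
  end

  # Discards the connection's pending writes.
  def rollback(connection)
    @pending.delete(connection)
  end

  # A reader always sees committed rows and its own pending rows. With
  # :read_uncommitted it also sees other connections' pending rows.
  def visible_ids(connection, isolation = :read_committed)
    rows = @committed.dup
    @pending.each do |owner, writes|
      rows.merge!(writes) if owner == connection || isolation == :read_uncommitted
    end
    rows.keys.sort
  end
end

db = ToyDatabase.new
db.insert(:test, 1, "new record")                    # test writes inside its open transaction
before = db.visible_ids(:indexer)                    # indexer at the stricter level
dirty  = db.visible_ids(:indexer, :read_uncommitted) # indexer after relaxing isolation
db.rollback(:test)                                   # the test framework rolls back afterwards
after  = db.visible_ids(:indexer, :read_uncommitted)
```

In this model the indexer sees the new row only while the test's transaction is still open and only when it reads with `:read_uncommitted`. After the rollback the row is gone for everyone.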

Fred


#3

Quoting Jeffrey L. Taylor removed_email_address@domain.invalid:

The new record doesn’t make it into the index, though it does get added to the
acts_as_xapian table.

I think I’ve found the answer. The key thing is that the test script is
adding a record at run time. ActiveRecord does not commit these records,
so that it can roll back the test database to just the fixtures after
each test. The record ID is being passed to Xapian thru the
acts_as_xapian table, but there is no such record actually in the
database, just in the ActiveRecord cache. This problem exists for all
search engines that read the database directly instead of going thru
Rails.

For the record,
Jeffrey


#4

On 9 Dec 2008, at 23:37, Jeffrey L. Taylor wrote:

[snip]

I’ve only used it with MySQL. The transaction isolation levels it
provides are the four defined by the ISO SQL standard (at least
they claim to be). Exactly how you tell the database what isolation
level you want might vary, I don’t know.
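
As a rough sketch of how the statement can differ between databases, the helper below builds the session-level statement for two adapters. The function name `isolation_statement` and the adapter strings are assumptions made for illustration. The MySQL form is the one quoted in this thread, and the PostgreSQL form is its `SET SESSION CHARACTERISTICS` syntax. Note that PostgreSQL accepts READ UNCOMMITTED but treats it as READ COMMITTED, so this trick would not expose uncommitted rows there.

```ruby
# The four isolation levels named by the ISO SQL standard.
ISO_ISOLATION_LEVELS = [
  "READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"
].freeze

# Hypothetical helper: returns the SQL that sets the isolation level for the
# current session. Accepts levels such as :read_uncommitted.
def isolation_statement(adapter, level)
  name = level.to_s.upcase.tr("_", " ")
  unless ISO_ISOLATION_LEVELS.include?(name)
    raise ArgumentError, "unknown isolation level: #{level}"
  end

  case adapter.to_s
  when "mysql"
    "SET SESSION TRANSACTION ISOLATION LEVEL #{name}"
  when "postgresql"
    "SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL #{name}"
  else
    raise ArgumentError, "no statement known for adapter: #{adapter}"
  end
end
```

The resulting string would be executed on the connection that the search engine's indexer uses, not on the test's own connection.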

Fred


#5

Quoting Frederick C. removed_email_address@domain.invalid:

On 9 Dec 2008, at 21:33, Jeffrey L. Taylor wrote:
[snip]

cache. This problem exists for all search engines that read the
database directly instead of going thru Rails.

Although the record isn’t committed, other connections to the database
can see it if they ask.
With Sphinx I get it to execute “SET SESSION TRANSACTION ISOLATION
LEVEL READ UNCOMMITTED” and it can then see the non committed changes
from my tests.

Thank you. Do you know if this is an SQL standard? From the MySQL
manual it looks like it is MySQL specific. For the moment I’ll just test
the Xapian searching against records in the fixtures. But this is good
to know if that turns out to be too great a restriction, i.e., if a bug
that goes undetected because of this bites me.

Jeffrey