RDig and AAF playing together

I have a site with two indexes. Index A is created offline by RDig
and queried from the web via RDig (specifically,
RDig.searcher.search). Index B is managed by AAF (acts_as_ferret) with
:remote => true. Simple enough. However, I need to query both indexes
from RDig. Usually this works, since I modified RDig to accept an array
of search_paths with one element each for index A and index B.

However, when Index B is updated by AAF, RDig.searcher.search will
not “see” the changes to Index B until I restart Mongrel (or restart
script/console). If I query Index B directly through
ClassB.find_by_contents("myfield:my_value"), I see the updated results
immediately with no restart.

I know that RDig creates a single IndexReader per class. Does the
IndexReader cache the segments files in memory?

Does anyone have any ideas?

Thanks in advance for your help!

Erik

On Sun, Jul 29, 2007 at 02:34:26PM -0400, Erik M. wrote:

ClassB.find_by_contents("myfield:my_value") I see the updated results
immediately with no restart.

I know that RDig creates a single IndexReader per class. Does the
IndexReader cache the segments files in memory?

Yes.

Does anyone have any ideas?

You can check if your reader still ‘sees’ the most recent version of the
index with the latest? method, and re-open it accordingly. You might
have to hack RDig a bit to allow opening a new reader, but this
shouldn’t be too hard.
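
A minimal sketch of that check-and-reopen idea, not RDig's actual code. ReopeningSearcher is a hypothetical wrapper, and the block passed to it stands in for however RDig builds its Ferret searcher. The only calls assumed on the searcher are reader.latest?, close and search, as used elsewhere in this thread.

```ruby
# Hypothetical wrapper; not part of RDig. The block given to +new+ is
# assumed to return an object that responds to +reader+, +search+ and
# +close+, the way a Ferret searcher does.
class ReopeningSearcher
  def initialize(&open_searcher)
    @open_searcher = open_searcher
    @searcher = nil
  end

  # Returns a searcher whose reader is current, reopening it if the
  # reader no longer sees the latest version of the index.
  def searcher
    if @searcher && !@searcher.reader.latest?
      @searcher.close
      @searcher = nil
    end
    @searcher ||= @open_searcher.call
  end

  # Delegates to the (possibly freshly opened) searcher.
  def search(*args)
    searcher.search(*args)
  end
end
```

The staleness check runs on every search, so a long-running Mongrel process would pick up additions and deletions without a restart.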

Cheers,
Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

Thanks Jens, that makes sense. I started with the following addition
to RDig::Searcher

   # Returns <tt>true</tt> if RDig's IndexReader has the latest
   # index loaded. False otherwise.
   def latest?
     @ferret_searcher.reader.latest?
   end

I fired up two script/console instances. In the first I called
ClassB.rebuild_index, and in the second console I called
RDig.searcher.latest? and received the following seg fault.

RDig.searcher.latest?
./script/…/config/…/config/…/vendor/gems/rdig-0.3.4/lib/rdig/
search.rb:36: [BUG] Bus Error
ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]

Did I break a Ferret rule of some kind by having a reader looking at
the version of an index that is being rebuilt?

Thanks again.

Erik

On Mon, Jul 30, 2007 at 08:25:59AM -0400, Erik M. wrote:

ClassB.rebuild_index, and in the second console I called
RDig.searcher.latest? and received the following seg fault.

RDig.searcher.latest?
./script/…/config/…/config/…/vendor/gems/rdig-0.3.4/lib/rdig/
search.rb:36: [BUG] Bus Error
ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]

Did I break a Ferret rule of some kind by having a reader looking at
the version of an index that is being rebuilt?

Yes.
An index rebuild begins with deleting the old index, which will
cause index readers that were opened on the now removed index to fail
this way. So latest? is only good to detect additions/deletions of
documents.

Jens



On Mon, Jul 30, 2007 at 09:18:33AM -0400, Erik M. wrote:

lines out myself :-/

On linux I get the following:

RDig.searcher.ferret_searcher.reader.latest?
(irb):5: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i386-linux]

Ah yes :-)

If your reader looks at two sub-readers for different indexes (as it
seems to do, if I got your first mail right) you’ll have to call latest?
on each of the sub-readers to get around this. I do the same in
acts_as_ferret’s MultiIndex class.

Jens



If I create an IndexReader like so:

ir = IndexReader.new([index1, index2])

How can I get the “sub readers” for the two indexes? From the RDocs I
only see the ability to call ir.latest?, which results in the segfault.

Thanks again.

Erik

It’s strange, I’m actually getting the Bus Error anytime I call
latest? on RDig’s index reader. The index is no longer being rebuilt.
It’s interesting because the following lines were commented out of my
version of RDig:
   # if @ferret_searcher and !@ferret_searcher.reader.latest?
   #   # reopen searcher
   #   @ferret_searcher.close
   #   @ferret_searcher = nil
   # end
So this has obviously happened before. I must have commented these
lines out myself :-/

On linux I get the following:

RDig.searcher.ferret_searcher.reader.latest?
(irb):5: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i386-linux]

Aborted

Erik

On Wed, Aug 01, 2007 at 08:45:26AM -0400, Erik M. wrote:

If I create an IndexReader like so:

ir = IndexReader.new([index1, index2])

How can I get the “sub readers” for the two indexes? From the RDocs I
only see the ability to call ir.latest?, which results in the segfault.

First create two separate readers for your indexes:
reader1 = IndexReader.new(index1)
reader2 = IndexReader.new(index2)

Then build your joint reader from them:
ir = IndexReader.new([reader1, reader2])

Now you can easily use
reader1.latest? && reader2.latest?

to determine if your ir instance needs some refreshing.
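
The steps above could be wrapped up roughly as follows. This is only a sketch: JointReader is a hypothetical helper, and the open_reader and join_readers callables are assumptions standing in for the IndexReader.new calls shown above, so the class itself does not depend on Ferret.

```ruby
# Hypothetical helper; not part of RDig or Ferret. +open_reader+ is
# assumed to turn one index path into a reader, and +join_readers+ to
# combine several readers into one (with Ferret, presumably
# IndexReader.new in both cases).
class JointReader
  def initialize(paths, open_reader, join_readers)
    @paths = paths
    @open_reader = open_reader
    @join_readers = join_readers
    reopen
  end

  # True only if every sub-reader still sees the newest version of its
  # index. latest? is never called on the joint reader itself.
  def latest?
    @sub_readers.all? { |r| r.latest? }
  end

  # Returns the joint reader, rebuilding it if any sub-reader is stale.
  def reader
    reopen unless latest?
    @joint
  end

  private

  # Opens one reader per path and joins them. Closing the old readers
  # is left out here because the thread does not say who owns them.
  def reopen
    @sub_readers = @paths.map { |p| @open_reader.call(p) }
    @joint = @join_readers.call(@sub_readers)
  end
end
```

As noted earlier in the thread, this only covers additions and deletions; a reader opened on an index that has since been rebuilt may still crash when latest? is called.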

Jens

