Forum: Ferret RDig and AAF playing together

Erik Morton (Guest)
on 2007-07-29 20:35
(Received via mailing list)
I have a site with two indexes. Index A is created offline by RDig
and queried from the web via RDig (specifically,
RDig.searcher.search). Index B is managed by AAF with :remote =>
true. Simple enough. However, I need to query both indexes from RDig.
Usually this is ok, as I modified RDig to accept an array of
search_paths with an element for index A and index B.

However, when Index B is updated by AAF, RDig.searcher.search will
not "see" the changes to Index B until I restart Mongrel (or restart
script/console). If I query Index B directly through
ClassB.find_by_contents("myfield:my_value") I see the updated results
immediately with no restart.

I know that RDig creates a single IndexReader for class. Does the
IndexReader cache the segments files in memory?

Does anyone have any ideas?

Thanks in advance for your help!

Erik
Jens Kraemer (Guest)
on 2007-07-30 09:38
(Received via mailing list)
On Sun, Jul 29, 2007 at 02:34:26PM -0400, Erik Morton wrote:
> ClassB.find_by_contents("myfield:my_value") I see the updated results
> immediately with no restart.
>
> I know that RDig creates a single IndexReader for class. Does the
> IndexReader cache the segments files in memory?

Yes.

> Does anyone have any ideas?

You can check if your reader still 'sees' the most recent version of the
index with the latest? method, and re-open it accordingly. You might
have to hack RDig a bit to allow opening a new reader, but this
shouldn't be too hard.

Cheers,
Jens


--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer@webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
Erik Morton (Guest)
on 2007-07-30 14:26
(Received via mailing list)
Thanks Jens, that makes sense. I started with the following addition
to RDig::Searcher

       # Returns <tt>true</tt> if RDig's IndexReader has the latest
       # index loaded. False otherwise.
       def latest?
         @ferret_searcher.reader.latest?
       end

I fired up two script/console instances. In the first I called
ClassB.rebuild_index, and in the second console I called
RDig.searcher.latest? and received the following seg fault.

 >> RDig.searcher.latest?
./script/../config/../config/../vendor/gems/rdig-0.3.4/lib/rdig/
search.rb:36: [BUG] Bus Error
ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]

Did I break a Ferret rule of some kind by having a reader looking at
the version of an index that is being rebuilt?

Thanks again.

Erik
Jens Kraemer (Guest)
on 2007-07-30 14:58
(Received via mailing list)
On Mon, Jul 30, 2007 at 08:25:59AM -0400, Erik Morton wrote:
> ClassB.rebuild_index, and in the second console I called
> RDig.searcher.latest? and received the following seg fault.
>
>  >> RDig.searcher.latest?
> ./script/../config/../config/../vendor/gems/rdig-0.3.4/lib/rdig/
> search.rb:36: [BUG] Bus Error
> ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
>
> Did I break a Ferret rule of some kind by having a reader looking at
> the version of an index that is being rebuilt?

Yes.
An index rebuild begins with deleting the old index, which will
cause index readers that were opened on the now removed index to fail
this way. So latest? is only good to detect additions/deletions of
documents.

Jens

Erik Morton (Guest)
on 2007-07-30 15:19
(Received via mailing list)
It's strange, I'm actually getting the Bus Error anytime I call
latest? on RDig's index reader. The index is no longer being rebuilt.
It's interesting because the following lines were commented out of my
version of RDig:
         # if @ferret_searcher and !@ferret_searcher.reader.latest?
         #   # reopen searcher
         #   @ferret_searcher.close
         #   @ferret_searcher = nil
         # end
So this has obviously happened before. I must have commented these
lines out myself :-/

On linux I get the following:
 >> RDig.searcher.ferret_searcher.reader.latest?
(irb):5: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i386-linux]

Aborted


Erik
Jens Kraemer (Guest)
on 2007-07-30 15:36
(Received via mailing list)
On Mon, Jul 30, 2007 at 09:18:33AM -0400, Erik Morton wrote:
> lines out myself :-/
>
> On linux I get the following:
>  >> RDig.searcher.ferret_searcher.reader.latest?
> (irb):5: [BUG] Segmentation fault
> ruby 1.8.4 (2005-12-24) [i386-linux]

Ah yes :-)

If your reader looks at two sub-readers for different indexes (as it
seems to do, if I got your first mail right), you'll have to call latest?
on each of the sub-readers to get around this. I do the same in
acts_as_ferret's MultiIndex class.

Jens

Erik Morton (Guest)
on 2007-08-01 14:46
(Received via mailing list)
If I create an IndexReader like so:

ir = IndexReader.new([index1, index2])

How can I get the "sub readers" for the two indexes? From the RDocs I
only see the ability to call ir.latest?, which results in the segfault.

Thanks again.

Erik
Jens Kraemer (Guest)
on 2007-08-01 15:10
(Received via mailing list)
On Wed, Aug 01, 2007 at 08:45:26AM -0400, Erik Morton wrote:
> If I create an IndexReader like so:
>
> ir = IndexReader.new([index1, index2])
>
> How can I get the "sub readers" for the two indexes? From the RDocs I
> only see the ability to call ir.latest?, which results in the segfault.

First create two separate readers for your indexes:
reader1 = IndexReader.new(index1)
reader2 = IndexReader.new(index2)

Then build your joint reader from them:
ir = IndexReader.new([reader1, reader2])

Now you can easily use
reader1.latest? && reader2.latest?

to determine if your ir instance needs some refreshing.
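
A minimal sketch of the steps above, not code taken from RDig or acts_as_ferret. The class name RefreshingReader and its opener argument are hypothetical; the opener stands in for IndexReader.new so that the sketch does not depend on Ferret being installed. Closing the stale readers is left out, because the thread does not say whether closing the joint reader also closes its sub-readers. As noted earlier in the thread, none of this protects a reader against a full index rebuild.

```ruby
# Hypothetical helper; not part of RDig, acts_as_ferret, or Ferret.
class RefreshingReader
  # paths:  one entry per index (e.g. the index directories)
  # opener: callable that returns a reader for a single path and a joint
  #         reader for an array of readers. With Ferret this is assumed to be
  #         lambda { |arg| Ferret::Index::IndexReader.new(arg) }
  def initialize(paths, opener)
    @paths = paths
    @opener = opener
    @sub_readers = []
    @joint = nil
  end

  # Returns the joint reader, reopening all readers first if any
  # sub-reader no longer sees the latest version of its index.
  def reader
    reopen unless @joint && @sub_readers.all? { |r| r.latest? }
    @joint
  end

  private

  # Opens one reader per index, then builds the joint reader from them.
  # latest? is only ever called on the sub-readers, never on the joint one.
  def reopen
    @sub_readers = @paths.map { |path| @opener.call(path) }
    @joint = @opener.call(@sub_readers)
  end
end
```

With Ferret loaded, usage would presumably be along the lines of RefreshingReader.new([index1, index2], lambda { |arg| Ferret::Index::IndexReader.new(arg) }).reader, called each time a search needs a reader.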


Jens

