Hello,
First the specs:
ruby 1.8.6, rails 1.2.3, ferret 0.11.4, mongrel 1.0.1, mongrel_cluster
0.2.1
And the model's aaf config:
# Ferret search engine
acts_as_ferret({:fields => {:name => {:boost => 10},
:summary => {:boost => 2},
:body => {:boost => 1, :store =>
:no},
:question => {:boost => 1, :store =>
:no},
:answer => {:boost => 1, :store =>
:no},
:status_id => {:boost => 1},
:category_id => {:boost => 1},
:published => {:boost => 1}},
:store_class_name => true,
:remote => true,
:ferret => { :analyzer => Ferret::Analysis::
StandardAnalyzer.new([]) }
} )
I've been wrestling with a very strange intermittent problem for a few
days
now. I'm running 3 mongrels behind an Apache 2.2 proxy balancer, and
the
aaf DRb server in production mode. I've been manually load testing a
search
where the query string contains a string, and also filter by fields
category_id and status_id. I change the category or status and
re-submit
the search several times.
After some random number of searches, if I send a search that should
return
some number of results, it will return no results on 1 or 2 of the
mongrels,
but the correct results on the remaining mongrel(s). So it's like this:
refresh - no results, refresh - no results, refresh - 2 results, over &
over
in that pattern. Once this state has occurred, if I restart the DRb
server,
all will be well until I rapidly submit several more queries enough
times to
make this happen again.
I have tested this both on our RedHat-based staging server and my local
OSX
dev machine with the same exact config, and I get the same results.
Through
logging I have confirmed that the same query is received by all the
mongrels. To factor out browser-related issues, I've also tested by
hitting
the web server with a little looping ruby script, and it also breaks
that
way.
I'm not sure where to debug from here -- can anyone point me in the
right
direction? Since a restart of the DRb server fixes the problem for a
bit,
I'm thinking it may be aaf related, but I don't know how to isolate this
any
further so as to determine the exact cause.
Thanks,
Doug
on 02.07.2007 20:14
on 03.07.2007 09:23
On Mon, Jul 02, 2007 at 11:13:50AM -0700, Doug Smith wrote: [..] > > After some random number of searches, if I send a search that should return > some number of results, it will return no results on 1 or 2 of the mongrels, > but the correct results on the remaining mongrel(s). So it's like this: > refresh - no results, refresh - no results, refresh - 2 results, over & over > in that pattern. Once this state has occurred, if I restart the DRb server, > all will be well until I rapidly submit several more queries enough times to > make this happen again. [..] > > I'm not sure where to debug from here -- can anyone point me in the right > direction? Since a restart of the DRb server fixes the problem for a bit, > I'm thinking it may be aaf related, but I don't know how to isolate this any > further so as to determine the exact cause. Strange. The next step would be to find out if the query reaches the DRb server all the time, or if it gets lost somewhere in between. You should see the DRb method calls in ferret_server.log . Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer@webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa
on 03.07.2007 18:00
Hi Jens, I really appreciate your excellent support of this plugin. On 7/3/07, Jens Kraemer <kraemer@webit.de> wrote: > [...] > Strange. The next step would be to find out if the query reaches the DRb > server all the time, or if it gets lost somewhere in between. You should > see the DRb method calls in ferret_server.log . Yes, the query does always reach the DRb server. Here is a portion of the ferret_server.log with my comments prefixed with ###DAS: ###DAS: Works fine #method_missing(:find_id_by_contents, ["Article", "a* category_id:17 status_id:1", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "a* category_id:17 status_id:1", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "a* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "a* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "a* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "a* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:100", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:100", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:100", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:100", {:limit=>10, :offset=>0}]) ### DAS: This query returns no results, correctly #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:101", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17 status_id:101", {:limit=>10, :offset=>0}]) ### DAS: Begins cycle -- first query returns 2 results correctly, next two return nothing (3 mongrels) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) #method_missing(:find_id_by_contents, ["Article", "* category_id:17", {:limit=>10, :offset=>0}]) It does seem to only fail when doing a complete wildcard search, filtered by a field, after switching back & forth on another field a time or two. Very weird. Thanks again, Doug
on 03.07.2007 18:21
On Tue, Jul 03, 2007 at 08:59:45AM -0700, Doug Smith wrote: > Hi Jens, > > I really appreciate your excellent support of this plugin. :-) [..] > It does seem to only fail when doing a complete wildcard search, > filtered by a field, after switching back & forth on another field a > time or two. You could try that failing sequence of queries without the DRb server, and/or with only one mongrel instance or from the console, or even with a plain Ferret::I instance to find out if aaf's DRb stuff, aaf itself, or Ferret itself cause that, and if it's multi-threading-related. > Very weird. Indeed. Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer@webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa
on 03.07.2007 20:03
On 7/3/07, Jens Kraemer <kraemer@webit.de> wrote: > On Tue, Jul 03, 2007 at 08:59:45AM -0700, Doug Smith wrote: > > It does seem to only fail when doing a complete wildcard search, > > filtered by a field, after switching back & forth on another field a > > time or two. > > You could try that failing sequence of queries without the DRb server, > and/or with only one mongrel instance or from the console, or even with > a plain Ferret::I instance to find out if aaf's DRb stuff, aaf itself, > or Ferret itself cause that, and if it's multi-threading-related. Ok, I wrote a test script, and between every test I stopped apache, mongrel, ferret, and deleted the indexes to make sure each one was clean. Here are the results: With DRb, 3 mongrels => fails With DRb, 2 mongrels => fails With DRb, 1 mongrel => passes Without DRb, 3 mongrels => passes So, it's the combination of DRb and multiple mongrels that always fails. Let me know how else I can test to help debug this. Thanks, Doug
on 03.07.2007 20:35
On 7/3/07, Doug Smith <doug.arogos@gmail.com> wrote: > Ok, I wrote a test script, and between every test I stopped apache, > mongrel, ferret, and deleted the indexes to make sure each one was > clean. Here are the results: > > With DRb, 3 mongrels => fails > With DRb, 2 mongrels => fails > With DRb, 1 mongrel => passes > Without DRb, 3 mongrels => passes And one more factor: If I change this line in my controller from this: results = Article.find_by_contents(q, options) to this: results = Article.multi_search(q, [], options) It never fails, always passes under all conditions. The reason I found this is that a previous project that is in production won't fail, but it uses multi_search instead of find_by_contents like I'm using now. Thanks, Doug
on 04.07.2007 09:42
On Tue, Jul 03, 2007 at 11:34:47AM -0700, Doug Smith wrote: > And one more factor: > > If I change this line in my controller from this: > > results = Article.find_by_contents(q, options) > > to this: > > results = Article.multi_search(q, [], options) > > It never fails, always passes under all conditions. ok, I guess now it's up to me to find the bug ;-) Using multi_search as a workaround should have no side effects. I filed a ticket so I won't forget - http://projects.jkraemer.net/acts_as_ferret/ticket/151 Jens -- Jens Krämer webit! Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer@webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa
on 20.09.2007 03:41
I am experiencing the same problem. Seems like some sort of caching race condition. Thanks for looking into it. My query looks like this: +listing_type:sub +(+lexo_geo_lon:[05760173 05795933] +lexo_geo_lat:[13751926 13774212]) +lexo_price:[000000000 000000650] No wildcards or even terms in this search It's very hard to reproduce if the query changes slightly, which makes me think it is a caching issue.
on 20.09.2007 03:46
Side effect of multi_search is that :store_class_names must be set, which may require re-indexing.