Strange intermittent no search results problem

Hello,

First the specs:
ruby 1.8.6, rails 1.2.3, ferret 0.11.4, mongrel 1.0.1, mongrel_cluster
0.2.1

And the model’s aaf config:

Ferret search engine

acts_as_ferret({:fields => {:name => {:boost => 10},
:summary => {:boost => 2},
:body => {:boost => 1, :store =>
:no},
:question => {:boost => 1, :store =>
:no},
:answer => {:boost => 1, :store =>
:no},
:status_id => {:boost => 1},
:category_id => {:boost => 1},
:published => {:boost => 1}},
:store_class_name => true,
:remote => true,
:ferret => { :analyzer => Ferret::Analysis::
StandardAnalyzer.new([]) }
} )

I’ve been wrestling with a very strange intermittent problem for a few
days
now. I’m running 3 mongrels behind an Apache 2.2 proxy balancer, and
the
aaf DRb server in production mode. I’ve been manually load testing a
search
where the query string contains a string, and also filter by fields
category_id and status_id. I change the category or status and
re-submit
the search several times.

After some random number of searches, if I send a search that should
return
some number of results, it will return no results on 1 or 2 of the
mongrels,
but the correct results on the remaining mongrel(s). So it’s like this:
refresh - no results, refresh - no results, refresh - 2 results, over &
over
in that pattern. Once this state has occurred, if I restart the DRb
server,
all will be well until I rapidly submit several more queries enough
times to
make this happen again.

I have tested this both on our RedHat-based staging server and my local
OSX
dev machine with the same exact config, and I get the same results.
Through
logging I have confirmed that the same query is received by all the
mongrels. To factor out browser-related issues, I’ve also tested by
hitting
the web server with a little looping ruby script, and it also breaks
that
way.

I’m not sure where to debug from here – can anyone point me in the
right
direction? Since a restart of the DRb server fixes the problem for a
bit,
I’m thinking it may be aaf related, but I don’t know how to isolate this
any
further so as to determine the exact cause.

Thanks,

Doug

On Mon, Jul 02, 2007 at 11:13:50AM -0700, Doug S. wrote:
[…]

After some random number of searches, if I send a search that should return
some number of results, it will return no results on 1 or 2 of the mongrels,
but the correct results on the remaining mongrel(s). So it’s like this:
refresh - no results, refresh - no results, refresh - 2 results, over & over
in that pattern. Once this state has occurred, if I restart the DRb server,
all will be well until I rapidly submit several more queries enough times to
make this happen again.
[…]

I’m not sure where to debug from here – can anyone point me in the right
direction? Since a restart of the DRb server fixes the problem for a bit,
I’m thinking it may be aaf related, but I don’t know how to isolate this any
further so as to determine the exact cause.

Strange. The next step would be to find out if the query reaches the DRb
server all the time, or if it gets lost somewhere in between. You should
see the DRb method calls in ferret_server.log .

Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

Hi Jens,

I really appreciate your excellent support of this plugin.

On 7/3/07, Jens K. [email protected] wrote:

[…]
Strange. The next step would be to find out if the query reaches the DRb
server all the time, or if it gets lost somewhere in between. You should
see the DRb method calls in ferret_server.log .

Yes, the query does always reach the DRb server. Here is a portion of
the ferret_server.log with my comments prefixed with ###DAS:

###DAS: Works fine
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17
status_id:1”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17
status_id:1”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “a* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:100”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:100”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:100”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:100”, {:limit=>10, :offset=>0}])

DAS: This query returns no results, correctly

#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:101”, {:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17
status_id:101”, {:limit=>10, :offset=>0}])

DAS: Begins cycle – first query returns 2 results correctly, next

two return nothing (3 mongrels)
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])
#method_missing(:find_id_by_contents, [“Article”, “* category_id:17”,
{:limit=>10, :offset=>0}])

It does seem to only fail when doing a complete wildcard search,
filtered by a field, after switching back & forth on another field a
time or two.

Very weird.

Thanks again,

Doug

On Tue, Jul 03, 2007 at 08:59:45AM -0700, Doug S. wrote:

Hi Jens,

I really appreciate your excellent support of this plugin.

:slight_smile:

[…]

It does seem to only fail when doing a complete wildcard search,
filtered by a field, after switching back & forth on another field a
time or two.

You could try that failing sequence of queries without the DRb server,
and/or with only one mongrel instance or from the console, or even with
a plain Ferret::I instance to find out if aaf’s DRb stuff, aaf itself,
or Ferret itself cause that, and if it’s multi-threading-related.

Very weird.

Indeed.

Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

On 7/3/07, Doug S. [email protected] wrote:

Ok, I wrote a test script, and between every test I stopped apache,
mongrel, ferret, and deleted the indexes to make sure each one was
clean. Here are the results:

With DRb, 3 mongrels => fails
With DRb, 2 mongrels => fails
With DRb, 1 mongrel => passes
Without DRb, 3 mongrels => passes

And one more factor:

If I change this line in my controller from this:

  results = Article.find_by_contents(q, options)

to this:

  results = Article.multi_search(q, [], options)

It never fails, always passes under all conditions.

The reason I found this is that a previous project that is in
production won’t fail, but it uses multi_search instead of
find_by_contents like I’m using now.

Thanks,

Doug

On 7/3/07, Jens K. [email protected] wrote:

On Tue, Jul 03, 2007 at 08:59:45AM -0700, Doug S. wrote:

It does seem to only fail when doing a complete wildcard search,
filtered by a field, after switching back & forth on another field a
time or two.

You could try that failing sequence of queries without the DRb server,
and/or with only one mongrel instance or from the console, or even with
a plain Ferret::I instance to find out if aaf’s DRb stuff, aaf itself,
or Ferret itself cause that, and if it’s multi-threading-related.

Ok, I wrote a test script, and between every test I stopped apache,
mongrel, ferret, and deleted the indexes to make sure each one was
clean. Here are the results:

With DRb, 3 mongrels => fails
With DRb, 2 mongrels => fails
With DRb, 1 mongrel => passes
Without DRb, 3 mongrels => passes

So, it’s the combination of DRb and multiple mongrels that always
fails. Let me know how else I can test to help debug this.

Thanks,

Doug

I am experiencing the same problem.

Seems like some sort of caching race condition.

Thanks for looking into it.

My query looks like this:

+listing_type:sub +(+lexo_geo_lon:[05760173 05795933]
+lexo_geo_lat:[13751926 13774212]) +lexo_price:[000000000 000000650]

No wildcards or even terms in this search

It’s very hard to reproduce if the query changes slightly, which makes
me think it is a caching issue.

Side effect of multi_search is that :store_class_names must be set,
which may require re-indexing.

On Tue, Jul 03, 2007 at 11:34:47AM -0700, Doug S. wrote:

And one more factor:

If I change this line in my controller from this:

  results = Article.find_by_contents(q, options)

to this:

  results = Article.multi_search(q, [], options)

It never fails, always passes under all conditions.

ok, I guess now it’s up to me to find the bug :wink:
Using multi_search as a workaround should have no side effects.
I filed a ticket so I won’t forget -
http://projects.jkraemer.net/acts_as_ferret/ticket/151

Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa