[AAF] remote indexing via DRb with acts_as_ferret

Hi!

Aaf trunk has undergone several major refactorings over the last few
days, with the result that you can now transparently switch your app
from local to remote indexing and back :-)

If you plan to scale your app to more than one physical machine, or
if you have problems with corrupted indexes and the like under high
load, you really should give this a try.

I wrote some documentation to get you started with the remote indexing
stuff at
http://projects.jkraemer.net/acts_as_ferret/wiki/DrbServer

Looking forward to your feedback,
Jens

–
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
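
For readers new to DRb, here is a minimal sketch of the general idea behind remote indexing: one process owns the index and everything else talks to it through a DRb proxy. This is not acts_as_ferret's server code. `ToyIndex` and its `add`/`search` methods are made up for illustration, and the real setup is described on the wiki page linked above.

```ruby
require 'drb/drb'

# Toy stand-in for a search index (hypothetical, NOT the aaf server class).
class ToyIndex
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  # Add or replace a document; only the process owning this object writes.
  def add(id, text)
    @lock.synchronize { @docs[id] = text.downcase }
    true
  end

  # Return the sorted ids of all documents containing the term.
  def search(term)
    @lock.synchronize do
      @docs.select { |_id, text| text.include?(term.downcase) }.keys.sort
    end
  end
end

# Server side: port 0 asks the OS for any free port.
server = DRb::DRbServer.new('druby://127.0.0.1:0', ToyIndex.new)

# Client side: a proxy object addressed by URI. In a real deployment the
# client lives in another process (or machine); inside a single process
# DRb may dispatch the call directly instead of using the socket.
remote_index = DRbObject.new_with_uri(server.uri)
remote_index.add(1, 'Remote indexing with DRb')
remote_index.add(2, 'Local indexing')
hits = remote_index.search('remote')

server.stop_service
```

The point of the pattern is that all writes go through a single process, which is presumably why it helps with index corruption under concurrent load.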

On 2/4/07, Jens K. [email protected] wrote:

I wrote some documentation to get you started with the remote indexing
stuff at
http://projects.jkraemer.net/acts_as_ferret/wiki/DrbServer

FWIW, I’m running this in production with about 5 updates/sec and 20-30
searches/second without problems.

Awesome!

-ryan

On Sat, Feb 24, 2007 at 01:09:42AM -0800, Ryan K. wrote:

FWIW, I’m running this in production with about 5 updates/sec and 20-30
searches/second without problems.

cool :-)
what kind of app is this?

cheers,
Jens

–
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

Hi, thanks for the great work! Does this work with Rails 1.1.6 as I
haven’t made the switch to 1.2.1 yet?

On Sun, Mar 04, 2007 at 08:59:26AM +0100, donut donut wrote:

Hi, thanks for the great work! Does this work with Rails 1.1.6 as I
haven’t made the switch to 1.2.1 yet?

I didn’t test with 1.1.6 but it should work.

Jens

On 2/24/07, Jens K. [email protected] wrote:

cool :-)
what kind of app is this?

It’s a search application for several microformats available in the
Technorati Kitchen (http://kitchen.technorati.com/) (our ‘kitchen’ is
like others’ ‘labs’).

-ryan

I’m using Rails 1.1.6 with acts_as_ferret and DRb. It seems to work for
basic queries, but I’ve run into a problem when combining sorting with
the :limit and :offset options for pagination. I find that the query
results are no longer sorted by the sort field, and I seem to get the
same results irrespective of the :limit and :offset parameters. If I
don’t sort the results, the :limit and :offset parameters work as
expected when using DRb.

When I removed DRb from the setup, the sorting and pagination options
worked as expected. Has anyone else come across this problem?

Sanjay

I did some further testing using the DRb setup. This time, I kept the
sort but removed the :limit and :offset options. The results were not
properly sorted and only 10 items were returned, even though there are
48 matching results and no limit imposed on the query. Then I removed
the DRb from the setup and the 48 results came back properly sorted in a
single query.

Sanjay

Hi!

On Tue, Mar 06, 2007 at 06:39:13AM +0100, Sanjay Kapoor wrote:

I did some further testing using the DRb setup. This time, I kept the
sort but removed the :limit and :offset options. The results were not
properly sorted and only 10 items were returned, even though there are
48 matching results and no limit imposed on the query. Then I removed
the DRb from the setup and the 48 results came back properly sorted in a
single query.

Looks like you just found a bug in the DRb code - I’ll try to fix this
asap. Could you please post your call to find_by_contents, including the
construction of your SortFields?
The acts_as_ferret statement in your model might help, too.

About the number of results returned - are you sure you got all 48
results back from a call to find_by_contents without :limit parameter?
By default only 10 hits will be returned, and you’ll need to pass
:limit => :all for aaf to give you all results. However,
results.total_hits will give you the total number of results. Maybe only
that value differs with and without DRb?

Jens
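
To make the difference between the returned page and total_hits concrete, here is a small plain-Ruby model of the behaviour described above. It is only a sketch: `page_of_hits`, `Hits` and `DEFAULT_LIMIT` are hypothetical names, and no aaf code is involved.

```ruby
# Hypothetical model of the paging behaviour described in the post;
# it does not call acts_as_ferret.
Hits = Struct.new(:records, :total_hits)

DEFAULT_LIMIT = 10

def page_of_hits(all_matches, options = {})
  limit  = options.fetch(:limit, DEFAULT_LIMIT)
  offset = options.fetch(:offset, 0)
  remaining = all_matches.drop(offset)
  # :limit => :all returns everything after the offset
  records = limit == :all ? remaining : remaining.first(limit)
  # total_hits counts every match, regardless of limit and offset
  Hits.new(records, all_matches.size)
end

matches      = (1..48).to_a
default_page = page_of_hits(matches)
everything   = page_of_hits(matches, :limit => :all)
```

With 48 matches, a call without :limit yields a page of 10 records while total_hits still reports 48, which matches the symptom reported above.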

My model’s full_text_search method looks like:

def self.full_text_search(q, options = {})
  return nil if q.nil? or q.empty?
  default_options = {:limit => 10, :page => 1}
  default_options.update options

  # get the offset based on the page and limit
  default_options[:page] = 1 if default_options[:page] == 0
  default_options[:offset] = default_options[:limit] * (default_options.delete(:page).to_i - 1)

  # now do the query with options
  results = Article.find_by_contents(q, default_options)
  return [results.total_hits, results]
end
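
The page-to-offset arithmetic in the method above can be checked on its own. The helper below is a hypothetical extraction (the name `offset_for` is not in the original code). Unlike the original, it coerces the page to an integer before clamping it, so a page of "0" or a negative page also maps to offset 0.

```ruby
# Hypothetical helper, not part of acts_as_ferret or the original model.
# Pages are 1-based; anything below 1 is treated as page 1.
def offset_for(page, limit)
  page = page.to_i
  page = 1 if page < 1
  limit * (page - 1)
end

first_page_offset = offset_for(1, 20)   # first page starts at hit 0
third_page_offset = offset_for(3, 20)   # third page skips two pages of 20
```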

My search controller uses it like this:

# build sort
@sf_published_at = Ferret::Search::SortField.new(:published_at_string,
  :type => :string, :reverse => true)
@sort = Ferret::Search::Sort.new(@sf_published_at)

# set options
@options = {:limit => 20, :sort => @sort}

@total, @articles = Article.full_text_search(@query_string, @options)

And yes, you’re absolutely right about sending in :limit => :all. I
forgot to mention that in my earlier post.

I’m currently getting around this issue by returning all the results and
sorting in ruby.

Thanks for looking into this.

Sanjay
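
For anyone needing the same stopgap, sorting and paging in Ruby can look roughly like this. It is a sketch under assumptions: `ArticleRow` is a stand-in struct rather than the real model, and published_at_string is assumed to hold zero-padded date strings so that string order equals date order.

```ruby
# Stand-in for a model record (hypothetical; the real Article is a Rails model).
ArticleRow = Struct.new(:id, :published_at_string)

# Sort newest first, then cut out one 1-based page of `limit` records.
def sort_and_page(records, limit, page)
  sorted = records.sort_by { |r| r.published_at_string }.reverse
  offset = limit * ([page.to_i, 1].max - 1)
  sorted[offset, limit] || []   # nil when the offset is past the end
end

rows = [
  ArticleRow.new(1, '20070301'),
  ArticleRow.new(2, '20070305'),
  ArticleRow.new(3, '20070302')
]
first_page_ids = sort_and_page(rows, 2, 1).map { |r| r.id }
```

This requires fetching every hit first (:limit => :all), so it only makes sense while result sets stay small.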

Hi!

On Wed, Mar 07, 2007 at 06:31:37AM +0100, Sanjay Kapoor wrote:
[…]

My search controller uses it like this:

# build sort
@sf_published_at = Ferret::Search::SortField.new(:published_at_string,
  :type => :string, :reverse => true)
@sort = Ferret::Search::Sort.new(@sf_published_at)

# set options
@options = {:limit => 20, :sort => @sort}

Could you please try

@options = { :limit => 20, :sort => [ @sf_published_at ] }

instead? I still have some problems marshalling Ferret classes, and it
seems the Sort class is one of them.
Using the array instead of a Sort instance works fine here (Ferret
0.11.2).

Jens
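
Some background on why marshalling matters here: DRb copies method arguments to the server with Marshal, so an argument only arrives by value if Marshal.dump accepts it. The sketch below shows that general Ruby mechanism. `OpaqueSort` is a made-up class that merely behaves like an object without marshal support. Whether Ferret's Sort fails in exactly this way is an assumption, not something confirmed in this thread.

```ruby
# Returns true when Marshal can serialise the object, i.e. when DRb
# could ship it to the server by value.
def marshallable?(obj)
  Marshal.dump(obj)
  true
rescue TypeError
  false
end

# Hypothetical class standing in for an object Marshal cannot dump;
# holding a Proc is enough to make Marshal.dump raise TypeError.
class OpaqueSort
  def initialize
    @comparator = lambda { |a, b| b <=> a }
  end
end

# Plain arrays, symbols and hashes round-trip without trouble.
plain_spec = [[:published_at_string, { :type => :string, :reverse => true }]]

plain_ok  = marshallable?(plain_spec)
opaque_ok = marshallable?(OpaqueSort.new)
```

A quick check like this on the options hash can help narrow down which argument is causing trouble before a search call goes over DRb.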
