Forum: Ferret more_like_this

Rob L. (Guest)
on 2007-05-09 14:59
Hi,

I'm using acts_as_ferret in my rails application and I'd like to use
more_like_this to retrieve some 'similar' item suggestions.  I have a
class 'items' which has a status field and I need to retrieve items that
only have one of the two possible statuses.

Looking at the more_like_this method indicates it supports an
:append_to_query option that allows you to specify a proc that will
modify the query object before the query is 'run'.  This would seem to
allow me to specify extra conditions to the query (such as
+status:live).

Item.more_like_this(:field_names => [:title, :description, :status],
:append_to_query => Proc .... )

It's a little unclear exactly what the query object is and there seem to
be no examples I can find outlining how to use this functionality, does
anybody have an example they could contribute ?

Thanks
Jens K. (Guest)
on 2007-05-09 15:58
(Received via mailing list)
On Wed, May 09, 2007 at 12:59:44PM +0200, Rob L. wrote:
> allow me to specify extra conditions to the query (such as
> +status:live).
>
> Item.more_like_this(:field_names => [:title, :description, :status],
> :append_to_query => Proc .... )
>
> It's a little unclear exactly what the query object is and there seem to
> be no examples I can find outlining how to use this functionality, does
> anybody have an example they could contribute ?

I don't have an example at hand, but maybe I can help anyway. The Proc
parameter is a BooleanQuery instance. You can add your own conditions by
adding your own Query to it:

query.add_query(Ferret::Search::TermQuery.new(:status, 'live'), :must)
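
A minimal sketch of how such a proc could be wired up. The stub classes and the must_match helper are hypothetical names invented for illustration so the snippet runs without Ferret installed; in a real application the proc would receive the BooleanQuery that acts_as_ferret builds, and you would pass Ferret::Search::TermQuery instead of the stub.

```ruby
# Stand-ins (assumptions for this sketch) so it runs without Ferret.
# They mimic only the two calls used in this thread:
#   TermQuery.new(field, term) and BooleanQuery#add_query(query, occur).
TermQueryStub = Struct.new(:field, :term)

class BooleanQueryStub
  attr_reader :clauses

  def initialize
    @clauses = []
  end

  # Record the clause together with its occurrence flag (:must, :should, ...).
  def add_query(query, occur = :should)
    @clauses << [query, occur]
    self
  end
end

# Hypothetical helper: build a proc for :append_to_query that adds a
# mandatory "field must equal value" clause to the query it is given.
def must_match(field, value, term_query_class = TermQueryStub)
  lambda do |query|
    query.add_query(term_query_class.new(field, value), :must)
  end
end

only_live = must_match(:status, 'live')

# Simulate what the library would do: hand the query object to the proc.
query = BooleanQueryStub.new
only_live.call(query)
```

With Ferret loaded, the call on a saved record would presumably look like item.more_like_this(:field_names => [:title, :description], :append_to_query => must_match(:status, 'live', Ferret::Search::TermQuery)); this is untested against any particular aaf version.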


Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
removed_email_address@domain.invalid | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
Jacob R. (Guest)
on 2007-05-15 19:37
Rob L. wrote:

>
> Item.more_like_this(:field_names => [:title, :description, :status],
> :append_to_query => Proc .... )
>

I don't mean to be nitpicky, but more_like_this is an instance method, not
a class method. This has come up for me because more_like_this does not
work for unsaved records in the current AAF, which doesn't mesh with the
rails convention of creating a new active record object to store user
query params. I'd like to make a regular rails form using a blank object
and then call more_like_this on that object to do a search.
Jens K. (Guest)
on 2007-05-15 21:59
(Received via mailing list)
On Tue, May 15, 2007 at 05:37:19PM +0200, Jacob Robbins wrote:
> rails convention of creating a new active record object to store user
> query params. I'd like to make a regular rails form using a blank object
> and then call more_like_this on that object to do a search.

This isn't supported by aaf but should be possible to do with a bit of
hacking :)

It'll get a bit harder if you want to do this with the DRb server, since
then you'll have to transfer your unsaved record over to the server for
the more_like_this query to be built. Atm only id and class name
are transferred with method calls.

Jens

Jacob R. (Guest)
on 2007-05-15 23:08
Jens K. wrote:
> On Tue, May 15, 2007 at 05:37:19PM +0200, Jacob Robbins wrote:
>> rails convention of creating a new active record object to store user
>> query params. I'd like to make a regular rails form using a blank object
>> and then call more_like_this on that object to do a search.
>
> This isn't supported by aaf but should be possible to do with a bit of
> hacking :)
>
> It'll get a bit harder if you want to do this with the DRb server, since
> then you'll have to transfer your unsaved record over to the server for
> the more_like_this query to be built. Atm only id and class name
> are transferred with method calls.
>
> Jens

Thanks for checking into this, Jens. I've done what I wanted by adding an
instance method to aaf. In instance_methods.rb, right after the to_doc
method, I added a to_ferret_query method. This avoids transferring the
whole object when using the DRb server. Tell me what you think...

>>>>>>>>>>>>>>>>>>>
    # Turn this instance into a ferret query derived from its field values.
    # Empty fields are ignored. Can be used on unsaved records. Typical use
    # is to make a ferret query from a new object initialized from posted
    # form values.
    #
    # Example: college.to_ferret_query(:fuzz => 0.6)
    #          #=> "name:seattle~0.6 and name:university~0.6 and city:seattle~0.6"
    #
    # === Options
    #
    # fuzz::           Default: nil. Float value for fuzziness to attach to search terms.
    # field_names::    Default: nil. (uses ferret indexed fields) Array of field names to use in query.
    # join_type::      Default: 'and'. String used to join query terms.
    # exclude::        Default: ['and', 'or']. Array of words to ignore in field values.
    def to_ferret_query(options = {})
      options = {
        :field_names => self.class.aaf_configuration[:ferret_fields].keys,
        :join_type => 'and',
        :exclude => ['and','or']
      }.update(options)
      terms = []
      options[:field_names].each do |field|
        if val = self.send(field)
          val.to_s.split.each do |word|
            unless options[:exclude].include?(word.strip.downcase)
              terms << field.to_s + ':' + word + (options[:fuzz] ? '~' + options[:fuzz].to_s : '')
            end
          end
        end
      end
      terms.join(' ' + options[:join_type] + ' ')
    end
<<<<<<<<<<<<<<<<<<<<<<<<<<
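
To try the idea outside of Rails, here is a standalone sketch of the same string-building logic. It is not part of aaf: build_ferret_query is a hypothetical name, and the field values come from a plain Hash rather than from an ActiveRecord instance and its aaf_configuration.

```ruby
# Build a query string from a Hash of field => value pairs.
# nil values are skipped, and words listed in :exclude are dropped.
def build_ferret_query(values, options = {})
  options = {
    :join_type => 'and',
    :exclude   => ['and', 'or'],
    :fuzz      => nil
  }.update(options)

  terms = []
  values.each do |field, val|
    next if val.nil?
    val.to_s.split.each do |word|
      next if options[:exclude].include?(word.downcase)
      # Append "~0.6"-style fuzziness only when :fuzz was given.
      suffix = options[:fuzz] ? "~#{options[:fuzz]}" : ''
      terms << "#{field}:#{word}#{suffix}"
    end
  end
  terms.join(" #{options[:join_type]} ")
end

query = build_ferret_query(
  { :name => 'seattle university', :city => 'seattle', :state => nil },
  :fuzz => 0.6
)
puts query
# prints: name:seattle~0.6 and name:university~0.6 and city:seattle~0.6
```

The resulting string would then be handed to whatever search call the application already uses; whether the join word needs to be upper case depends on how the query parser is configured, so treat 'and' here as carried over from the post above rather than as a recommendation.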
Jens K. (Guest)
on 2007-05-16 15:56
(Received via mailing list)
On Tue, May 15, 2007 at 09:08:06PM +0200, Jacob Robbins wrote:
> > then you'll have to transfer your unsaved record over to the server for
> > the more_like_this query to be built. Atm only id and class name
> > are transferred with method calls.
> >
> > Jens
>
> Thanks for checking into this, Jens. I've done what I wanted by adding an
> instance method to aaf. In instance_methods.rb, right after the to_doc
> method, I added a to_ferret_query method. This avoids transferring the
> whole object when using the DRb server. Tell me what you think...

Perfectly fine if it works for you.

Aaf's more_like_this is more complicated, mainly because it tries to
find out the 15 or so most relevant terms of your record's content to
construct the query to support large documents (and it can even boost
these single terms according to their relevance).
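
To illustrate the general idea of picking the most relevant terms, here is a rough sketch. It is not aaf's actual code: the top_terms name, the tf-idf formula, and the doc_freqs input are all assumptions made for this example.

```ruby
# Return the max_terms highest-scoring [term, score] pairs for a text.
# doc_freqs maps a term to the number of indexed documents containing it;
# total_docs is the size of the index. Both are supplied by the caller here.
def top_terms(text, doc_freqs, total_docs, max_terms = 15)
  # Term frequency: how often each word occurs in this document.
  tf = Hash.new(0)
  text.downcase.scan(/\w+/) { |word| tf[word] += 1 }

  scored = tf.map do |term, count|
    df = doc_freqs.fetch(term, 0)
    # Rare terms get a higher inverse document frequency.
    idf = Math.log(total_docs.to_f / (df + 1)) + 1.0
    [term, count * idf]
  end

  # Highest score first; ties are broken alphabetically for stable output.
  scored.sort_by { |term, score| [-score, term] }.first(max_terms)
end

doc_freqs = { 'the' => 999, 'ferret' => 4, 'search' => 49 }
best = top_terms('ferret ferret search the the the', doc_freqs, 1000, 2)
puts best.map { |term, _score| term }.inspect
```

A common word such as "the" occurs most often in the sample text but ranks last because nearly every document contains it. The scores could also serve as per-term boosts when the query is assembled.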

I'll look into refactoring aaf a bit so that in future versions
more_like_this can be used on unsaved records, too.

Jens

Jacob R. (Guest)
on 2007-05-17 00:08
> Aaf's more_like_this is more complicated, mainly because it tries to
> find out the 15 or so most relevant terms of your record's content to
> construct the query to support large documents (and it can even boost
> these single terms according to their relevance).
>

Oh, now I get it. Yeah, I run into this a lot with my deployment because
we don't index big documents and most of ferret is geared toward them. I
use ferret to help users find bands, recordings and labels that are
commonly misspelled. So for me... fuzzy searching: good, stopwords: bad.