Indexed Search Engine & Boolean Search

Hi
I’ve got the Indexed Search engine up and running, but it currently
returns results for all items matching any of the terms in the search
field.

I was wondering if anyone knew whether there’s a switch I’m missing so
that it returns only the items matching every term in the search field?

I’ve been playing around with the IndexableRecord.rb code (l.100 on):

# Search the index. Phrase is expected to be a string of
# one or more words. The phrase will be tokenized with Tokenizer,
# stop words will be removed, and each term will be stemmed.
# This will allow a search for "Running" to return records with
# the terms "run", "ran" and "runner"
def IndexableRecord.search(phrase)
  results = []
  terms = Tokenizer.tokenize(phrase).collect do |t| t.stem end
  terms.empty? ? [] : Context.find(:all, :include => :terms,
                                   :conditions => ["term IN (?)", terms])
end

but am getting in a complete spin trying to sort out the conditions for
a Boolean AND, partly because MySQL doesn’t seem to support INTERSECT.

Any help/suggestions/ideas would be hugely appreciated!

Piers

Hi Piers

I’ve given this some thought, and I’m not sure of the best way to
achieve it at the moment without intersection support. I had been kind
of procrastinating doing something like this until there was some
demand - and you are the first person to request it. :wink:

A very basic (and not very efficient) implementation would be to iterate
over the terms, storing the results for each in an array, and use Ruby’s
intersection operator (&) to find the set of matches for all terms.
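
To illustrate the idea, here is a minimal sketch of & on two result lists. Row is a hypothetical stand-in for a Context record, not part of the engine; it is a Struct because & decides membership with hash and eql?, which Struct defines by field value. As far as I know ActiveRecord objects compare by id in the same way, but treat that as an assumption.

```ruby
# Hypothetical stand-in for a Context row; a Struct compares by field values.
Row = Struct.new(:id)

# Pretend these came back from two separate single-term queries.
run_hits  = [Row.new(1), Row.new(2), Row.new(5)]
fast_hits = [Row.new(2), Row.new(5), Row.new(7)]

# Array#& keeps only the elements present in both arrays.
common = run_hits & fast_hits
p common.map { |r| r.id }   # => [2, 5]
```

With more than two terms the same step is simply repeated, intersecting the running result with each new list.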

I’ll noodle on this a bit - and if you have other ideas, please feel
free to email or post them.

Thanks
Lance

Lance,

> A very basic (and not very efficient) implementation would be to iterate
> over the terms storing the results in an array and using Ruby’s
> intersection operator (&) to find the set of matches for all terms.

If you can see past the novice coding, then the following seems to work.
As you say though, there are almost certainly more graceful ways of
doing it.

def IndexableRecord.search(phrase)
  results = []
  terms = Tokenizer.tokenize(phrase).collect do |t| t.stem end
  #terms.empty? ? [] : Context.find(:all, :include => :terms,
  #                                 :conditions => ["term IN (?)", terms])

  if terms.empty?
    return results
  else
    first_term = true
    holder = []
    # The term is passed as a bind value (?) rather than interpolated
    # into the SQL, so that it is quoted safely.
    query = "SELECT contexts.id FROM contexts, contexts_terms, terms " +
            "WHERE terms.term = ? AND terms.id = contexts_terms.term_id " +
            "AND contexts_terms.context_id = contexts.id"

    terms.each do |t|
      if first_term
        first_results = Context.find_by_sql([query, t])
        first_term = false
        holder = first_results
      else
        next_results = Context.find_by_sql([query, t])
        # Keep only the contexts that matched every term so far
        holder = holder & next_results
      end
      if holder.empty?
        return results
      end
    end

    # Reload each match as a full Context record
    holder.each do |h|
      results << Context.find(h.id)
    end
    return results
  end
end

Hi Piers

I’ve made modifications to the code so that you may now supply a :type
option to the search method, as in
IndexableRecord::search("foo bar", :type => :all). The :type option can
be either :any or :all, with the default being :any. When set to :any,
results containing any of the terms in the search phrase are returned.
When set to :all, only results containing all of the terms in the search
phrase are returned.

This is in version 0.2.1
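
To make the difference between the two settings concrete, here is a small stand-alone sketch. It is not the engine code: SAMPLE_INDEX and search_ids are made-up names, and the index is just a hash from already-stemmed terms to record ids.

```ruby
# Hypothetical stand-in for the index: stemmed term => ids of matching records.
SAMPLE_INDEX = {
  "foo" => [1, 2, 3],
  "bar" => [2, 3, 4]
}

# Sketch of the :any / :all semantics; terms are assumed to be stemmed already.
def search_ids(terms, options = {})
  type = options[:type] || :any                  # :any is the default
  lists = terms.map { |t| SAMPLE_INDEX.fetch(t, []) }
  return [] if lists.empty?
  case type
  when :any then lists.inject { |a, b| a | b }   # union: any term matches
  when :all then lists.inject { |a, b| a & b }   # intersection: every term matches
  else raise ArgumentError, "unknown :type #{type.inspect}"
  end
end

p search_ids(%w[foo bar])                  # => [1, 2, 3, 4]
p search_ids(%w[foo bar], :type => :all)   # => [2, 3]
```

A term that is not in the index contributes an empty list, so with :all a single unknown term empties the whole result.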

The implementation for the :all option looks like this:

results = Context.find(:all, :include => :terms,
                       :conditions => ["term = ?", terms.shift])
terms.each do |term|
  results = results & Context.find(:all, :include => :terms,
                          :conditions => ["term = ?", term])
end

Not as efficient as I’d like, but OK I suppose.
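
For what it’s worth, one possible way to push the intersection into MySQL itself, without INTERSECT, would be a single query that groups by context and keeps only the groups that matched every term. This is only a sketch and not what 0.2.1 does: all_terms_query is a made-up helper, and the table and column names are assumed from the join query posted earlier in this thread. It just builds the SQL and bind values in the array form that find_by_sql accepts.

```ruby
# Build [sql, *binds] for a "match every term" query using GROUP BY / HAVING.
# Table and column names are assumptions taken from the earlier join query.
def all_terms_query(terms)
  terms = terms.uniq                               # duplicates would break the count
  placeholders = (["?"] * terms.size).join(", ")
  sql = "SELECT contexts.id FROM contexts " +
        "JOIN contexts_terms ON contexts_terms.context_id = contexts.id " +
        "JOIN terms ON terms.id = contexts_terms.term_id " +
        "WHERE terms.term IN (#{placeholders}) " +
        "GROUP BY contexts.id " +
        "HAVING COUNT(DISTINCT terms.term) = ?"
  [sql] + terms + [terms.size]
end

query = all_terms_query(%w[run fast])
puts query.first
p query[1..-1]   # => ["run", "fast", 2]
```

The result could then be handed to something like Context.find_by_sql(query), giving one round trip to the database instead of one per term. I haven’t measured whether it is actually faster on a real index.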

Thanks Lance

> Not as efficient as I’d like, but OK I suppose.

Blindingly much better than my effort :slight_smile:

Piers