Search results autocompletion

Dear list,

I'm using a text input field with autocompletion. The suggestions come
from a Ferret index which is created by collecting all the terms that
belong to other indices. Here is the code:

class Suggestion

  attr_accessor :term

  def self.index(create)
    [Person, Project, Orgunit].each{|kl|
      terms = self.all_terms(kl)
      terms.each{|term|
        suggestion = Suggestion.new
        suggestion.term = term
        SUGGESTION_INDEX << suggestion.to_doc
      }
    }
    SUGGESTION_INDEX.optimize
  end

  def self.all_terms(klass)
    reader = Index::IndexReader.new(Object.const_get(klass.name.upcase + "_INDEX_DIR"))
    terms = []
    begin
      reader.field_names.each {|field_name|
        term_enum = reader.terms(field_name)
        begin
          term = term_enum.term()
          if !term.nil?
            if klass::SUGGESTIONABLE_FIELDS.include?(field_name)
              terms << term
            end
          end
        end while term_enum.next?
      }
    ensure
      reader.close
    end
    return terms
  end

  def to_doc
    doc = {}
    doc[:term] = self.term
    return doc
  end

end

It works very well, except that the indexing process takes a long time.
Does anybody know if there’s a better way to do this?
Is there another way to get all the terms of an index?

Thank you.

Johan

Analyst Programmer
Belgian Biodiversity Platform (http://www.biodiversity.be)
Belgian Federal Science Policy Office (http://www.belspo.be)
Tel:+32 2 650 5751 Fax: +32 2 650 5124

On Thu, Oct 05, 2006 at 07:58:40AM +0200, johan duflost wrote:

It works very well, except that the indexing process takes a long time. Does
anybody know if there’s a better way to do this?
Is there another way to get all the terms of an index?

Nothing Ferret-related, but at first glance your code seems a bit
inefficient: you check the SUGGESTIONABLE_FIELDS array for each term,
instead of checking once per field and then going ahead. You could even
just iterate over the SUGGESTIONABLE_FIELDS array and use the field names
from there:

def self.all_terms(klass)
  reader = Index::IndexReader.new(Object.const_get(klass.name.upcase + "_INDEX_DIR"))
  terms = []
  begin
    klass::SUGGESTIONABLE_FIELDS.map { |field|
      reader.terms(field)
    }.each do |term_enum|
      # term_enum.term should not be nil, so no need to check this.
      terms << term_enum.term while term_enum.next?
    end
  ensure
    reader.close
  end
  return terms
end

If your SUGGESTIONABLE_FIELDS contains fields that are not in the index
(yet), the reader.terms call might fail. In that case, using
  reader.terms(field) rescue nil
and compacting the result of map before calling each should work.
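
The rescue-and-compact idea can be sketched without Ferret as below. This
is only an illustration: FakeReader and term_enums_for are hypothetical
names, and FakeReader merely assumes that asking for a missing field
raises an error. Whether the real reader.terms raises or returns nil for
an unknown field is not confirmed in this thread.

```ruby
# Hypothetical stand-in for Index::IndexReader, used only to show the pattern.
class FakeReader
  def initialize(fields)
    @fields = fields # { field_name => [terms] }
  end

  # Assumption: asking for a field that is not in the index raises.
  def terms(field)
    @fields.fetch(field) { raise ArgumentError, "unknown field #{field}" }
  end
end

# Returns one entry per field that exists, silently skipping missing fields.
def term_enums_for(reader, field_names)
  field_names.map { |field|
    reader.terms(field) rescue nil # nil for fields missing from the index
  }.compact                        # drop the nils before iterating
end

reader = FakeReader.new({ :name => ["ann", "bob"], :title => ["dr"] })
enums = term_enums_for(reader, [:name, :missing, :title])
```

Note that the rescue modifier swallows every StandardError, so a genuine
problem with the reader would be hidden as well.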

You could also save one iteration over all terms by yielding each term
to a block that adds it to the index, like this:

  all_terms(klass) do |term|
    INDEX << { :term => term }
  end

all_terms should then do
  yield term_enum.term while term_enum.next?
in the inner loop. For extra style points, rename all_terms to
each_term :)
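
A minimal, Ferret-free sketch of the yielding variant follows.
FakeTermEnum is a hypothetical stand-in for the term enumerator; it
assumes the enumerator starts before the first term and that next?
advances and reports whether a term is available, which may not match
the real class exactly. The index array stands in for SUGGESTION_INDEX.

```ruby
# Hypothetical stand-in for a Ferret term enumerator.
# Assumption: it starts before the first term, and next? advances to the
# next term and returns true while one is available.
class FakeTermEnum
  attr_reader :term

  def initialize(terms)
    @terms = terms.dup
    @term = nil
  end

  def next?
    @term = @terms.shift
    !@term.nil?
  end
end

# Yields each term to the caller instead of collecting them in an array,
# so the terms are walked only once.
def each_term(term_enums)
  term_enums.each do |term_enum|
    yield term_enum.term while term_enum.next?
  end
end

index = [] # stands in for SUGGESTION_INDEX
each_term([FakeTermEnum.new(%w[ann bob]), FakeTermEnum.new(%w[dr])]) do |term|
  index << { :term => term }
end
```

With this shape the caller decides what happens to each term, and no
intermediate terms array is built.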

cheers,
Jens


webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

Jens,

You are right, my code was not efficient.

The indices from which I create the suggestions index are not very big:
80 KB, 300 KB and 2 MB.

After 20 minutes, I get a suggestions index of approximately 1400 KB.

Thank you for your help,

Johan

----- Original Message -----
From: “Jens K.” [email protected]
To: [email protected]
Sent: Thursday, October 05, 2006 10:30 AM
Subject: Re: [Ferret-talk] search results autocompletion

You’re right. In fact, I remove the accents from the terms before
indexing them. Without this piece of code, it takes ‘only’ 6 minutes.
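
The accent-removal code itself is not shown in this thread, so the cause
of the slowdown is unknown. As one possible approach, the sketch below
strips accents with String#unicode_normalize (this assumes Ruby 2.2 or
later) and keeps each distinct result only once, so that a term occurring
in several fields or indices is added a single time. strip_accents and
unique_suggestions are hypothetical helper names.

```ruby
require "set"

# Decompose accented characters (NFD), then delete the combining marks.
def strip_accents(term)
  term.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

# Normalizes every term and keeps each distinct result only once,
# preserving the order of first appearance.
def unique_suggestions(terms)
  seen = Set.new
  terms.each_with_object([]) do |term, result|
    plain = strip_accents(term)
    result << plain if seen.add?(plain) # add? returns nil for duplicates
  end
end

suggestions = unique_suggestions(["Krämer", "Kramer", "café", "cafe"])
```

Deduplicating before indexing reduces the number of documents written to
the suggestions index, which may matter more than the cost of the
normalization itself.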

----- Original Message -----
From: “Jens K.” [email protected]
To: [email protected]
Sent: Friday, October 06, 2006 10:23 AM
Subject: Re: [Ferret-talk] search results autocompletion

On Thu, Oct 05, 2006 at 03:56:43PM +0200, johan duflost wrote:
[…]

The indices from which I create the suggestions index are not very big:
80 KB, 300 KB and 2 MB.

After 20 minutes, I get a suggestions index of approximately 1400 KB.

That still looks somewhat slow to me…

Jens

