How can I count frequency of terms in a document?

Hi, there.

I need some help.
Is there a way to count the frequencies of terms in a document in Ferret?
I know that Ferret has the IndexReader#term_docs_for method, which counts
across all documents.
I need to count the frequencies of terms in one specific document.

Is there some way to do that?

On 4/4/07, Caleb C. wrote:

[...] The size of that list is the term
frequency.
This is definitely one way of doing it. You can also find the
frequency without storing term-vectors. Simply use the TermDocEnum and
skip to the document you are interested in.

tde = index.reader.term_docs_for(:field, 'term')
tde.skip_to(100)

# Now check that we are at the correct document. If there are no
# instances of 'term' in document 100 then it will skip to the next
# document with an instance of the term 'term'.
frequency = tde.doc == 100 ? tde.freq : 0
puts "frequency of field:term in document 100 is #{frequency}"

Here is a full working example:

require 'rubygems'
require 'ferret'

index = Ferret::I.new
index << 'one'
index << 'one two one three one four one' # doc 1
index << 'one'
index << 'no 1s'                          # doc 3
index << 'one'

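# Returns how many times term occurs in the given field of document doc_num.
def get_frequency(index, doc_num, term, field = :id)
  tde = index.reader.term_docs_for(field, term)
  tde.skip_to(doc_num)
  # skip_to may land on a later document if doc_num does not contain the term
  return tde.doc == doc_num ? tde.freq : 0
end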

puts get_frequency(index, 1, 'one') #=> 4
puts get_frequency(index, 3, 'one') #=> 0
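
To see why the tde.doc == doc_num check matters, here is a minimal pure-Ruby sketch of a postings list with a skip_to-style cursor. It is only a toy model and not how Ferret is implemented: ToyTermDocs and toy_frequency are hypothetical names, and the sketch assumes that skip_to stops at the first document whose number is greater than or equal to the target, as described above.

```ruby
# Toy model of one term's postings: sorted [doc_number, frequency] pairs.
# ToyTermDocs is a hypothetical class for illustration, not a Ferret class.
class ToyTermDocs
  def initialize(postings)
    @postings = postings.sort_by(&:first)
    @pos = nil
  end

  # Move to the first posting whose doc number is >= target.
  # Returns true if such a posting exists, false otherwise.
  def skip_to(target)
    @pos = @postings.index { |doc, _freq| doc >= target }
    !@pos.nil?
  end

  # Doc number at the cursor, or nil if the cursor is past the end.
  def doc
    @pos && @postings[@pos][0]
  end

  # Frequency at the cursor, or nil if the cursor is past the end.
  def freq
    @pos && @postings[@pos][1]
  end
end

# Same check as get_frequency: only trust freq if we landed on doc_num.
def toy_frequency(postings, doc_num)
  tde = ToyTermDocs.new(postings)
  tde.skip_to(doc_num)
  tde.doc == doc_num ? tde.freq : 0
end

# Postings for 'one' in the five-document example; doc 3 has no 'one'.
postings_for_one = [[0, 1], [1, 4], [2, 1], [4, 1]]

puts toy_frequency(postings_for_one, 1) # prints 4
puts toy_frequency(postings_for_one, 3) # prints 0 (the cursor landed on doc 4)
```

Without the comparison, asking for document 3 would report the frequency of document 4 instead of 0.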

James K. wrote:

Is there a way to count the frequencies of terms in a document in Ferret?
I know that Ferret has the IndexReader#term_docs_for method, which counts
across all documents.
I need to count the frequencies of terms in one specific document.

I believe that IndexReader#term_vector is the method that you’re looking
for. This gives you some information about each term in one document…
If you stored positions when you indexed, the individual terms will
have a list of positions associated. The size of that list is the term
frequency.
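
The idea that the size of a term's position list is its frequency can be sketched in plain Ruby. This is a toy stand-in for a term vector, not Ferret code: toy_term_positions and toy_term_frequencies are hypothetical helpers, and splitting on non-word characters is an assumption, since a real analyzer may tokenize differently. In Ferret the positions would come from IndexReader#term_vector, provided the field was indexed with positions stored.

```ruby
# Toy stand-in for a per-document term vector: maps each term to the
# list of token positions where it occurs in the text.
def toy_term_positions(text)
  positions = Hash.new { |hash, term| hash[term] = [] }
  text.downcase.scan(/\w+/).each_with_index do |term, pos|
    positions[term] << pos
  end
  positions
end

# The frequency of a term is the number of positions recorded for it.
def toy_term_frequencies(text)
  toy_term_positions(text).transform_values(&:size)
end

freqs = toy_term_frequencies('one two one three one four one')
puts freqs['one'] # prints 4
puts freqs['two'] # prints 1
```

This gives the frequency of every term in the document at once, which is what the original question asks for.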