Count frequency of term in a specific document?

Is there any way to count the frequency of a specific term in a single
document?
I can't find a method for this. Do you know of one?

On 4/6/07, James K. wrote:

Is there any way to count the frequency of a specific term in a single
document?
I can't find a method for this. Do you know of one?

Hi James,

Caleb and I answered your previous post but here is my answer again.

You can find the frequency without storing term-vectors. Simply use
the TermDocEnum and skip to the document you are interested in.

tde = index.reader.term_docs_for(:field, 'term')
tde.skip_to(100)

# Now check that we are at the correct document. If there are no
# instances of 'term' in document 100 then it will skip to the next
# document with an instance of the term 'term'.
frequency = tde.doc == 100 ? tde.freq : 0
puts "frequency of field:term in document 100 is #{frequency}"

Here is a full working example:

require 'rubygems'
require 'ferret'

index = Ferret::I.new
index << 'one'
index << 'one two one three one four one' # doc 1
index << 'one'
index << 'no 1s' # doc 3
index << 'one'

def get_frequency(index, doc_num, term, field = :id)
  tde = index.reader.term_docs_for(field, term)
  tde.skip_to(doc_num)
  return tde.doc == doc_num ? tde.freq : 0
end

puts get_frequency(index, 1, 'one') #=> 4
puts get_frequency(index, 3, 'one') #=> 0

Thanks!

One more question. I'd like to count the frequency of every term in a
specific document. The solution above counts the frequency of a single
term. Is there any way to gather the frequencies of all terms in a
specific document?

On 4/7/07, James K. wrote:

Thanks!

One more question. I'd like to count the frequency of every term in a
specific document. The solution above counts the frequency of a single
term. Is there any way to gather the frequencies of all terms in a
specific document?

Yes, for this you need to store term-vectors with positions. That will
allow you to count the frequency of all terms in the document.
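
To illustrate why positions are enough, here is a rough plain-Ruby sketch. It does not use Ferret's API. The helpers build_term_positions and term_frequencies are hypothetical, and the lowercase/whitespace tokenizer is an assumption standing in for whatever analyzer the index uses. The point is only that a term's frequency in a document equals the number of positions recorded for it.

```ruby
# Plain-Ruby stand-in for a term vector with positions: maps each term
# to the list of token positions where it occurs in one document.
# The lowercase/whitespace tokenizer is an assumption for illustration.
def build_term_positions(text)
  positions = Hash.new { |hash, term| hash[term] = [] }
  text.downcase.split.each_with_index do |term, pos|
    positions[term] << pos
  end
  positions
end

# The frequency of a term is the number of positions recorded for it.
def term_frequencies(term_positions)
  term_positions.each_with_object({}) do |(term, pos_list), freqs|
    freqs[term] = pos_list.size
  end
end

doc = 'one two one three one four one'
frequencies = term_frequencies(build_term_positions(doc))
frequencies.each { |term, freq| puts "#{term}: #{freq}" }
```

With a real index, the same counting would be applied to the term vector read back for the document instead of the hash built here.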