Indexing tuples (example: "frog" => 123) as opposed to words

Hi,

I need to map words in a document back to their original word IDs in my
database. For example, if I had the sentence “I eat food” and I was
searching for “food”, I would obviously get the document back as a
result. For my particular problem I need to get not only the document id
but also the id of the match.

Suppose my original sentence was actually represented as tuples where
each word contains the word and an id number. The new structure would
look something like “(I => 1, eat => 2, food => 3)”. In this case I
would like to get not only the document but also the id number of the
matching word, which here would be “3”. Is there any way to do this in
Ferret?

FYI, my goal is to be able to go back and locate the word matches for
further analysis.

Any help would be greatly appreciated. Thank you.

Benjamin A.
Tabbec LLC

Hey …

I’m not quite sure I fully understand what you’re trying to achieve, but
maybe TermVectors will help you.

http://ferret.davebalmain.com/api/classes/Ferret/Index/TermVector.html

http://ferret.davebalmain.com/trac/wiki/FAQ%3ADefinitions#Whatisaterm-vector
http://ferret.davebalmain.com/trac/wiki/FAQ%3ADefinitions#Whatareterm-offsets
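
To make the idea a bit more concrete, here is a rough plain-Ruby sketch of
how term positions could be mapped back to your word ids. It does not call
Ferret at all: the TUPLES constant and the helper names
build_term_positions and word_ids_for are made up for illustration, and the
hash it builds only stands in for the position data a term vector stores.
It also assumes the indexed tokens line up one-to-one with your tuples,
which may not hold if your analyzer drops or splits words.

```ruby
# Hypothetical input: each word paired with its database id.
TUPLES = [["I", 1], ["eat", 2], ["food", 3]].freeze

# Build a toy stand-in for a term vector: term => list of 0-based positions.
def build_term_positions(tuples)
  positions = Hash.new { |hash, term| hash[term] = [] }
  tuples.each_with_index do |(word, _id), position|
    positions[word.downcase] << position
  end
  positions
end

# Translate the positions of a matching term back into word ids.
# Returns an empty array when the term does not occur.
def word_ids_for(term, tuples)
  ids_by_position = tuples.map { |_word, id| id }
  matches = build_term_positions(tuples).fetch(term.downcase, [])
  matches.map { |position| ids_by_position[position] }
end

p word_ids_for("food", TUPLES) # prints [3]
```

With a real index you would presumably take the positions from the stored
term vector of the matching document instead of rebuilding them by hand,
and keep the position-to-id list alongside the document.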

Ben

On 2007-12-20, at 09:38, Benjamin A. wrote:

Suppose my original sentence was actually represented as tuples where

Regards
Ben

Benjamin K.
