Hi,

We have an index of around 1M web pages as part of our web app. The app uses Ferret by way of RDig to perform searches. We have noticed anecdotally that some searches don't return the results we expect, as if documents were missing from the index. Yesterday we found a concrete instance of this.

Our documents have several fields, two of which, :keywords and :data, are used for searching. We isolated a single document that the web app does not find by terms in its :data field, but does find by terms in its :keywords field.

We first assumed that something had gone wrong during indexing and the :data field had been lost. However, the index browser included with version 0.11.4 showed the document with all its fields intact, including :data. According to the browser, all the :data terms that failed to retrieve the document in the web app were indeed present.

We then wrote a short script against the API that instantiated an IndexReader and called IndexReader.term_vectors() with the id of the document in question. The term vectors returned included a vector for :keywords, but not for :data.

Somehow the core API functions are not finding this document's :data field, while the 0.11.4 browser is. Are there differences between the two that would explain this? Does this problem description ring a bell with anyone out there?

Many thanks.
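For anyone who wants to reproduce the check, here is a minimal sketch of the kind of script described above. It is not our actual script: the helper names `missing_term_vector_fields` and `check_index`, the index path, and the doc id are placeholders, and it assumes that `IndexReader#term_vectors` returns a hash keyed by field-name symbols.

```ruby
# Returns the fields from expected_fields that have no term vector for the
# given document. `reader` is anything that responds to term_vectors(doc_id),
# for example a Ferret::Index::IndexReader. The return value of term_vectors
# is assumed to be a hash keyed by field-name symbols.
def missing_term_vector_fields(reader, doc_id, expected_fields)
  vectors = reader.term_vectors(doc_id) || {}
  expected_fields.reject { |field| vectors.key?(field) }
end

# Hypothetical wrapper: opens the index at index_path, runs the check for one
# document, and always closes the reader afterwards.
def check_index(index_path, doc_id, fields = [:keywords, :data])
  require 'rubygems'
  require 'ferret'

  reader = Ferret::Index::IndexReader.new(index_path)
  begin
    missing_term_vector_fields(reader, doc_id, fields)
  ensure
    reader.close
  end
end
```

Calling `check_index` with the real index path and doc id (both placeholders here) returns the list of fields without a term vector. For the document in question, the behaviour we saw corresponds to a result containing only :data.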
on 2007-06-12 17:33