Indexing a document object fails


#1

Hi,

I’m trying out the example (more or less) straight from the tutorial:

doc = Document.new
doc << Field.new("id", "a", Field::Store::NO, Field::Index::UNTOKENIZED)
doc << Field.new("title", "b", Field::Store::YES, Field::Index::UNTOKENIZED)
doc << Field.new("data", "c", Field::Store::YES, Field::Index::TOKENIZED)
doc << Field.new("image", "d", Field::Store::YES, Field::Index::NO)
index << doc

And I get:

Exception: Unknown document type Ferret::Document::Document
C:/dev/workspace/fred/config/…/vendor/ferret/index/index.rb:259:in `<<'
C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:238:in `synchronize'
C:/dev/workspace/fred/config/…/vendor/ferret/index/index.rb:238:in `<<'
C:\dev\workspace\fred/test/unit/user_test.rb:56:in `test_index_document_sanity_check'

This appears to happen because the check

elsif doc.is_a?(Document)

is expecting a Ferret::Document rather than
Ferret::Document::Document. When I change this line to

elsif doc.is_a?(Document::Document)

I get past the indexing step and can retrieve the
document, i.e.

index.search_each("*") do |score_doc, score|
  p index.doc(score_doc)
end

which results in
#<Ferret::Document::Document:0x34d0570
@fields={"title"=>[#<Ferret::Document::Field:0x34cfe80
@tokenized=false, @stored=true, @name="title", @data="b",
@store_offset=false, @store_term_vector=false, @binary=false,
@boost=1.0, @indexed=true, @omit_norms=false, @store_position=false,
@compressed=false>], "image"=>[#<Ferret::Document::Field:0x34cfc88
@tokenized=false, @stored=true, @name="image", @data="d",
@store_offset=false, @store_term_vector=false, @binary=false,
@boost=1.0, @indexed=false, @omit_norms=false, @store_position=false,
@compressed=false>], "data"=>[#<Ferret::Document::Field:0x34cfbc8
@tokenized=true, @stored=true, @name="data", @data="c",
@store_offset=false, @store_term_vector=false, @binary=false,
@boost=1.0, @indexed=true, @omit_norms=false, @store_position=false,
@compressed=false>]}, @boost=1.0>
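
To illustrate the suspected cause, here is a minimal sketch in plain Ruby. The names Lib, Lib::Doc and Lib::Index are made up for the example and only mimic the Ferret::Document / Ferret::Document::Document nesting; this is not Ferret's source. It assumes the bare constant resolves to the namespace module at the point of the check, which may depend on what is included or in scope there.

```ruby
# Hypothetical stand-in names (Lib, Lib::Doc, Lib::Index); not Ferret's code.
module Lib
  module Doc            # namespace module, playing the role of Ferret::Document
    class Doc           # the actual class, like Ferret::Document::Document
    end
  end

  module Index
    # Inside Lib, the bare constant Doc resolves to the module Lib::Doc.
    # An instance of the nested class is not an instance of that module.
    def self.bare_check(obj)
      obj.is_a?(Doc)
    end

    # Qualifying the constant reaches the class nested inside the module.
    def self.qualified_check(obj)
      obj.is_a?(Doc::Doc)
    end
  end
end

doc = Lib::Doc::Doc.new
puts Lib::Index.bare_check(doc)       # false
puts Lib::Index.qualified_check(doc)  # true
```

Being nested inside a module does not make a class a descendant of it, so the unqualified check would fall through to the "Unknown document type" branch in a setup like this.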

Also, what’s the significance of score_doc? It appears to be just the
doc id. Am I missing something?

Thanks,
Steve


#2

On 3/9/06, Steve C. removed_email_address@domain.invalid wrote:

> index << doc
>
> Which appears to be caused because
>
> elsif doc.is_a?(Document)
>
> is expecting a Ferret::Document rather than
> Ferret::Document::Document. When I change this line to
>
> elsif doc.is_a?(Document::Document)

Funny, I can’t duplicate this but I’ll change it anyway as it won’t
hurt.

> Also, what’s the significance of score_doc? It appears to be just the
> doc id. Am I missing something?

No, you’re not missing anything. It’s just the document id. Just to be
clear, this is the internal id used by ferret to access the document,
not any id that you add to the document yourself.
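
The difference between the internal id and an "id" field of your own can be sketched with a toy in-memory index. ToyIndex and its methods are hypothetical and exist only for this illustration; they are not Ferret's API or how Ferret stores documents. The sketch assumes the internal id is simply the position at which a document was added.

```ruby
# Toy in-memory index (hypothetical ToyIndex class; not Ferret's API).
class ToyIndex
  def initialize
    @docs = []
  end

  # Adding a document implicitly assigns the next internal id (its position).
  def <<(doc)
    @docs << doc
    self
  end

  # Look a document up by its internal id.
  def doc(internal_id)
    @docs[internal_id]
  end

  # Yield the internal id of every document whose field equals value.
  def each_match(field, value)
    @docs.each_with_index do |d, internal_id|
      yield internal_id if d[field] == value
    end
  end
end

index = ToyIndex.new
index << { "id" => "a",   "title" => "b" }
index << { "id" => "x42", "title" => "b" }

# The block receives the internal id; the user-supplied "id" is just a field.
index.each_match("title", "b") do |internal_id|
  puts "internal id #{internal_id}, own id field #{index.doc(internal_id)["id"]}"
end
```

In this toy, the value handed to the block is only useful for fetching the document back from the same index; anything you want to identify the document by yourself has to live in a field.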