Forum: Ferret Using ID as Key

Tom D. (Guest)
on 2006-01-27 15:13
(Received via mailing list)
Hi,

I followed the howto to use keys for documents:

http://ferret.davebalmain.com/trac/wiki/HowTos#How...

If I add two documents with the same id, only one gets added to the
index, as expected.  However, I have found that the key and the id do
not match, so attempting to access the index with the id does not work.

For instance, when I run this search:

    INDEX.search_each(query) do |doc, score|
      logger.debug("Found doc: #{doc}, id: #{INDEX[doc]['id']}")
    end

The following is output:

Found doc: 3, id: 69
Found doc: 17, id: 88

Is this as designed or am I missing something?

Thanks,
Tom
Erik H. (Guest)
on 2006-01-27 20:34
(Received via mailing list)
On Jan 27, 2006, at 8:10 AM, Tom D. wrote:
>     INDEX.search_each(query) do |doc, score|
>       logger.debug("Found doc: #{doc}, id: #{INDEX[doc]['id']}")
>     end
>
> The following is output:
>
> Found doc: 3, id: 69
> Found doc: 17, id: 88
>
> Is this as designed or am I missing something?

The doc variable in your code is what is known in Lucene as the
document "id".  This is an internal number used by the index.  It has
no relation to the primary key feature that Ferret adds.  You've
called your field "id", which confuses things a bit.

The document id is subject to change if documents are deleted in the
middle and the index is optimized.  So don't rely on the internal
number for anything long-term.
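
To make the distinction concrete, here is a minimal sketch in plain Ruby. It is a toy model and not Ferret's implementation: the ToyIndex class, its doc_number and optimize methods, and the sample data are all hypothetical names made up for illustration. It only assumes the behaviour described in this thread: one live document per key, and an internal number that is just a position in the index and can shift once deleted slots are removed.

```ruby
# Toy model only: NOT Ferret's implementation. It mimics two ideas from the
# thread: a user-chosen key field, and an internal document number that is
# just a position in the index.
class ToyIndex
  def initialize(key:)
    @key  = key
    @docs = []          # position in this array = internal document number
  end

  # Adding a document whose key already exists marks the old one as deleted
  # and appends the new one, so only one document per key stays live.
  def <<(doc)
    @docs.each_index do |n|
      @docs[n] = nil if @docs[n] && @docs[n][@key] == doc[@key]
    end
    @docs << doc
    self
  end

  # Integer -> internal document number; String -> lookup by key field.
  def [](ref)
    case ref
    when Integer then @docs[ref]
    when String  then @docs.find { |d| d && d[@key].to_s == ref }
    end
  end

  # Internal number currently assigned to the document with this key.
  def doc_number(key)
    @docs.index { |d| d && d[@key].to_s == key.to_s }
  end

  # Drop deleted slots; every later document gets a smaller internal number.
  def optimize
    @docs.compact!
    self
  end
end

index = ToyIndex.new(key: :id)
index << { id: 69, data: "first version" }
index << { id: 88, data: "another document" }
index << { id: 69, data: "second version" }   # replaces the first id 69

before = index.doc_number(69)   # internal number before optimizing
index.optimize
after  = index.doc_number(69)   # internal number after optimizing

puts "key 69: doc number #{before} before optimize, #{after} after"
puts index["69"][:data]
```

In this model the key 69 always finds the same logical document, while its internal number depends on what else was added or removed. That is why the key, not the number yielded by the search, is the thing to store long-term.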

	Erik
Tom D. (Guest)
on 2006-01-28 00:12
(Received via mailing list)
Hi Erik,

Thanks for your response.  Perhaps I am misunderstanding the howto,
but it implies that when you create an index and map the key to the id
as follows:

  index = Index::Index.new(:key => :id)
  index << {:id => 23, :data => "This is the data..."}
  index << {:id => 23, :data => "This is the new data..."}

Then you can access this document by using either of the following:

  index["23"] # Get document with key 23
  index[23]   # Get document with internal number 23. It is NOT the key
              # field. It is just the internal Ferret id.

This implies that the id and key are the same, but according to my
first email example, they are not.  Is this howto just misleading?
Based on what you said, the internal number will not necessarily match
the key.

Tom
David B. (Guest)
on 2006-01-28 06:10
(Received via mailing list)
Hi Tom,

I can see how this would be confusing. The internal id and the id you
give a document are unrelated. They will only coincide, as they do in
the howto, when you add documents in order starting with id 0. I'll
change the howto to remove the confusion.

Cheers,
Dave
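
The coincidence Dave describes can be sketched in a few lines of plain Ruby. This is not Ferret code: the array position stands in for the internal document number, :id is the user-supplied key, and matching_keys is a hypothetical helper written only for this illustration.

```ruby
# Hypothetical stand-in for an index: array position plays the role of the
# internal document number, and :id is the user-supplied key.
def matching_keys(docs)
  docs.each_with_index
      .select { |doc, number| doc[:id] == number }
      .map { |doc, _number| doc[:id] }
end

in_order     = [{ id: 0 }, { id: 1 }, { id: 2 }]
out_of_order = [{ id: 69 }, { id: 88 }, { id: 2 }]

p matching_keys(in_order)       # keys that happen to equal their position
p matching_keys(out_of_order)
```

Only documents added in order from id 0 line up with their position. In the second list a single key matches, and only by accident.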