Need clarification of documentation

Hi, I have a question about the delete() method docs.

I am re-indexing data on the fly so I would like to delete any existing
indexed data for a particular resource before re-indexing it using
index.delete(id).

The delete() method API doc says:

"Delete the document referenced by the document number id if id is an
integer or all of the documents which have the term id if id is a term…

id: The number of the document to delete"

I am a little confused by what this means. At the time of deletion all
I have is my own ID of the resource, which was previously indexed in
Ferret with my own field :id. If I supply my own ID, will the correct
indexed data be deleted? Or does this ID refer to Ferret's own internal
ID for the resource?

One other question while I am on the subject: will deleting a resource
that does not exist raise an error? I ask because I would like to
index new data structures that haven’t been indexed before, and I would
like to avoid checking the index first to see whether the resource
exists before attempting to delete it.

Thanks,

Chad.

On 3/5/07, Chad T. wrote:

integer or all of the documents which have the term id if id is a term…

id: The number of the document to delete"

I am a little confused by what this means.

Is this any clearer?

# Deletes a document/documents from the index. The method for determining
# the document to delete depends on the type of the argument passed.
#
# If +arg+ is an Integer then delete the document based on the internal
# document number.
#
# If +arg+ is a String then search for the documents with +arg+ in the
# +id+ field. The +id+ field is either :id or whatever you set the
# :id_field parameter to when you create the Index object.

At the time of deletion all
I have is my own ID of the resource, which was previously indexed in
Ferret with my own field :id. If I supply my own ID, will the correct
indexed data be deleted? Or does this ID refer to Ferret's own internal
ID for the resource?

In this case, since your id is probably an integer you will need to
convert it to a string or Ferret will delete the documents by internal
document number rather than your own ID for the resource.
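As a minimal sketch of that conversion: the helper below is hypothetical (delete_by_resource_id is not part of Ferret) and assumes only that the index object responds to delete as described above. Calling to_s means an integer ID from your application reaches the index as a String, so it is matched against the id field instead of being treated as an internal document number.

```ruby
# Hypothetical helper, not part of Ferret: always delete by the
# application's own ID rather than by internal document number.
def delete_by_resource_id(index, resource_id)
  # to_s guarantees a String argument, so the delete goes through the
  # id field even when the application's ID is an Integer.
  index.delete(resource_id.to_s)
end

# Usage, assuming `index` is a Ferret index and 42 is your own record ID:
#   delete_by_resource_id(index, 42)   # calls index.delete("42")
```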

One other question while I am on the subject: will deleting a resource
that does not exist raise an error? I ask because I would like to
index new data structures that haven’t been indexed before, and I would
like to avoid checking the index first to see whether the resource
exists before attempting to delete it.

Yes, if you delete by internal document number. No, if you are
deleting by term, i.e. passing your own document id, which is stored in
the id field. So in your case you should be fine. I should also
mention that you can set the :key parameter to :id:

index = Ferret::Index::Index.new(:key => :id)

This way, whenever you add a document with an id that already exists
in the index it will replace the existing document.

For example:

require 'rubygems'
require 'ferret'

index = Ferret::I.new(:key => :id)

[
  {:id => '1', :text => 'one'},
  {:id => '2', :text => 'Two'},
  {:id => '3', :text => 'Three'},
  {:id => '1', :text => 'One'}
].each {|doc| index << doc}

puts index.size                       # => 3
puts index['1'].load.inspect          # => {:text=>"One", :id=>"1"}
puts index.search('id:1').to_s(:text)
    # => TopDocs: total_hits = 1, max_score = 1.287682 [
    #            3 "One": 1.287682
    #    ]
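If the index was not created with :key, the delete-then-add approach from the original question could be sketched roughly as follows. The reindex helper is hypothetical (it is not a Ferret method) and assumes only the two calls used in this thread: delete with a String argument, and << to add a document.

```ruby
# Hypothetical helper, not part of Ferret: replace whatever is indexed
# under doc[:id] with doc.
def reindex(index, doc)
  id = doc[:id].to_s              # String, so deletion is by the id field
  index.delete(id)                # per the reply above, deleting by term
                                  # does not raise if nothing matches
  index << doc.merge(:id => id)   # store the id as a String as well
end

# Usage, assuming `index` is a Ferret index:
#   reindex(index, {:id => 1, :text => 'One'})
```

With :key => :id set on the index, the explicit delete should be unnecessary, since adding a document with an existing id replaces the old one.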

Hope that helps,
Dave

Thanks Dave, that has cleared everything up for me. Excellent engine by
the way, thanks for all your hard work on this.
