Forum: Ferret Index::Index.new vs. Readers and Writers

Shanti Braford (sbraford)
on 2006-05-09 00:35
Hey gang,

A post on the Rails forum a while back made it sound like you pretty much
had to use the Index Readers & Writers if you were going to be
potentially accessing an index from more than one process (e.g.
multiple dispatch.fcgi's).

Is this still the case, or does the main Index class do that black magic
behind the scenes?  =)

I was having trouble implementing the Readers & Writers so I thought I'd
post an example stub of what I have here.  Any feedback would be much
appreciated.

# Non-Reader/Writer Example - Main Index::Index.new only
#   Works like a charm, but I haven't tried firing up several processes
#   to see whether we get IO blocking.

require 'ferret'

class SearchEngine
  include Ferret
  include Ferret::Document

  def self.get_index()
    index_dir = "/var/search/index"

    index = Index::Index.new(:path => index_dir,
                           :create_if_missing => true)
    return index
  end
end

# Reader/Writer Example

require 'ferret'

class SearchEngine
  include Ferret
  include Ferret::Document

  # Opens the index as a writer or a reader, depending on type
  # ('writer' or 'reader').
  def self.get_index(type = 'writer')
    index_dir = "/var/search/index"
    case type
    when 'writer'
      Index::IndexWriter.new(index_dir, :create_if_missing => true)
    when 'reader'
      Index::IndexReader.open(index_dir, false)
    else
      raise ArgumentError, "unknown index type: #{type}"
    end
  end
end


Thanks!!

- Shanti
David Balmain (Guest)
on 2006-05-09 03:23
(Received via mailing list)
Hi Shanti,

When you have multiple processes accessing the index, it's not a matter
of which class you use but how many processes you have writing to the
index. The recommended way to do things is to have only one process
writing to the index. You can have as many index readers open as you
like. The trouble is that the IndexWriter obtains a commit lock on the
index. If another IndexWriter comes along and tries to obtain the lock
at the same time, it will raise an exception. The same thing goes for
using the Index class, as it is really just a simple interface to the
IndexWriter and IndexReader classes.
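
To make the locking behaviour above concrete, here is a rough sketch using only Ruby's standard library. It is not Ferret's implementation; the LockFile class and the commit.lock file name are made up for illustration. It only shows the general idea of an exclusive lock where a second would-be writer gets an exception.

```ruby
require 'tmpdir'

# Hypothetical stand-in for an index commit lock: a lock file that only one
# holder can create at a time. This is NOT how Ferret itself is written.
class LockFile
  class AlreadyLocked < StandardError; end

  def initialize(path)
    @path = path
    @held = false
  end

  # Create the lock file atomically; raise if someone else already holds it.
  def obtain
    File.open(@path, File::WRONLY | File::CREAT | File::EXCL) do |f|
      f.write(Process.pid.to_s)
    end
    @held = true
  rescue Errno::EEXIST
    raise AlreadyLocked, "lock already held: #{@path}"
  end

  # Remove the lock file, but only if this object created it.
  def release
    File.delete(@path) if @held
    @held = false
  end
end

Dir.mktmpdir do |dir|
  path   = File.join(dir, "commit.lock")
  first  = LockFile.new(path)
  second = LockFile.new(path)

  first.obtain
  begin
    second.obtain                      # a second writer arrives too early
  rescue LockFile::AlreadyLocked => e
    puts "second writer refused: #{e.message}"
  end

  first.release
  second.obtain                        # fine once the first holder lets go
  second.release
end
```

Readers never call obtain in this sketch, which mirrors the point that any number of readers can be open while only one writer holds the lock.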

One possibility is to use the Index class with :autoflush set to true.
This should work most of the time, as the IndexWriter class will keep
trying for 5 seconds (this is broken in the C versions of 0.9.0 and
0.9.1) to gain the commit lock, so if it misses the first time it should
eventually get it. This is an easy way to do things but it's still
dangerous. I'd recommend using a single IndexWriter as described above.
That doesn't mean you have to use the IndexWriter and IndexReader
classes. You can still use the Index class as long as only one Index is
doing the writing.
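
The "keep trying for 5 seconds" behaviour can be pictured roughly like this. Again this is a standard-library sketch and not Ferret code: try_lock, with_retries, the lock file name and the polling interval are all assumptions made for the example.

```ruby
require 'tmpdir'

# Hypothetical helper: try once to create a lock file; true on success,
# false if another process already created it.
def try_lock(path)
  File.open(path, File::WRONLY | File::CREAT | File::EXCL) { }
  true
rescue Errno::EEXIST
  false
end

# Call the block repeatedly until it returns a truthy value or the timeout
# (in seconds) runs out. Returns the block's value, or nil on timeout.
def with_retries(timeout: 5.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

Dir.mktmpdir do |dir|
  lock = File.join(dir, "commit.lock")
  File.write(lock, "")                                    # another writer holds the lock
  releaser = Thread.new { sleep 0.2; File.delete(lock) }  # and lets go shortly after

  got_it = with_retries(timeout: 5.0) { try_lock(lock) }
  releaser.join
  puts(got_it ? "lock obtained after waiting" : "gave up after timeout")
end
```

The danger with relying on this is visible in the sketch: if the other writer holds the lock for longer than the timeout, the waiting writer simply gives up, which is why a single writer is the safer design.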

I hope that helps. Stay tuned for much better documentation on this.

Dave
Shanti Braford (sbraford)
on 2006-05-10 04:09
Hi David,

Thanks for the heads up re: index readers & writers.

Just one more question:  how do you search an Index in read-only mode?

The :autoflush option sounds like a viable backup scenario as well, but
I couldn't find anything in the docs about it.  (I tried passing it to
the index via something like Index::Index.new(:autoflush => true), but
it didn't like that either.)

Cheers,

- Shanti
David Balmain (Guest)
on 2006-05-10 04:49
(Received via mailing list)
Hi Shanti,

It's :auto_flush, not :autoflush. Sorry for the confusion.

Dave
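
For anyone landing here later, the corrected option name combined with the constructor call from the first post might look something like the sketch below. The helper names index_options and open_index are made up for illustration. The option keys (:path, :create_if_missing, :auto_flush) and the index path are the ones that appear in this thread, and what :auto_flush does is as described in the replies above rather than something verified here.

```ruby
# Options for Index::Index.new, using the keys mentioned in this thread.
# Note the underscore: :auto_flush, not :autoflush.
def index_options(path = "/var/search/index")
  {
    :path              => path,
    :create_if_missing => true,
    :auto_flush        => true
  }
end

# Hypothetical helper: opens the index with the options above.
# Requires the ferret gem to be installed; nothing is loaded until called.
def open_index(path = "/var/search/index")
  require 'ferret'
  Ferret::Index::Index.new(index_options(path))
end
```

Even with :auto_flush on, the earlier advice still applies: keep the writing in a single process where possible.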