John Leach (Guest)
on 2007-01-27 14:57
(Received via mailing list)

I'm adding some news articles to a keyed Ferret 0.10.14 index and
encountering quite serious instability when concurrently reading and
writing to the index, even with just 1 writer and 1 reader.

If I recreate the index without a key, concurrent reading and writing
seem to work fine (and indexing is about 10 times quicker :)

I'm testing by running my indexing script (which retrieves up to 1000
database records using ActiveRecord, adds to the index and exits) and
concurrently manually re-running a search on the index using my Rails
web interface.  This is in a dev environment with only 1 user (me) and
about 58000 docs.
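
To make the setup concrete, the indexing run is roughly shaped like the sketch below. This is an illustration only: `index_batch`, `NewsRecord` and the `:text` field contents are hypothetical names, not the actual script, and the index can be anything that responds to `<<` (a Ferret index does).

```ruby
# Illustrative sketch only: `index_batch` and `NewsRecord` are hypothetical
# names, not taken from the real indexing script.
NewsRecord = Struct.new(:id, :text)

# Adds up to `limit` records to `index` and returns how many were added.
# `index` is anything that responds to `<<` (for example a Ferret index).
def index_batch(index, records, limit = 1000)
  batch = records.first(limit)
  batch.each do |record|
    # Each document carries the ActiveRecord id, which doubles as the key.
    index << { :id => record.id, :text => record.text }
  end
  batch.size
end
```

While a batch like this is running, I re-run the same search by hand from the web interface.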

The error I get is along the lines of the following, with a different
filename each time:

IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:324 - fs_open_input
  couldn'ferret_index/development/news_article_versions/_2ih.tix: <No
such file or directory>

/usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/controllers/news_articles_controller.rb:56:in `search'

It seems to occur roughly once per batch, and usually towards the end of
the batch. I'm not using aaf.  I create my keyed index like this:

@@ferret_index = =>
  :field_infos => field_infos,
  :id_field => :id,
  :key => :id,
  :default_input_field => :text)

Unkeyed, I just drop the :key option (duh).  :id is just the
ActiveRecord id, from an auto_increment field in MySQL.

As a note, when concurrently searching on the keyed index, the number of
hits returned increases throughout the indexing process.  With a
non-keyed index, the number of hits doesn't increase until the end.

It looks to me as if, when using a keyed index, Ferret commits each
record as it is added.  When non-keyed, it commits when the Index is
closed.  That I don't get the error with non-keyed might just be because
there are fewer commits, so fewer opportunities for the "bug" to trigger.
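
To spell out that guess, here is a toy model in plain Ruby. It is not Ferret's implementation: `ToyIndex` is a made-up class, and both commit rules in it are just my assumptions about the behaviour, written down so the difference in commit counts is easy to see.

```ruby
# Toy model of the hypothesis only; this is NOT Ferret's implementation.
class ToyIndex
  attr_reader :commits

  def initialize(keyed)
    @keyed = keyed
    @pending = 0
    @commits = 0
  end

  def <<(doc)
    @pending += 1
    # Assumption: a keyed index commits after every added document.
    commit if @keyed
    self
  end

  def close
    # Assumption: an unkeyed index commits whatever is pending on close.
    commit if @pending > 0
  end

  private

  def commit
    @commits += 1
    @pending = 0
  end
end

keyed = ToyIndex.new(true)
unkeyed = ToyIndex.new(false)
1000.times do |i|
  keyed << { :id => i }
  unkeyed << { :id => i }
end
keyed.close
unkeyed.close
puts "keyed commits: #{keyed.commits}, unkeyed commits: #{unkeyed.commits}"
```

If the real behaviour is anything like this, a batch of 1000 records gives a concurrent reader about a thousand chances to hit a commit in progress on the keyed index, against one on the unkeyed index.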

Is this a bug I've come across?  Is concurrent reading/writing like
this expected to work?

I'm using Ferret 0.10.14 on Ubuntu Edgy, with "ruby 1.8.4 (2005-12-24)
[i486-linux]" and "gcc version 4.1.2 20060928"

Thanks in advance!

