Strange error. Index corrupt on production server

We’ve been running Ferret for a few months on our site with great
results. But just a moment ago the index suddenly became corrupt.

It all started with this error message:

:108250 is out of range [0…108183] for IndexWriter#[]
/usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:382:in `[]'

And after that every search resulted in this error:

A IOError occurred in search#rss:

IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:323 - fs_open_input
couldn’t create InStream
/home/newsdesk_prod/current/config/…/index/production/pressrelease/_2tap.fdt:

/usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.9/lib/ferret/index.rb:679:in `initialize'

I couldn’t find any other solution than rebuilding the index, which
takes a few hours…

Has anyone experienced anything similar? Is there any way to repair the
index without rebuilding it?

Thanks a lot for any help or advice!

/David W.

Hi David,

The same thing just happened to me yesterday. The response I received
on this list is that there is no way to fix it other than rebuilding
the index. As a result, I had to disable searching on my site as I
look for an alternative. I personally am looking into another
solution for searching since I can’t afford to have my index become
corrupt even once. Ferret is great when it works, though, so I may
revisit it in the future.

Good luck,
Tom

On 11/24/06, David W. [email protected] wrote:



Tom D.

http://atomgiant.com
http://gifthat.com

Excerpts from David W.'s mail of 24 Nov 2006 (PST):

And after that every search resulted in this error:

A IOError occurred in search#rss:

IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:323 - fs_open_input
couldn’t create InStream
/home/newsdesk_prod/current/config/…/index/production/pressrelease/_2tap.fdt:

I’ve now encountered this error as well. For me, rebuilding the index is
also not an option, not because I don’t have the time, but because I
have significant state stored in the index that doesn’t exist
anywhere else on disk. (How good an idea that is, is admittedly
arguable.) For example, the user-assigned labels and the read/unread
status of each message in Sup are stored in the index and nowhere else.

I believe this problem occurs when Ferret crashes in the middle of
writing segment info out to disk. The segment is recorded as existing in
$INDEX/segments, but not all the actual corresponding files are written
to disk.
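
A minimal sketch of how such dangling segments could be spotted from outside Ferret. It assumes the segment names have already been obtained some other way (reading $INDEX/segments is not shown, since its format isn't described in this thread). The function names are hypothetical, and the check uses POSIX access() rather than Ferret's own Store API.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if <index_dir>/<segment><ext> exists, 0 otherwise.
 * Hypothetical helper; not part of Ferret. */
static int segment_file_exists(const char *index_dir, const char *segment,
                               const char *ext)
{
    char path[1024];
    int len;

    assert(index_dir && segment && ext);
    len = snprintf(path, sizeof path, "%s/%s%s", index_dir, segment, ext);
    if (len < 0 || (size_t)len >= sizeof path) {
        return 0; /* path did not fit: treat the file as missing */
    }
    return access(path, F_OK) == 0;
}

/* Counts the listed segments that have no .cfs file and warns about each.
 * The caller supplies the segment names. */
static int count_dangling_segments(const char *index_dir,
                                   const char *const *segments, int n)
{
    int i, dangling = 0;

    for (i = 0; i < n; i++) {
        if (!segment_file_exists(index_dir, segments[i], ".cfs")) {
            fprintf(stderr, "WARNING: segment %s has no .cfs file\n",
                    segments[i]);
            dangling++;
        }
    }
    return dangling;
}
```

Running count_dangling_segments() over the index directory before opening the index would give a rough idea of how many segments are recorded but incomplete.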

The attached patch allows Ferret to recover from this occurrence, at the
expense of whatever documents were involved in the broken segments. I
don’t know if it’s possible to recover those. In my case, five
segments were affected, each holding one document. Whether that is
typical or not I don’t know.

Dave B: is it sufficient to check for the existence of a .cfs file for a
valid segment? In my case it worked, but I don’t understand much of
what’s under the hood here.
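
One reason the .cfs check alone might not be enough: the header declares si_uses_compound_file(), and the original error named a missing .fdt file, which suggests a segment can also exist as separate per-segment files. Below is a rough sketch of a more permissive check, written against the plain filesystem rather than Ferret's Store. The function names are hypothetical, and the list of per-segment extensions is deliberately left to the caller, because only .cfs and .fdt are mentioned in this thread and I don't want to guess the rest.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if <dir>/<segment><ext> exists, 0 otherwise.
 * Hypothetical helper; not part of Ferret. */
static int file_exists(const char *dir, const char *segment, const char *ext)
{
    char path[1024];
    int len;

    assert(dir && segment && ext);
    len = snprintf(path, sizeof path, "%s/%s%s", dir, segment, ext);
    if (len < 0 || (size_t)len >= sizeof path) {
        return 0; /* path did not fit: treat the file as missing */
    }
    return access(path, F_OK) == 0;
}

/* A segment is treated as complete if its compound file exists, or if
 * every one of the caller-supplied per-file extensions exists.
 * Which extensions a non-compound segment needs is an assumption the
 * caller has to provide. */
static int segment_looks_complete(const char *dir, const char *segment,
                                  const char *const *exts, int n_exts)
{
    int i;

    if (file_exists(dir, segment, ".cfs")) {
        return 1;
    }
    if (n_exts <= 0) {
        return 0; /* no compound file and nothing else to check */
    }
    for (i = 0; i < n_exts; i++) {
        if (!file_exists(dir, segment, exts[i])) {
            return 0;
        }
    }
    return 1;
}
```

Whether Ferret can actually open a segment that passes this check is a separate question; the sketch only confirms that the expected files are present.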

Index: src/index.c
--- src/index.c (revision 686)
+++ src/index.c (working copy)
@@ -449,6 +449,13 @@
     free(si);
 }
 
+bool si_valid_on_disk(const char *name, Store *store) {
+    char file_name[SEGMENT_NAME_MAX_LENGTH];
+    sprintf(file_name, "%s.cfs", name);
+    return store->exists(store, file_name);
+}
+
 bool si_has_deletions(SegmentInfo *si)
 {
     char del_file_name[SEGMENT_NAME_MAX_LENGTH];
@@ -621,7 +628,12 @@
     for (i = 0; i < seg_cnt; i++) {
         name = is_read_string(is);
         doc_cnt = is_read_vint(is);
-        sis_add_si(sis, si_new(name, doc_cnt, store));
+        if (si_valid_on_disk(name, store)) {
+            sis_add_si(sis, si_new(name, doc_cnt, store));
+        }
+        else {
+            fprintf(stderr, "WARNING: error opening segment %s (%d docs); ignoring\n", name, doc_cnt);
+        }
     }
     is_close(is);

Index: include/index.h
--- include/index.h (revision 686)
+++ include/index.h (working copy)
@@ -169,6 +169,7 @@
 
 extern SegmentInfo *si_new(char *name, int doc_cnt, Store *store);
 extern void si_destroy(SegmentInfo *si);
+extern bool si_valid_on_disk(const char *name, Store *store);
 extern bool si_has_deletions(SegmentInfo *si);
 extern bool si_uses_compound_file(SegmentInfo *si);
 extern bool si_has_separate_norms(SegmentInfo *si);