Corrupt index immediately after rebuild

Hello,

I’m using Ferret and I’ve just attempted to build an index that contains
15,968,046 documents. I’ve rebuilt the index from scratch, but when I
try to search for some items I get this error:

IOError: IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:289 - fsi_seek_i
seeking pos -1284143798:

This is happening when I’m trying to look up a document with id
13,677,803. Interestingly, any document after id 12,098,067 seems to
trigger the error.

Any ideas?

Thanks!
-Mike

On Thu, Jan 18, 2007 at 07:42:02AM +0100, Joe Mestople wrote:

This is happening when I’m trying to look up a document with id
13,677,803. Interestingly, any document after id 12,098,067 seems to
trigger the error.

Any ideas?

maybe you hit some file size limit with your index? How large is it?

Jens



Excerpts from Jens K.'s message of Thu Jan 18 01:32:57 -0800 2007:

maybe you hit some file size limit with your index?

Also check to make sure you didn’t just run out of disk space.

Perhaps it’s the same problem as in this post:

http://www.ruby-forum.com/topic/84237#151791

There is a 2GB limit to a single index file if you don’t compile Ferret
with large-file support.
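
The negative seek position in the original error is consistent with that kind of limit. As a rough sketch (this assumes the offset was stored in a signed 32-bit integer and wrapped around, which the thread does not confirm; `unwrap_int32` is just an illustrative helper name), the reported value can be mapped back to the offset it may have started as:

```ruby
# Largest value a signed 32-bit integer can hold (2 GB minus one byte).
INT32_MAX = 2**31 - 1

# Map a (possibly negative) 32-bit value back to the unsigned offset
# it would have wrapped from. Assumption: a single 32-bit wrap-around.
def unwrap_int32(pos)
  pos % (2**32)
end

reported = -1_284_143_798          # "seeking pos" from the error message
actual   = unwrap_int32(reported)

puts actual                        # 3010823498
puts actual > INT32_MAX            # true
```

If that assumption holds, the seek was aimed at a position roughly 3.0 billion bytes into a file, which is past what a 32-bit signed offset can address.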

An alternative is to use :max_merge_docs to stop segments from being
merged once they contain a certain number of documents. Like this:

index = Index::Index.new(:path => "path",
                         :max_merge_docs => 150000)

/David W.

Joe Mestople wrote:

William M. wrote:

Excerpts from Jens K.'s message of Thu Jan 18 01:32:57 -0800 2007:

maybe you hit some file size limit with your index?

Also check to make sure you didn’t just run out of disk space.

file size is 3,711,610,109 bytes – the volume is ext3 and it has 74%
available so I don’t think it’s either running out of space or exceeding
the maximum file size.

Has anyone else run into a similar problem?
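
For reference, 3,711,610,109 bytes is above 2^31 - 1 = 2,147,483,647, so if that figure is for a single file rather than the whole index directory, it would be over the 2GB mark even though ext3 itself allows files that large. A quick way to check is to list the files in the index directory that are over the limit. This is only a sketch: `files_over_limit` and `LIMIT_2GB` are made-up names, and "path/to/index" is a placeholder, not a path from this thread.

```ruby
# Largest offset a signed 32-bit integer can hold (2 GB minus one byte).
LIMIT_2GB = 2**31 - 1

# Return [path, size] pairs for every regular file directly inside `dir`
# whose size is greater than `limit`.
def files_over_limit(dir, limit = LIMIT_2GB)
  Dir.glob(File.join(dir, "*"))
     .select { |path| File.file?(path) }
     .map    { |path| [path, File.size(path)] }
     .select { |_path, size| size > limit }
end

# Placeholder path; point this at the real index directory.
files_over_limit("path/to/index").each do |path, size|
  puts "#{path}: #{size} bytes"
end
```

If any file shows up in that list, the per-file limit discussed in this thread is the likely suspect.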

I know that the indexer hits a 2 GB limit per file, which (I believe)
comes from how Ferret is compiled.

To work around this, we optimize the index every so often while
indexing, so that we never hit the limit (the optimized files are quite
a bit smaller than the unoptimized ones).
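
A minimal sketch of that approach, assuming `index` is a Ferret::Index::Index (or anything else that responds to `<<` and `optimize`). The method name `index_with_periodic_optimize` and the `optimize_every` interval are illustrative only, not values from this thread; a suitable interval depends on document size.

```ruby
# Add documents to the index, optimizing after every `optimize_every`
# documents and once more at the end. Returns the number of documents added.
def index_with_periodic_optimize(index, docs, optimize_every = 100_000)
  count = 0
  docs.each do |doc|
    index << doc
    count += 1
    index.optimize if (count % optimize_every).zero?
  end
  index.optimize
  count
end

# Usage (requires the ferret gem; the path is a placeholder):
#   require 'ferret'
#   index = Ferret::Index::Index.new(:path => "path")
#   index_with_periodic_optimize(index, documents, 50_000)
```

Optimizing is itself expensive, so a very small interval will slow indexing down considerably.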

How many documents do you have indexed?
