Forum: Ferret Corrupt index and segfaults with heavy writes?

Seth J. Morabito (Guest)
on 2007-01-10 06:55
(Received via mailing list)
Hi everyone,

We're running a fairly heavily used Rails app that uses ferret (and
acts_as_ferret) for search.  We're running on mongrel+Apache, Ruby
1.8.4, and ferret 0.10.13.  We're indexing a handful of attributes on
our "Image" and "User" models.

After the system has been running for several days, the index gradually
becomes corrupted, and ferret begins to segfault once the corruption is
bad enough.  We're running with ten mongrel servers balanced behind
Apache, so it takes a while before they all die due to the segfaulting.

The segfault spits out the following line into mongrel.log:

/usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:271: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i686-linux]

I haven't done a ton of deeper investigation, but we suspect it may be
related to locking problems.  As I said, we're getting fairly heavy use,
and every time a user views an Image, a view counter is updated and the
Image is saved.  This causes acts_as_ferret to re-add the model to the
index, so the index is getting heavy write use.  As a side effect of
this, we see a lot of locking errors in the logs, which cause 500 errors
for our users:

  Ferret::Store::Lock::LockError (Lock Error occured at <except.c>:103 in xpop_context
  Error occured in index.c:5368 - iw_open
          Couldn't obtain write lock when opening IndexWriter


Eventually, we start seeing corruption errors like these (as an
example):


  End-of-File Error occured at <except.c>:79 in xraise
  Error occured in compound_io.c:123 - cmpdi_read_i
          Tried to read past end of file. File length is <9> and tried to read to <19>


And then boom, mongrel processes start to die, slowly.

IF the locking is leading to corruption problems, one thing that would
really help is if we didn't update the index on every write.  We're not
searching on the image view counter, so this might end up being more of
an acts_as_ferret question than a ferret question  (i.e., it'd be nice
to tell acts_as_ferret not to reindex the model if we're not updating an
attribute we search on!).  But that aside, has anyone else encountered
problems with heavy writing?

Thanks much,

-Seth
Jens Kraemer (Guest)
on 2007-01-10 09:30
(Received via mailing list)
Hi!

On Tue, Jan 09, 2007 at 09:47:03PM -0800, Seth J. Morabito wrote:
> Hi everyone,
>
[..]
>
> IF the locking is leading to corruption problems, one thing that would
> really help is if we didn't update the index on every write.  We're not
> searching on the image view counter, so this might end up being more of
> an acts_as_ferret question than a ferret question  (i.e., it'd be nice
> to tell acts_as_ferret not to reindex the model if we're not updating an
> attribute we search on!).

if you're on aaf trunk, this is possible:

model_instance.disable_ferret  # will disable ferret for the next save
model_instance.save

or

model_instance.disable_ferret do # ferret is disabled for all saves
  model_instance.save            # occurring inside the block
end
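
For the view-counter case from the original post, one way to decide when to
skip the index is to compare only the attributes that are actually indexed.
The sketch below is plain Ruby and does not call ferret or acts_as_ferret;
INDEXED_FIELDS, indexed_fields_changed? and the sample attribute hashes are
hypothetical names made up for illustration.

```ruby
# Hypothetical list of indexed attributes; in a real app this would mirror
# whatever fields the model hands to the search index.
INDEXED_FIELDS = [:filename, :creator].freeze

# Returns true if any indexed field differs between the two attribute hashes.
def indexed_fields_changed?(before, after, fields = INDEXED_FIELDS)
  fields.any? { |f| before[f] != after[f] }
end

# Sample data (made up): only the view counter changes.
before = { :filename => "cat.jpg", :creator => "seth", :view_count => 41 }
after  = before.merge(:view_count => 42)

needs_reindex = indexed_fields_changed?(before, after)
puts "reindex needed: #{needs_reindex}"
```

When the check returns false, the save could be done with ferret disabled as
shown above, so a view-count bump never opens an IndexWriter.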

> But that aside, has anyone else encountered problems with heavy
> writing?

Yes, we've had the very same errors in an application not using aaf.
Moving all the indexing into a single backgroundrb process fixed it; since
then everything has been fine. I have a drb indexing feature for aaf in the
works, too.
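
The single-writer idea can be sketched in plain Ruby as follows. This is only
an illustration of the pattern, under loud assumptions: it uses threads inside
one process rather than a separate backgroundrb or DRb process, and FakeIndex
and SingleWriter are hypothetical stand-ins, not ferret or aaf classes. With
several mongrel processes, the queue would have to live in a separate process
that all of them talk to.

```ruby
require 'thread'

# Stand-in for the real index; a Hash keeps the sketch runnable without ferret.
class FakeIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def update(id, fields)
    @docs[id] = fields
  end
end

# All index writes go through one queue and are applied by one worker thread,
# so two writers never touch the index at the same time.
class SingleWriter
  def initialize(index)
    @index = index
    @queue = Queue.new
    @worker = Thread.new do
      while (job = @queue.pop) != :stop
        id, fields = job
        @index.update(id, fields)
      end
    end
  end

  # Called by any number of producers; Queue is thread-safe.
  def enqueue(id, fields)
    @queue << [id, fields]
  end

  # Processes everything already queued, then stops the worker.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

index = FakeIndex.new
writer = SingleWriter.new(index)

# Ten concurrent producers, loosely mirroring ten app servers.
producers = (1..10).map do |n|
  Thread.new { writer.enqueue(n, { :filename => "image_#{n}.jpg" }) }
end
producers.each(&:join)
writer.shutdown
```

The design choice is that requests only enqueue work and never open a writer
themselves, which removes the contention for the write lock.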

cheers,
Jens


--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer@webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66
Ewout (Guest)
on 2007-01-10 10:39
(Received via mailing list)
Hi,

>
>IF the locking is leading to corruption problems, one thing that would
>really help is if we didn't update the index on every write.  We're not
>searching on the image view counter, so this might end up being more of
>an acts_as_ferret question than a ferret question  (i.e., it'd be nice
>to tell acts_as_ferret not to reindex the model if we're not updating an
>attribute we search on!).

acts_as_ferret(:fields => [:filename, :creator, ...])

With this you can control which fields ferret indexes. Leaving out
fields you never search full-text reduces the indexing overhead.

Regards,
Ewout