Newcomer perceived problems with AAF/Ferret

I blogged about some of the problems with aaf in production yesterday,
but inspired by the positive response I thought I would share the
perceived problems and discuss some potential solutions to help
newcomers and make aaf work as if by magic, just like Rails does.

All of these problems boil down to one simple problem: running acts as
ferret in a production environment simply is not consistent with
Rails behaviour. Rails is a single-threaded system, which results in
multiple processes on a single server. Even in the basic deployment
scenario of Mongrel + Pen/Pound/Balance + Rails you have multiple
processes. AAF in this case breaks without Drb. I don’t think that how
you develop should be different from how it’s run in production. Rails
doesn’t need a thing changing to work, and that is the root cause of
the production problem with aaf/ferret.

  1. Each model needs remote => true adding to it for remoting to work.
    This is a breach of the DRY principle.
    Solution: Get rid of this requirement; if the config is there then use
    it, if it’s not then don’t.

  2. aaf needs to be installed as a plugin to get the start/stop scripts.
    Solution: Add startFerret, stopFerret Drb scripts to the gem so that
    even if it is not installed as a plugin the gem is still useable.

  3. Drb server is a central point of failure.
    Solution 1: Lucene locks the file while it works, which effectively
    allows many threads to write but serialises them. This should be the
    case with aaf as well; at least then basic parallelism will work without
    having to set up Drb or any other config settings. Sure you could
    get fancy with partial locking schemes, but right now what I care
    about is that it at least works. Drb should be the next performance
    step from Solution 1, improving performance at the cost of reliability.

    Solution 2: Support clustered behaviour of the Drb servers, write to all
    read from one just like a database.

    Solution 3: Allow the index to be stored in the database and use
    transactions to manage the updates (Just as compass does to lucene)

    Solution 4: You could write a multicast clustered behaviour system to
    make only one of the rails servers write to the file while all the
    others read from it. Probably more complex a solution than is needed.

  4. AAF fails unexpectedly if you just use it without Drb.
    Solution: Produce a warning if more than one thread/process seems to be
    updating the file at any one time. Although bear in mind Solution 1 for
    problem (3) solves this in a different way and is a better solution
    overall.
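
For illustration only, here is a minimal Ruby sketch of the file-lock
serialisation described in Solution 1 for problem (3), combined with the
warning suggested for the last problem. It uses Ruby's standard File#flock
and is not how Ferret or aaf work internally; the with_index_lock helper,
its log parameter and the lock file path are hypothetical names.

```ruby
# Hypothetical helper, not part of aaf or Ferret: serialise index writers
# across processes with an advisory lock on a separate lock file.
def with_index_lock(lock_path, log: $stderr)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # First try without blocking so contention can be reported.
    unless lock.flock(File::LOCK_EX | File::LOCK_NB)
      log.puts "warning: another writer holds #{lock_path}, waiting"
      lock.flock(File::LOCK_EX) # now wait for our turn
    end
    yield # the lock is released when the file is closed
  end
end
```

Every process that wants to update the index would wrap the write in
with_index_lock("index.lock") { ... }. Advisory locks only help if all
writers cooperate, and they do not work across several physical servers.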

Thoughts?

Hey …

thanks for your suggestions … as a quick background, I’m one of
the authors of omdb.org[1]. Jens and I wrote the search backend
for omdb, and some parts of that backend are now part of AAF …
But omdb.org still has its own search logic … basically because
searching is a complex task. I don’t see a plugin that
automagically resolves all search-related topics … however,
some of your points are valid. I’ll give my 2 cents on your
suggestions, even though omdb.org does not use AAF :slight_smile:

  1. Each model needs remote => true adding to it for remoting to work.
    This is a breach of the DRY principle.
    Solution: Get rid of this requirement; if the config is there then use
    it, if it’s not then don’t.

that’s true … basically you just need :remote => false in testing …
at least that’s how omdb is doing it …

Drb should be the next performance step from
Solution 1, improving performance at the cost of reliability.

omdb uses a queue to serialize all indexing requests. We also have
two indexes: an offline index (new index requests are added to
that) and an online index (used for searching). As soon as our queue
is empty or we have processed 256 indexing requests, we sync the online
and offline indexes. This eliminates all of our search/indexing problems
and is running very stably. Afaik, AAF is using backgroundrb for searching;
this is something I would change, as I don’t see a problem with
the mongrel processes having R/O access to the index (like the Searcher
class).
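
A rough sketch of that queue plus online/offline arrangement, in Ruby. This
is not omdb's actual code: the TwoIndexQueue class and its methods are
hypothetical, and plain hashes stand in for the two Ferret indexes. Only
the batch size of 256 and the "sync when the queue is empty" rule come from
the description above.

```ruby
require "thread"

# Hypothetical illustration; hashes stand in for real index files.
class TwoIndexQueue
  attr_reader :sync_count

  def initialize(sync_every: 256)
    @queue = Queue.new      # serialises all indexing requests
    @offline = {}           # receives new index requests
    @online = {}.freeze     # read-only copy used for searching
    @sync_every = sync_every
    @sync_count = 0
  end

  def enqueue(id, text)
    @queue << [id, text]
  end

  # Single consumer: apply requests to the offline index and publish it
  # after sync_every requests or once the queue is empty.
  def drain
    processed = 0
    until @queue.empty?
      id, text = @queue.pop
      @offline[id] = text
      processed += 1
      if processed >= @sync_every || @queue.empty?
        sync
        processed = 0
      end
    end
  end

  # Searches only ever touch the online copy.
  def search(term)
    @online.select { |_id, text| text.include?(term) }.keys
  end

  private

  def sync
    @online = @offline.dup.freeze
    @sync_count += 1
  end
end
```

Searches never see a half-written index here because the online copy is
replaced in one step; a real setup would swap index directories instead of
duplicating a hash.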

Solution 3: Allow the index to be stored in the database and use
transactions to manage the updates (Just as compass does to lucene)

That’s an interesting point, and Jens and I even talked about that
some months ago. Basically it is possible to write a new StorageEngine
for Ferret (it just supports Memory and FS storage right now), but
we never got the time to do it. I’m fine with the online/offline index
concept in the filesystem. This can even be distributed to several
servers via rsync. But a db-backend is an option.

  4. AAF fails unexpectedly if you just use it without Drb.
    Solution: Produce a warning if more than one thread/process seems to be
    updating the file at any one time. Although bear in mind Solution 1 for
    problem (3) solves this in a different way and is a better solution
    overall.

I agree with your point from earlier on, I would simply eliminate the
non-drb option :slight_smile: Just use non-drb for testing …

btw, the whole omdb/ferret integration is open source, so feel free
to take a look[2].

Ben

[1] http://www.omdb.org
[2] http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret

Hi!

On Tue, Sep 11, 2007 at 01:45:05PM +0200, Paul K. wrote:

Mongrel + Pen/Pound/Balance + Rails you have multiple processes. AAF in
this case breaks without Drb. I don’t think that how you develop should
be different from how it’s run in production. Rails doesn’t need a thing
changing to work and that is the root cause of the production problem
with aaf/ferret.

Putting an app live is always something that needs special attention,
and the DRb server is by far not the only thing you’ll need to set up
properly on your live site. Throw in memcached, monit, mongrel_cluster,
log rotation, load balancing, cron jobs and suddenly it’s not that much
additional work anymore.

  1. Each model needs remote => true adding to it for remoting to work.
    This is a breach of the DRY principle.
    Solution: Get rid of this requirement; if the config is there then use
    it, if it’s not then don’t.

I completely agree with you. The reason it is like that is more or less
backwards compatibility and the decision to make the DRb server an
option, and not the default, in the beginning, when it still was nothing
more than an experiment. Plus nobody complained about it until now :wink:

As I already wrote, I’ll change that soon.
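
The "use the config if it is there" idea could look roughly like the
following. This is a sketch and not the actual aaf change: the
remote_mode? helper is a hypothetical name, and it assumes a YAML file
with one top-level key per Rails environment (for example
config/ferret_server.yml with a production: section).

```ruby
require "yaml"

# Hypothetical helper: remote (DRb) mode is on only when the config file
# exists and has a section for the current environment.
def remote_mode?(config_path, env)
  return false unless File.exist?(config_path)

  config = YAML.load_file(config_path)
  config.is_a?(Hash) && config.key?(env.to_s)
end
```

With this in place a model would not need its own remote option; a file
that only defines production would leave development and test running
against the local index.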

  2. aaf needs to be installed as a plugin to get the start/stop scripts.
    Solution: Add startFerret, stopFerret Drb scripts to the gem so that
    even if it is not installed as a plugin the gem is still useable.

good point, however I think an install script that adds these scripts
and a template config file to a given rails project would be ok, too.

  3. Drb server is a central point of failure.
    Solution 1: Lucene locks the file while it works, which effectively
    allows many threads to write but serialises them. This should be the
    case with aaf as well; at least then basic parallelism will work without
    having to set up Drb or any other config settings. Sure you could
    get fancy with partial locking schemes, but right now what I care
    about is that it at least works. Drb should be the next performance
    step from Solution 1, improving performance at the cost of reliability.

Ferret’s inter process locking doesn’t work. Therefore Ferret is thread
safe, but not multi-process safe. And I’m afraid that’s unlikely to
change anytime soon (unless you’d like to give it a try).
I’m not sure where exactly on the way from Java to Ruby to C/Ruby this
capability got lost…

DRb based solutions have proven to be stable enough, so I don’t consider
this a real issue. Yes, you need to start the server in production
environments and not in dev mode (unless you feel like it, nothing stops
you from using the DRb in dev mode, too), but I think that’s something
people can handle. Of course we could provide more support for this,
i.e. monit configs and cap tasks for controlling the DRb server. For
starters, I changed the Wiki page to make really clear that the DRb
server is required for multi-process environments.

Another point why I wouldn’t want to invest time in file based index
access is that with multiple physical servers, file based index access
won’t be worth anything and DRb is your only option again.
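
To make the single-owner idea concrete, here is a minimal sketch using
Ruby's standard DRb library. It is not the aaf DRb server: IndexFront and
start_index_server are hypothetical names, and a hash stands in for the
Ferret index. Port 0 is used so the operating system picks a free port.

```ruby
require "drb/drb"

# Hypothetical front object: the only place that touches the "index".
class IndexFront
  def initialize
    @docs = {}
    @lock = Mutex.new # DRb serves each call in its own thread
  end

  def add(id, text)
    @lock.synchronize { @docs[id] = text }
    true
  end

  def search(term)
    @lock.synchronize do
      @docs.select { |_id, text| text.include?(term) }.keys
    end
  end
end

# Start the server; the returned DRbServer knows its final URI.
def start_index_server(uri = "druby://127.0.0.1:0")
  DRb.start_service(uri, IndexFront.new)
end
```

Each Rails process would then talk to the one index owner with something
like DRbObject.new_with_uri(server.uri).search("ferret"), whether it runs
on the same machine or on another one.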

Solution 2: Support clustered behaviour of the Drb servers, write to all
read from one just like a database.

I guess you mean ‘write to one and read from all’? Interesting and kind
of what Ben does at omdb.

Solution 3: Allow the index to be stored in the database and use
transactions to manage the updates (Just as compass does to lucene)

This is definitely an option if somebody wrote a database backend for
Ferret. Performance would be interesting. Never heard about that compass
thing, do you know how it performs when compared with Lucene’s file
system storage?

Solution 4: You could write a multicast clustered behaviour system to
make only one of the rails servers write to the file while all the
others read from it. Probably more complex a solution than is needed.

Of course we could do that :wink:

In general I see aaf more as a general-purpose fulltext search plugin
which lets you get up and running fast, than the last and best solution
to every problem that might get solved with Ferret. For sure there are
applications that profit from having their own, tailored search
implementation around Ferret. However in these cases aaf may still be a
good starting point from where a custom solution may evolve.

  4. AAF fails unexpectedly if you just use it without Drb.
    Solution: Produce a warning if more than one thread/process seems to be
    updating the file at any one time. Although bear in mind Solution 1 for
    problem (3) solves this in a different way and is a better solution
    overall.

Imho this isn’t an issue anymore once remote mode is the default.

cheers,
Jens


Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database