Scaling Ferret Beyond One Server

Hi Everyone,

I was wondering if folks here have had experience scaling Ferret beyond a
single server? Currently, we are running Ferret on the same physical server
as its Rails front end (via acts_as_ferret), but it is already evident that
we need a more scalable solution. How would you split up the tasks (via DRb
perhaps?) between two or three servers? Shared disk, replicated Ferret
index (?), or any other ideas?

Thanks in advance,

On 7/15/06, Andy C. [email protected] wrote:

Hi Andy,

I guess the answer depends on which part of the application is the
bottleneck. If it is Ferret then replicating the index might be the
solution but it’s complicated and I doubt that is your problem.

If Ferret is handling the workload (which it should be if you have the
C extension installed), then my guess would be to use a DRb solution.
In a few weeks I'm going to start experimenting with using Ferret with
DRb, and future versions may even come with a DRb server included. In
the meantime, let me know how you go.
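A minimal sketch of what such a DRb setup could look like. The SearchService wrapper and the in-memory "index" below are hypothetical stand-ins for illustration; a real version would wrap a Ferret::Index::Index on the search server and delegate to Ferret's own query parsing and scoring:

```ruby
require 'drb/drb'

# Hypothetical wrapper exposing search over the wire; in a real setup
# @index would be a Ferret index rather than an array of hashes.
class SearchService
  def initialize(index)
    @index = index
  end

  # Return the ids of matching documents (naive substring match here,
  # standing in for a real Ferret query).
  def search(query)
    @index.select { |doc| doc[:content].include?(query) }
          .map { |doc| doc[:id] }
  end
end

# On the dedicated search server:
docs = [
  { :id => 1, :content => 'scaling ferret with drb' },
  { :id => 2, :content => 'rails front end' }
]
DRb.start_service('druby://127.0.0.1:9321', SearchService.new(docs))

# On the Rails front-end server(s), only the URI is needed:
searcher = DRbObject.new_with_uri('druby://127.0.0.1:9321')
puts searcher.search('drb').inspect   # => [1]
```

The Rails boxes hold no index at all; they just hold a DRb URI, so adding front ends does not multiply index copies.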



Thanks for your feedback and for developing the wonderful Ferret!

Besides performance, our application requirement is to have no single point
of failure - which is why we are looking at running Ferret (at least the
search node) beyond a single server.

In the Lucene world, there's an interesting post at [email protected]/msg12709.html describing
how Technorati is doing distributed Lucene…

Our current options are (1) DRb, (2) some replication technique similar to
the one described by Doug Cutting in the above post, and (3) possibly some
form of distributed file system like hadoop (which will also serve other
needs for our app). Will let the list know how it goes. Also, interested
in hearing anybody else's experience using Ferret on more than one server.


Andy C. wrote:

(3) possibly some form of distributed file system like hadoop

Actually, hadoop is built as a distributed filesystem for workloads that
only need sequential reading of files. It's not useful for random access.
You might want to try something like MogileFS instead.
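The random-access point is the crux: a Lucene/Ferret-style index lookup seeks directly to stored term and posting offsets rather than streaming a file from the start. In plain Ruby terms (the file name, contents, and offset below are just an illustration of the access pattern):

```ruby
# An index lookup jumps straight to a known offset instead of reading
# the whole file -- exactly the pattern a sequential-read-only
# filesystem cannot serve efficiently.
File.open('postings.dat', 'wb') { |f| f.write('AAAABBBBCCCC') }

File.open('postings.dat', 'rb') do |f|
  f.seek(8)        # jump to the posting we want
  puts f.read(4)   # => "CCCC"
end
```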