Sphinx vs ferret

Running memory-hungry or long-running processes in a shared hosting
environment is sure to get you kicked out quickly. Is there something
you desperately need that MySQL full-text indexes won’t give you? If
MySQL won’t cut it, then you probably need to move to a VPS.
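
For anyone weighing the MySQL route, here is a minimal sketch of what a full-text condition could look like from Ruby. The helper name `fulltext_condition` and the `title`/`body` columns are made up for illustration, and it assumes a FULLTEXT index already exists on exactly those columns (on MySQL versions of this era that also means a MyISAM table).

```ruby
# Hypothetical helper, not part of Rails or any plugin: builds a
# parameterised MySQL full-text condition for the given columns.
# Column names are interpolated directly, so they must come from
# trusted code, never from user input.
def fulltext_condition(columns, query)
  raise ArgumentError, "no columns given" if columns.empty?
  ["MATCH (#{columns.join(', ')}) AGAINST (?)", query]
end

condition = fulltext_condition(%w[title body], "sphinx ferret")
puts condition.inspect
```

In Rails 2.x-style code an array like this could be passed as the `:conditions` option of a finder, which leaves the quoting of the search string to ActiveRecord.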

AEM

On Wed, Mar 26, 2008 at 2:36 PM, Harry S. <
[email protected]> wrote:



Adrian Esteban Madrid
Lead Developer, Prefab Markets
http://www.prefabmarkets.com

@Adrian

“Is there something that fulltext mysql indexes won’t give you that you
desperately need? If MySQL won’t cut it then you probably need to move
into a VPS.”

Well, that is a good question, and one I was wondering about myself.
Basically, the answer is that acts_as_ferret (aaf) was so easy to run
that it would be a pity to go without it for searching across different
models and different fields.

By the way I do not really understand why ferret could not use the db to
write its index (performance issue?). At least the db knows how not to
corrupt a file system.

H

Hey, I have used all three of Ferret, Sphinx and Solr, in development
as well as in production environments.

If you want to stay out of the debate about which search engine to
use, avoid troubleshooting your search feature, keep maintenance close
to zero and still get good indexing and search speed (see the pros and
cons below), I would suggest you go for Solr and the acts_as_solr
plugin.

I have compiled some points from my experience with RoR to date.

Ferret:-
Advantages:-

  1. Easy to implement.
  2. Indexing on ActiveRecord save - it hooks into the life cycle of an
    object.

Disadvantages:-

  1. Corrupts indexes if used with transactions in your app because of
    its after_update filter. (It updates the index before the actual save
    to the database.)
  2. Unstable on the production server if you use a load-balancing
    technique such as round-robin and you have Mongrel instances on
    different machines. (Running a separate DRb server is an added
    burden.)
  3. Faster at indexing but slower at searching.
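
To make the transaction point concrete, here is a toy sketch in plain Ruby. It is not Ferret or ActiveRecord code: `ToyDatabase` and `ToyIndex` are invented stand-ins, and it only illustrates one reading of the problem, namely that an index written inside a transaction is not undone when the transaction rolls back.

```ruby
# Toy stand-in for a database with rollback. NOT ActiveRecord.
class ToyDatabase
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Runs the block; restores the previous rows if the block raises.
  def transaction
    snapshot = @rows.dup
    yield
  rescue StandardError
    @rows = snapshot
  end
end

# Toy stand-in for an on-disk search index. NOT Ferret.
class ToyIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def add(id, text)
    @docs[id] = text
  end
end

db = ToyDatabase.new
index = ToyIndex.new

db.transaction do
  db.rows[1] = "draft post"
  index.add(1, "draft post") # index is written inside the transaction
  raise "something fails later in the transaction"
end

puts db.rows.key?(1)    # false: the row was rolled back
puts index.docs.key?(1) # true: the index still lists a row that is gone
```

The database can undo its own write, but the index lives outside it and keeps the stale entry, so the two drift apart.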

Sphinx:-
Advantages:-

  1. Great at speed of indexing and searching.
  2. It’s at the database level, so just one copy of the indexes, unlike
    Ferret.

Disadvantages:-

  1. Difficult to integrate as compared to Ferret or Solr.
  2. You have to write a lot of SQL code in the configuration file for
    indexing and searching data.
  3. Not hooked with the ActiveRecord save or the life cycle of an object,
    so you need a cron job to rebuild the index periodically.

Solr:-
Advantages:-

  1. Easy to implement.
  2. Runs on a separate Java server (the Solr server), so just one copy
    of the indexes.
  3. Hooked up with the object life cycle, so the index updates with
    ActiveRecord save.
  4. Good speed at indexing and searching.
  5. No gem required, no engine installation… just get the
    acts_as_solr plugin.
  6. Built-in support for highlighting search keywords like you see in
    Google Search, and many more advanced features.
  7. NONE of the disadvantages mentioned above.

Disadvantages:-

  1. It costs you some extra memory, but not an unbearable amount.
    (I would say that memory is cheaper nowadays, so you can afford it.)

Personally, I would suggest you go for the acts_as_solr plugin.

You could also refer to the following links:-

http://blog.aisleten.com/2007/04/14/getting-started-with-acts_as_solr/

If you decide to use acts_as_solr on Windows, this would be helpful:-

On 03/04/2008, at 4:55 PM, Adhiraj Rankhambe wrote:

Sphinx:-
Advantages:-

  1. Great at speed of indexing and searching.
  2. It’s at the database level, so just one copy of the indexes, unlike
    Ferret.

Disadvantages:-

  1. Difficult to integrate as compared to Ferret or Solr.

Arguable, but each to their own.

  2. You have to write a lot of SQL code in the configuration file for
    indexing and searching data.

This very much depends on the plugin you use. I’m reasonably sure this
isn’t required for Ultrasphinx, and it’s definitely not for Thinking
Sphinx (my own plugin, as mentioned earlier in this thread - yes, I’ve
got some level of bias).

  3. Not hooked with the ActiveRecord save or the life cycle of an
    object, so you need a cron job to rebuild the index periodically.

Yeah, that’s pretty much true. Both of the above plugins support delta
indexes, so model changes are automatically put into the live indexes,
but regular periodic reindexing is still needed.
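
As a rough illustration of the delta idea, here is a toy sketch in plain Ruby. It is not how Sphinx, Ultrasphinx or Thinking Sphinx are implemented: `ToySearch` and its methods are invented names, and the hashes stand in for real indexes. It only shows the shape of the approach: a large main index rebuilt now and then, plus a small delta index that picks up changes in between.

```ruby
# Toy model of a main index plus a delta index. Purely illustrative.
class ToySearch
  def initialize
    @main = {}  # rebuilt only by reindex
    @delta = {} # receives changes made since the last reindex
  end

  # Called when a record changes; only the small delta is touched.
  def record_change(id, text)
    @delta[id] = text
  end

  # Searches both indexes; delta entries win over stale main entries.
  def search(term)
    @main.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end

  # The periodic (e.g. cron-driven) rebuild: fold everything into main.
  def reindex(all_rows)
    @main = all_rows.dup
    @delta = {}
  end

  def delta_size
    @delta.size
  end
end

rows = { 1 => "ferret notes", 2 => "sphinx notes" }
search = ToySearch.new
search.reindex(rows)

rows[3] = "sphinx delta"
search.record_change(3, rows[3])

puts search.search("sphinx").inspect # [2, 3]
puts search.delta_size               # 1
search.reindex(rows)
puts search.delta_size               # 0
```

In this toy model the delta only grows until the next full rebuild, which is one plausible reason periodic reindexing stays necessary.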

Solr:-

Disadvantages:-

  1. It costs you some extra memory, but not an unbearable amount.
    (I would say that memory is cheaper nowadays, so you can afford it.)
  2. It’s Java - which is extra overhead for some people - I certainly
    don’t use any other Java tools, and I’ve not dealt with Java since
    Uni. Again, each to their own, but that may push non-Java people away
    from Solr.

Cheers


Pat
e: [email protected] || m: 0413 273 337
w: http://freelancing-gods.com || p: 03 9386 0928
discworld: http://ausdwcon.org || skype: patallan

  1. It’s Java - which is extra overhead for some people - I certainly
    don’t use any other Java tools, and I’ve not dealt with Java since
    Uni. Again, each to their own, but that may push non-Java people away
    from Solr.

You don’t need to know Java to use the acts_as_solr plugin.

You just install the plugin and build the index for the first time.
From then on, you just have to start the Solr server by issuing a
command:
rake solr:start

Now tell me where’s Java?

That’s pretty much it.

Indeed, but unfortunately I am on MySQL and not willing to change
right now…

For Ferret on a shared host, there is this approach, which could serve
as a temporary solution.
http://boonedocks.net/mike/archives/151-Rails-acts_as_ferret-without-DRb.html

H

Peter V. wrote:

Adrian M. wrote:

Don’t even try running ferret on a shared host. I don’t think you really
have any other option but MySQL fulltext indexes in a shared hosting
environment.

You might take a look at tsearch2 on PostgreSQL (for a shared host
solution). IIRC, it only requires special indexes in the database, but
no daemon process (as e.g. Sphinx does). This was mentioned higher up
in this thread too, by Ericson S…

I did some experiments with tsearch2 and it worked OK (but then I
switched to Sphinx, mainly because MySQL is more common as a Rails
back-end and because a clean and complete plug-in (Ultrasphinx) was
available). In older versions of PostgreSQL it is a plug-in; since 8.3
(IIRC) it is built in by default.

HTH,

Peter

On 10/04/2008, at 7:48 PM, Adhiraj Rankhambe wrote:

rake solr:start.

Now tell me where’s Java?

That’s pretty much it.

Sorry - first off, I know nothing about acts_as_solr or Solr. My Java
comment was in reference to the latter though: since you mentioned it
runs on ‘a separate Java server’, I assumed that’s all it runs in.

Ultrasphinx works only on Rails 2.0.
Granted, this is a problem if you’re shoehorning Sphinx into an
existing app - but I’m guessing most people starting new projects
would be using 2.0 (or even edge in preparation for 2.1?)

Cheers


Pat

  1. You have to write a lot of sql code in the configuration file for
    indexing and searching data.

This very much depends on the plugin you use. I’m reasonably sure this
isn’t required for Ultrasphinx, and it’s definitely not for Thinking
Sphinx (my own plugin, as mentioned earlier in this thread - yes, I’ve
got some level of bias).

Ultrasphinx works only on Rails 2.0.