I am reaching out to get some help/ideas about our current search
implementation.
We are currently using Ferret as our search engine and it’s not keeping
up with our needs… we are quite needy ;). We have 5m+ documents
that total around 8-10 GB of data, and indexing can take up to half a
day, which is obviously not where we want to be.
I’ve been looking into Solr and Sphinx and I’m wondering if anyone
has any thoughts on which would best suit our needs. What are the
pros/cons of each?
Another search related question I have is how would one go about
classifying documents based on content? Does Solr or Sphinx have this
capability? Even better question is how do I go about displaying
categories (and their counts) for a specific search term? For example if
I search for ‘hat’ I want a category browse like the following:
pros/cons of each?
We would have to know what your needs are for that. So far I have
understood that you have a particular volume of files you want to
text-index and search. I have no idea what kinds of documents you have
and what searches you want to be able to do (only words, combinations
etc.). Even with your live example I’m not really seeing things more
clearly (might be due to my lack of knowledge about Ferret and their
website being unresponsive to me).
Another search related question I have is how would one go about
classifying documents based on content? Does Solr or Sphinx have this
capability?
Thanks for the reply. As for your question on what type of documents we
are indexing, the answer is that it’s very similar to that site I
mentioned, or an eBay-like site.
So there are a bunch of items that people are selling/buying and I would
like a user to be able to search those items against title, description,
keywords etc.
Then you shouldn’t be searching ‘documents’ at all; you should have
those attributes as separate fields in the DB with proper indexes.
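
To make the "separate fields" idea concrete, here is a minimal sketch of how category counts for a search term such as ‘hat’ could come straight out of the database. Everything in it is an assumption for illustration: the `items` table, its columns, the sample rows, and the use of an in-memory SQLite database are hypothetical, not the poster's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical items table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, title TEXT, description TEXT, category TEXT)"
    )
    # Index the attribute we group by, as suggested above.
    conn.execute("CREATE INDEX idx_items_category ON items (category)")
    conn.executemany(
        "INSERT INTO items (title, description, category) VALUES (?, ?, ?)",
        [
            ("Red wool hat", "Warm winter hat", "Clothing"),
            ("Baseball hat", "Signed team cap", "Sports Memorabilia"),
            ("Straw hat", "Summer sun hat", "Clothing"),
            ("Garden hose", "Ten metre hose", "Garden"),
        ],
    )
    return conn


def category_counts(conn, term):
    """Return (category, count) pairs for items whose title or description contains term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT category, COUNT(*) AS n
        FROM items
        WHERE title LIKE ? OR description LIKE ?
        GROUP BY category
        ORDER BY n DESC, category ASC
        """,
        (pattern, pattern),
    )
    return rows.fetchall()


conn = build_demo_db()
for category, n in category_counts(conn, "hat"):
    print(f"{category} ({n})")
```

One caveat on this sketch: a `LIKE` pattern with a leading wildcard generally cannot use an ordinary index, so at millions of rows the text match itself would probably still need a full-text index. Solr offers this kind of per-category count as faceted search.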