Ferret or not ferret?

Hi, I have to choose a search engine for a medium-to-large site with a
lot of searches and inserts happening at the same time. Can you suggest
something? I’m thinking about Ferret, but I read that it has some
problems with this kind of workload :(

On Thu, Mar 01, 2007 at 08:18:32PM +0100, mix wrote:

Hi, I have to choose a search engine for a medium-to-large site with a
lot of searches and inserts happening at the same time. Can you suggest
something? I’m thinking about Ferret, but I read that it has some
problems with this kind of workload :(

Ferret recently had several improvements in this area (see Dave’s recent
posts about the recent release candidates).

Even if you do still run into problems with multiple processes
accessing the index, you can always set up a simple DRb server that does
the indexing and search work.
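
To make the DRb idea concrete, here is a minimal sketch using only Ruby's standard library. The SearchService class and its add/search methods are hypothetical names, and the hash it wraps is only a stand-in for the real Ferret index; the point is that one process owns the index and everybody else talks to it through a DRb proxy.

```ruby
require 'drb/drb'

# Hypothetical stand-in for the real index: a hash of id => text guarded by a
# mutex. In a real setup this class would wrap the Ferret index instead.
class SearchService
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  # Called by web processes that want to index a document.
  def add(id, text)
    @lock.synchronize { @docs[id] = text }
    true
  end

  # Returns the ids of all documents containing the term (case-insensitive).
  def search(term)
    needle = term.downcase
    @lock.synchronize do
      @docs.select { |_id, text| text.downcase.include?(needle) }.keys
    end
  end
end

# Server side: port 0 lets the OS pick a free port; DRb.uri reports it.
DRb.start_service('druby://127.0.0.1:0', SearchService.new)
server_uri = DRb.uri
# A standalone server script would now block with DRb.thread.join.

# Client side: any process that knows the URI gets a proxy object.
index = DRbObject.new_with_uri(server_uri)
index.add(1, 'Best of Open Source')
index.add(2, 'Lucene in Action')
hits = index.search('open')
puts hits.inspect # prints [1]
```

Because all reads and writes go through a single process, the web processes never open the index files themselves, which sidesteps the multi-process locking question entirely.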

Or you can have a look at acts_as_ferret, which has such a server
already built in. Not to mention that acts_as_ferret makes integrating
Ferret-based full-text search into your app a one-liner :)

Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

On 3/2/07, mix wrote:

Hi, I have to choose a search engine for a medium-to-large site with a
lot of searches and inserts happening at the same time. Can you suggest
something? I’m thinking about Ferret, but I read that it has some
problems with this kind of workload :(

Ferret is getting better and better at this. The latest released
version still has a couple of bugs, but the current working version is
very stable with multiple processes accessing the index. I’ve just
stress tested it with 10 search processes and 1 writer process for 24
hours without any problems. I will definitely have this release out
before Monday. I think the next version will be perfect for what you
are talking about.

solrb is also a good option although it will be a little slower and
you’ll have to run java on your server (not that this is a big deal).

cut

OK :)
Another question about Ferret: is it possible to offer two kinds of
search? A normal one (the search text plus one other field) and an
advanced one (with more options, of which the user can select some or
all)?

On Fri, Mar 02, 2007 at 01:44:45PM +0100, mix wrote:

cut

OK :)
Another question about Ferret: is it possible to offer two kinds of
search? A normal one (the search text plus one other field) and an
advanced one (with more options, of which the user can select some or
all)?

That’s no problem at all. You can build very complex and field-specific
queries as well as issue a simple ‘give me all docs where term xyz is in
any field’ query.
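
To illustrate the difference between the two kinds of query, here is a toy sketch in plain Ruby. It is not Ferret code: the documents, the field names and the search_any/search_fields helpers are all made up for the example. In Ferret's own query language the equivalents would, as far as I remember, look something like `*:open` and `title:open AND category:software`, but please check the QueryParser documentation rather than take that from me.

```ruby
# Toy documents: each one is a hash of field name => text.
docs = [
  { title: 'Best of Open Source', author: 'Smith', category: 'software' },
  { title: 'Lucene in Action',    author: 'Open',  category: 'search' }
]

# Lower-cased word list for one field value.
def terms(text)
  text.downcase.scan(/\w+/)
end

# "Normal" search: the term may appear in any field.
def search_any(docs, term)
  docs.select { |doc| doc.values.any? { |v| terms(v).include?(term.downcase) } }
end

# "Advanced" search: every given field => term pair must match.
# Fields the user left blank in the form are simply ignored.
def search_fields(docs, criteria)
  wanted = criteria.reject { |_field, term| term.to_s.strip.empty? }
  docs.select do |doc|
    wanted.all? { |field, term| terms(doc.fetch(field, '')).include?(term.downcase) }
  end
end

any = search_any(docs, 'open').map { |d| d[:title] }
adv = search_fields(docs, title: 'open', author: '', category: 'software').map { |d| d[:title] }
puts any.inspect # both documents: "open" is in a title and in an author field
puts adv.inspect # only "Best of Open Source"
```

The advanced form then only has to drop the criteria the user did not fill in before building the query.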

Jens


Jens K. wrote:

That’s no problem at all. You can build very complex and field-specific
queries as well as issue a simple ‘give me all docs where term xyz is in
any field’ query.

Perfect, I think I’ll go with Ferret and acts_as_ferret :)
I’ve also found this tutorial, which looks very good:
http://www.railsenvy.com/2007/2/19/acts-as-ferret-tutorial
Thanks

On Mar 1, 2007, at 2:18 PM, mix wrote:

Hi, I have to choose a search engine for a medium-to-large site with a
lot of searches and inserts happening at the same time. Can you suggest
something? I’m thinking about Ferret, but I read that it has some
problems with this kind of workload :(

I was lurking on this thread until Dave mentioned solrb. First of
all, I love Ferret. Dave is amazing, and the performance is
fantastic. I had been looking for a Lucene in Ruby for a long time,
and even started tinkering with a low-level, pure-Ruby port myself.

When Solr came along I knew this hit the sweet spot I was looking
for. It’s all the greatness of Java Lucene, which is continually and
rapidly being improved by many folks. Above and beyond just wrapping
Lucene behind an HTTP interface, it adds a ton of great features on
top: caching, replication, faceting, highlighting, and an incredibly
active community. My expertise is in Java Lucene, so it felt right
to me. We’ve started a project called solr-ruby (it used to be named
solrb, but we renamed it to be more readable and pronounceable), which
provides a Ruby API to Solr. For example (from the solr-ruby page on
the Apache Solr wiki):

    require 'solr'

    # connect to the solr instance
    conn = Solr::Connection.new('http://localhost:8983/solr', :autocommit => :on)

    # add a document to the index
    conn.add(:id => 123, :title_text => 'Lucene in Action')

    # update the document
    conn.update(:id => 123, :title_text => 'Solr in Action')

    # print out the first hit in a query for 'action'
    response = conn.query('action')
    print response.hits[0]

    # iterate through all the hits for 'action'
    conn.query('action') do |hit|
      puts hit.inspect
    end

    # delete document by id
    conn.delete(123)

On top of solr-ruby, we’ve also been building Solr Flare, a Rails-
based front-end that presents a faceted and full-text search
interface, including integration with SIMILE Exhibit and Timeline,
and eventually also having Atom feeds, saved searches, etc.

While I certainly don’t want to steal any thunder from Ferret,
because I think it is a great project, I feel compelled on this
thread to bring up what I consider a top-notch alternative to Ferret.

It would be very interesting to run some benchmarks comparing the two
at a few levels: indexing speed, plain full-text query speed, and
also most important to my work, the speed of generating facet
information along with a query.

Erik

Just one last question :)
For example, say there is a book named “best of open source”. If I
search for something like “source open”, “best source” or “source best”,
Ferret will find it, won’t it?

On Fri, Mar 02, 2007 at 06:18:14PM +0100, mix wrote:

Just one last question :)
For example, say there is a book named “best of open source”. If I
search for something like “source open”, “best source” or “source best”,
Ferret will find it, won’t it?

Usually it will. You can however construct queries that take the order
of query terms into account, if you need that.
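
A toy sketch in plain Ruby of the two behaviours, in case it helps. This is not how Ferret is implemented, and tokenize, all_terms? and phrase? are made-up helper names: the first matcher ignores term order, the second one requires the terms to appear as one consecutive run, the way a phrase query would.

```ruby
# Lower-cased word list; a real analyzer would typically also drop stop words.
def tokenize(text)
  text.downcase.scan(/\w+/)
end

# Order-independent match: every query term occurs somewhere in the title.
def all_terms?(title, query)
  words = tokenize(title)
  tokenize(query).all? { |t| words.include?(t) }
end

# Order-sensitive match: the query terms occur as one consecutive run.
def phrase?(title, query)
  words = tokenize(title)
  wanted = tokenize(query)
  return false if wanted.empty?
  words.each_cons(wanted.size).any? { |run| run == wanted }
end

title = 'best of open source'
['source open', 'best source', 'open source'].each do |q|
  puts "#{q}: all_terms=#{all_terms?(title, q)} phrase=#{phrase?(title, q)}"
end
```

Only “open source” passes the phrase check here; the other two queries match only when order is ignored.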

Jens


Jens K. wrote:

Usually it will. You can however construct queries that take the order
of query terms into account, if you need that.

Jens

Perfect :) OK, just one more: what about case sensitivity? If I have a
book “Open SOURCE”, will a search for “source” find it?
Thanks :)

David B. wrote:

Yes. You can do both case-sensitive and case-insensitive searches in
Ferret, depending on how you set up your analyzer, but searches are case
insensitive by default, so a search for “source” will find “SOURCE”.

Perfect :)
Just one last question, I promise :)
How does it cope with an index of 5-10 GB? I have to store some
information in the index so that I can use highlighting and run queries
on it.

On 3/3/07, mix wrote:

Thanks :)

Yes. You can do both case-sensitive and case-insensitive searches in
Ferret, depending on how you set up your analyzer, but searches are case
insensitive by default, so a search for “source” will find “SOURCE”.
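
As a rough illustration of why the analyzer decides this, here is a toy sketch in plain Ruby. These two lambdas are not Ferret's analyzer classes, just stand-ins: one folds everything to lower case before indexing and searching, the other keeps the original case.

```ruby
# Two toy "analyzers": one folds case, one keeps it.
lowercasing    = ->(text) { text.downcase.scan(/\w+/) }
case_sensitive = ->(text) { text.scan(/\w+/) }

# The same analyzer is applied to the document and to the query, so both
# sides are normalised (or not normalised) in exactly the same way.
def match?(analyzer, document, query)
  indexed = analyzer.call(document)
  analyzer.call(query).all? { |term| indexed.include?(term) }
end

puts match?(lowercasing, 'Open SOURCE', 'source')    # true
puts match?(case_sensitive, 'Open SOURCE', 'source') # false
```

The important part is that index time and query time use the same analysis; mixing the two is the usual reason a term that is clearly in a document cannot be found.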

On Sat, Mar 03, 2007 at 01:10:22PM +0100, marco wrote:

Perfect :)
Just one last question, I promise :)
How does it cope with an index of 5-10 GB? I have to store some
information in the index so that I can use highlighting and run queries
on it.

Try it out :) I haven’t used such a large index yet, but I think Ferret
will be able to handle it just fine.

Jens


Jens K. wrote:

Try it out :) I haven’t used such a large index yet, but I think Ferret
will be able to handle it just fine.

I hope to reach that size :)