Forum: Ruby on Rails ferret / hyperestraier indexing time

Carmen -. (carmen)
on 2006-06-13 00:44
Since I can't manage to get the accelerated Ferret going (the C code is
incompatible with amd64) and Hyper Estraier returns 0 results for everything
after a reboot (despite claiming its index is every bit as big as
before), I made a simple Rails inverted index that essentially just does
a find_or_create_by_word for each word, and then adds its id to a join
table linking words and documents.
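
For readers who want to see the shape of such an index, here is a minimal in-memory sketch in plain Ruby. It is not the poster's code: the class name TinyInvertedIndex and its methods are hypothetical, and two hashes stand in for the words table and the join table that the Rails version would keep in the database.

```ruby
require 'set'

# In-memory stand-in for a words table plus a words/documents join table.
# All names here are hypothetical, not taken from the original app.
class TinyInvertedIndex
  def initialize
    @word_ids = {}                                  # word => id (the "words" table)
    @postings = Hash.new { |h, k| h[k] = Set.new }  # word id => document ids (the join table)
  end

  # Similar in spirit to find_or_create_by_word: reuse the id if the word is known.
  def find_or_create_word(word)
    @word_ids[word] ||= @word_ids.size + 1
  end

  # Downcase, split into word tokens, and link each distinct word to the document.
  def add_document(doc_id, text)
    text.downcase.scan(/\w+/).uniq.each do |word|
      @postings[find_or_create_word(word)] << doc_id
    end
  end

  # Return the sorted ids of the documents that contain the word.
  def search(word)
    id = @word_ids[word.downcase]
    id ? @postings[id].to_a.sort : []
  end
end

index = TinyInvertedIndex.new
index.add_document(1, "Ferret is a search library")
index.add_document(2, "Hyper Estraier is a search engine")
p index.search("search")   # ids of both documents
p index.search("ferret")   # id of the first document only
```

Calling uniq before the loop means each distinct word is looked up once per document, which matters more when every lookup is a database query rather than a hash access.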


The only thing is it takes about 2 or 3 seconds to index a reasonably
large article, so this slows down 'add' operations, etc.

Ezra's BackgrounDRb sounds like it will hit the spot. But how do
acts_as_searchable and acts_as_ferret handle this? Are they so much
faster that indexing time is moot?
Ezra Zygmuntowicz (Guest)
on 2006-06-13 01:00
(Received via mailing list)
On Jun 12, 2006, at 3:44 PM, carmen wrote:

> the only thing is it takes about 2 or 3 seconds to index a reasonably
> large article, so this slows down 'add' operations, etc..


Carmen-

	Building an index is exactly the kind of thing that BackgrounDRb is
great for. Some people are already using it to build
their Hyper Estraier and Ferret indexes. Join the mailing list[1] and
I can help you get the hang of how to use it. Eventually I want to
set up a small repo of user-contributed worker classes for others to
use.
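
The general idea of moving index building out of the request path can be sketched with Ruby's built-in Thread and Queue. This is only an illustration of the pattern and does not use BackgrounDRb's API; the BackgroundIndexer class and the block passed to it are hypothetical.

```ruby
require 'thread'

# Hypothetical sketch: a single worker thread drains a queue of indexing jobs.
# Thread and Queue are core Ruby classes; this is not BackgrounDRb.
class BackgroundIndexer
  def initialize(&index_fn)
    @queue = Queue.new
    @worker = Thread.new do
      # Pop jobs until the :stop sentinel arrives.
      while (job = @queue.pop) != :stop
        index_fn.call(*job)
      end
    end
  end

  # Returns immediately; the slow indexing happens on the worker thread.
  def enqueue(doc_id, text)
    @queue << [doc_id, text]
    self
  end

  # Let the worker finish the jobs already queued, then stop it.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

indexed = {}
indexer = BackgroundIndexer.new do |doc_id, text|
  indexed[doc_id] = text.split.size   # stand-in for the real indexing work
end
indexer.enqueue(1, "a reasonably large article")
indexer.enqueue(2, "another article")
indexer.shutdown
p indexed
```

An in-process thread like this loses queued jobs if the application process exits, which is one reason to prefer a separate worker process for real use.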

Cheers-
-Ezra

[1] http://rubyforge.org/mailman/listinfo/backgroundrb-devel
Phillip Kast (Guest)
on 2006-06-13 01:10
(Received via mailing list)
carmen <carmen@whats-your.name> wrote:

> Since I can't manage to get the accelerated Ferret going (the C code is
> incompatible with amd64) and Hyper Estraier returns 0 results for everything
> after a reboot (despite claiming its index is every bit as big as
> before), I made a simple Rails inverted index that essentially just does
> a find_or_create_by_word for each word, and then adds its id to a join
> table linking words and documents.
>
> The only thing is it takes about 2 or 3 seconds to index a reasonably
> large article, so this slows down 'add' operations, etc.
>
> Ezra's BackgrounDRb sounds like it will hit the spot. But how do
> acts_as_searchable and acts_as_ferret handle this? Are they so much
> faster that indexing time is moot?

I'm using Hyper Estraier. With about 20K articles in the index, on my dev
box with tons of other processes running, here's sample performance:

>> a.body.split(' ').size
=> 1382
>> t1 = Time.now; a.update_index(true); Time.now - t1
=> 1.150097

Not exactly lightning fast, but not a deal breaker for me as inserts are
relatively infrequent.
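
The same kind of measurement can be taken with Benchmark.realtime from Ruby's standard library instead of subtracting Time.now values by hand. In this sketch, fake_update_index is a hypothetical stand-in for the real a.update_index(true) call, and the sample body is made up.

```ruby
require 'benchmark'

# Hypothetical stand-in for a.update_index(true): counts word occurrences.
def fake_update_index(body)
  body.split(' ').each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

# Made-up article body of 1500 words.
body = (['lorem ipsum dolor'] * 500).join(' ')

# Benchmark.realtime returns the elapsed wall-clock time of the block in seconds.
elapsed = Benchmark.realtime { fake_update_index(body) }
puts "words: #{body.split(' ').size}, seconds: #{format('%.6f', elapsed)}"
```

A single run on a busy dev box is noisy, so averaging a handful of runs gives a steadier number.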
BTW, sounds like either your app is looking for the wrong HE node, or
you had a corrupted index. Have you had that "can't find anything"
problem come up multiple times? I haven't had any trouble in testing.

phil