Forum: Rails Engines development
Announcement: Indexed Search Engine 0.1.2 Available

Lance Ball (Guest)
on 2006-01-03 21:46
(Received via mailing list)
Hello all.

Apologies...  I was a little too eager in my earlier announcement about
the Indexed Search Engine for Rails apps.  The DB migration file
contained an error that had to be worked around.  I've fixed that,
added more (and clearer) documentation, and a sample application.  You
can find most everything you want to know about Indexed Search Engine
here:

http://langwell-ball.com/indexed-search/

Indexed Search is a simple, pluggable engine for Rails applications
which can be used to enable full text indexed searches within an
application. Searchable data is parsed, stemmed using the Porter
stemmer, and added to a fully indexed table. This allows you to index
things like "he runs fast" which will be returned from a search for
"running".
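
As a rough illustration of the parse, stem, and index steps described above, here is a minimal sketch in plain Ruby. It is not the engine's code: `toy_stem` is a crude suffix stripper standing in for the Porter stemmer, the in-memory hash stands in for the indexed table, and every method name is hypothetical.

```ruby
# Crude stand-in for the Porter stemmer (NOT the real algorithm):
# strips "-ing" (undoing a doubled final consonant) or a plural "-s".
def toy_stem(word)
  w = word.downcase
  if w.length > 5 && w.end_with?("ing")
    w = w[0...-3]
    # "running" -> "runn" -> "run"
    w = w[0...-1] if w =~ /([b-df-hj-np-tv-z])\1\z/
  elsif w.length > 3 && w.end_with?("s") && !w.end_with?("ss")
    w = w[0...-1]
  end
  w
end

# Split text into lowercase alphabetic tokens.
def tokenize(text)
  text.downcase.scan(/[a-z]+/)
end

# Record doc_id under the stem of every word in the text.
def add_to_index(index, doc_id, text)
  tokenize(text).each { |token| index[toy_stem(token)] |= [doc_id] }
end

# Return the ids of documents that contain every stemmed query term.
def search(index, query)
  tokenize(query).map { |token| index.fetch(toy_stem(token), []) }.reduce(:&) || []
end

# Hash of stem => list of document ids; stands in for the indexed table.
index = Hash.new { |hash, stem| hash[stem] = [] }
add_to_index(index, 1, "he runs fast")
add_to_index(index, 2, "a slow walk")

p search(index, "running")
```

Because both "runs" and "running" reduce to the same stem, the query matches the first document even though the exact word never appears in it.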

This message has been cross-posted to the Engine Developers, Engine
Users, and Rails mailing lists.

Best
Lance Ball
Tim Lucas (Guest)
on 2006-01-03 22:19
(Received via mailing list)
On 04/01/2006, at 7:45 AM, Lance Ball wrote:

> http://langwell-ball.com/indexed-search/
>
> Indexed Search is a simple, pluggable engine for Rails applications
> which can be used to enable full text indexed searches within an
> application. Searchable data is parsed, stemmed using the Porter
> stemmer, and added to a fully indexed table. This allows you to index
> things like "he runs fast" which will be returned from a search for
> "running".

I see in the API docs it says to make the index calls from the
controller. Would it not be better to do it from an ActiveRecord
Observer?

-- tim
Roberto Saccon (rsaccon)
on 2006-01-03 22:22
(Received via mailing list)
How does it compare to Ferret? Judging from the README, it seems much
easier to set up and use than Ferret, but not as fast. If anybody has
experience with both, it would be interesting to hear.
Lance Ball (Guest)
on 2006-01-03 22:34
(Received via mailing list)
On 1/3/06, Tim Lucas <t.lucas@toolmantim.com> wrote:
> I see in the API docs it says to make the index calls from the
> controller. Would it not be better to do it from an ActiveRecord
> Observer?
>

Hi Tim

Good question.  I thought about that - and in fact, in my first pass
at this it's what I did.  The problem I have with that approach is
that you then have to have knowledge of URIs in the ActiveRecord
classes, and that seemed to break the MVC paradigm.

It was also problematic when a single view had several different
active records in it.  For example, a single view that contains course
and instructor records for a university administration system may get
that data from two different active record types.  If you put the
calls to the indexer in the ActiveRecord classes, you then have to
make multiple calls to the indexer (one from each type).  The
controller is typically aware of both anyway, so it seems to make more
sense there.

When content is indexed, the indexer wants the content, the title, and
a URI to access the content, supplied via IndexableRecord::IndexData
(http://langwell-ball.com/distributions/indexed-sea...).
To index multiple objects as a part of a single view, you can
concatenate the content from each active record, but still only
provide a single URI and title.
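
A minimal sketch of that idea, using the course and instructor example from above. The `IndexData` struct here is only a hypothetical stand-in for IndexableRecord::IndexData (the real class's constructor and field names may differ), and `Course`, `Instructor`, and `index_data_for_course_page` are invented for illustration.

```ruby
# Hypothetical stand-in for IndexableRecord::IndexData; the real class may
# take its title, URI, and content differently.
IndexData = Struct.new(:title, :uri, :content)

# Plain structs standing in for two ActiveRecord models shown in one view.
Course = Struct.new(:name, :description)
Instructor = Struct.new(:name, :bio)

# Build ONE index entry for a view that shows a course and its instructor:
# the content of both records is concatenated, but there is a single title
# and a single URI pointing at the view.
def index_data_for_course_page(course, instructor, uri)
  content = [course.name, course.description,
             instructor.name, instructor.bio].join(" ")
  IndexData.new(course.name, uri, content)
end

course = Course.new("Intro to Botany", "Plants, cells, and field work.")
instructor = Instructor.new("A. Green", "Teaches plant biology.")
data = index_data_for_course_page(course, instructor, "/courses/show/1")

puts data.title
puts data.uri
puts data.content
```

Since the controller already knows which URI renders the view, it is the natural place to build this entry; the model objects only supply text.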

Of course, there's nothing stopping you from doing it with an
observer.  The API doesn't care if it's called from the controller or
an active record.  Just add "include IndexedSearchEngine" to your
class.  Having the calls in the controller is just the convention that
I settled on for the reasons noted above.

Lance
Lance Ball (Guest)
on 2006-01-03 22:41
(Received via mailing list)
On 1/3/06, Roberto Saccon <rsaccon@gmail.com> wrote:
> How does it compare to Ferret? Judging from the README, it seems much
> easier to set up and use than Ferret, but not as fast. If anybody has
> experience with both, it would be interesting to hear.

Ferret is almost certainly faster.  I believe IndexedSearchEngine is
easier to use, however.

I wrote it because I was doing a quick demo app for work and I didn't
want to deal with setting up and learning Ferret.  I too would be
interested in hearing about others' experience with both.

Lance
M. Edward (Ed) Borasky (Guest)
on 2006-01-04 18:12
(Received via mailing list)
I haven't tried either of the Ruby-based index/search engines, but after
a bunch of testing, including big names like Lucene and Swish-e, I found
one that works well for me. I have a collection of something like 500
megabytes of PDF files (including the Pickaxe book and AWDWR :) ).

I've settled on Namazu. It's fully configured out of the box for
handling PDFs and Word documents. IIRC it's mostly in Perl, so it could
probably be ported to Ruby fairly easily. And it can handle Japanese;
that was the author's original motivation for writing it, I think -- a
lack of usable Japanese-language search engines.

Lance Ball wrote:

>easier to use, however.

--
M. Edward (Ed) Borasky

http://linuxcapacityplanning.com