Hello all.

Apologies... I was a little too eager in my earlier announcement about the Indexed Search Engine for Rails apps. The DB migration file contained an error that had to be worked around. I've fixed that, and added more (and clearer) documentation and a sample application. You can find almost everything you want to know about Indexed Search Engine here:

http://langwell-ball.com/indexed-search/

Indexed Search is a simple, pluggable engine for Rails applications which can be used to enable full-text indexed searches within an application. Searchable data is parsed, stemmed using the Porter stemmer, and added to a fully indexed table. This allows you to index things like "he runs fast", which will then be returned from a search for "running".

This message has been cross-posted to the Engine Developers, Engine Users, and Rails mailing lists.

Best,
Lance B.
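To make the stemming idea concrete, here is a rough, self-contained sketch in plain Ruby. It is not the engine's code: `naive_stem` is a crude suffix stripper standing in for the Porter stemmer, and `ToyIndex` is a hypothetical in-memory index standing in for the indexed table. It only shows why "he runs fast" can match a search for "running": both words reduce to the same stem.

```ruby
require "set"

# Crude stand-in for a real stemmer (this is NOT the Porter algorithm).
# It strips a trailing "ing" or "s" and collapses a doubled final consonant.
def naive_stem(word)
  stem = word.downcase
  stem = stem.sub(/ing\z/, "") if stem.length > 5
  stem = stem.sub(/s\z/, "") if stem.length > 3
  stem.sub(/([b-df-hj-np-tv-z])\1\z/, '\1')
end

# Hypothetical in-memory inverted index: stem => set of URIs.
class ToyIndex
  def initialize
    @postings = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Record every stem found in the content against the given URI.
  def add(uri, content)
    tokens(content).each { |stem| @postings[stem] << uri }
  end

  # Return the URIs whose content contains every stem in the query.
  def search(query)
    hits = tokens(query).map { |stem| @postings.fetch(stem, Set.new) }
    return [] if hits.empty?
    hits.reduce(:&).to_a.sort
  end

  private

  # Split text into words and stem each one.
  def tokens(text)
    text.scan(/[A-Za-z]+/).map { |word| naive_stem(word) }
  end
end

index = ToyIndex.new
index.add("/notes/1", "he runs fast")
index.add("/notes/2", "she walks slowly")
puts index.search("running").inspect
```

Here "runs" and "running" both stem to "run", so the search finds /notes/1. A real stemmer handles far more cases than this toy does.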
on 2006-01-03 22:46
on 2006-01-03 23:19
On 04/01/2006, at 7:45 AM, Lance B. wrote:
> http://langwell-ball.com/indexed-search/
>
> Indexed Search is a simple, pluggable engine for rails applications
> which can be used to enable full text indexed searches within an
> application. Searchable data is parsed, stemmed using the Porter
> stemmer, and added to a fully indexed table. This allows you to index
> things like "he runs fast" which will be returned from a search for
> "running".

I see in the API docs it says to make the index calls from the controller. Would it not be better to do it from an ActiveRecord Observer?

-- tim
on 2006-01-03 23:22
How does it compare to Ferret? Just from the README, it seems to be much easier to set up and use than Ferret, but not as fast. If anybody has experience with both, it would be interesting to hear.
on 2006-01-03 23:34
On 1/3/06, Tim L. <email@example.com> wrote:
> I see in the API docs it says to make the index calls from the
> controller. Would it not be better to do it from an ActiveRecord
> Observer?
>

Hi Tim,

Good question. I thought about that, and in fact it's what I did in my first pass at this. The problem I have with that approach is that the ActiveRecord classes then need knowledge of URIs, and that seemed to break the MVC paradigm.

It was also problematic when a single view had several different active records in it. For example, a single view that contains course and instructor records for a university administration system may get that data from two different active record types. If you put the calls to the indexer in the ActiveRecord classes, you then have to make multiple calls to the indexer (one from each type). The controller is typically aware of both anyway, so it seems to make more sense there.

When content is indexed, the indexer wants the content, the title, and a URI to access the content, supplied via IndexableRecord::IndexData (http://langwell-ball.com/distributions/indexed-sea...). To index multiple objects as part of a single view, you can concatenate the content from each active record, but still provide only a single URI and title.

Of course, there's nothing stopping you from doing it with an observer. The API doesn't care if it's called from the controller or an active record. Just add "include IndexedSearchEngine" to your class. Having the calls in the controller is just the convention that I settled on for the reasons noted above.

Lance
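As a rough illustration of the "one title, one URI, concatenated content" idea, here is a sketch in plain Ruby. It does not use the engine's actual API: `IndexEntry`, `build_index_entry`, the `Course` and `Instructor` structs, and the example URI are all hypothetical names made up for this example.

```ruby
# Hypothetical stand-ins for two ActiveRecord models shown in one view.
Course = Struct.new(:name, :description)
Instructor = Struct.new(:name, :bio)

# Hypothetical container for what gets indexed: one title, one URI,
# and the combined text of everything the view displays.
IndexEntry = Struct.new(:title, :uri, :content)

# Join the text fragments from several records into a single entry.
def build_index_entry(title, uri, *text_fragments)
  IndexEntry.new(title, uri, text_fragments.compact.join(" "))
end

course = Course.new("Intro to Botany", "Plants and how they grow")
instructor = Instructor.new("Dr. Green", "Teaches botany and ecology")

# The controller knows both records and the URI of the view,
# so it can assemble the single entry in one place.
entry = build_index_entry(
  course.name,
  "/courses/show/1",
  course.description,
  instructor.name,
  instructor.bio
)
puts entry.content
```

With an observer on each model, the same view would instead produce two separate indexing calls, and each model would need to know the view's URI.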
on 2006-01-03 23:41
On 1/3/06, Roberto S. <firstname.lastname@example.org> wrote:
> How does it compare to ferret ? Just from the README it seems to be much
> easier to setup and to use than ferret, but not as fast. If anybody has
> experience with both, would be interesting to hear.

Ferret is almost certainly faster. I believe IndexedSearchEngine is easier to use, however. I wrote it because I was doing a quick demo app for work and I didn't want to deal with setting up and learning Ferret. I too would be interested in hearing about others' experience with both.

Lance
on 2006-01-04 19:12
I haven't tried either of the Ruby-based index/search engines, but after a bunch of testing, including big names like Lucene and Swish-e, I found one that works well for me. I have a collection of something like 500 megabytes of PDF files (including the Pickaxe book and AWDWR :) ). I've settled on Namazu. It's fully configured out of the box for handling PDFs and Word documents. IIRC it's mostly in Perl, so it could probably be ported to Ruby fairly easily. And it can handle Japanese; that was the author's original motivation for writing it, I think -- a lack of usable Japanese-language search engines.

Lance B. wrote:
> easier to use, however.

--
M. Edward (Ed) Borasky
http://linuxcapacityplanning.com