Forum: Ferret Short words not indexed?

Jennyw J. (Guest)
on 2005-12-29 02:16
(Received via mailing list)
I noticed that if I have a field that contains something like "Institute
for medicine" and I search using any of these queries:

for
*for*
for~

Nothing shows up. If I search for either of the other two words, though,
the record shows up in the result set. Does this indicate that short
words like "for" are not indexed?

Thanks!

Jen
Erik H. (Guest)
on 2005-12-29 02:31
(Received via mailing list)
On Dec 28, 2005, at 7:13 PM, jennyw wrote:
> that term would show up in the result set. Does this indicate that
> short
> words like "for" are not indexed?

Jen - what analyzer are you using?

If you're using the default, it is the StandardAnalyzer, which
removes these stop words during tokenization:

     ENGLISH_STOP_WORDS = [
       "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
       "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such",
       "t", "that", "the", "their", "then", "there", "these",
       "they", "this", "to", "was", "will", "with"
     ]

Off the cuff, you should be able to adjust this to not remove any
stop words by using:

	:analyzer => StandardAnalyzer.new([])

if you're using the Index class Ferret provides.
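
To make the effect concrete, here is a minimal pure-Ruby sketch of what a stop-word filter does during tokenization. It is a toy, not Ferret's actual tokenizer: the method name `toy_analyze` and the regex-based splitting are assumptions made for illustration only. It shows why "for" never reaches the index with the default list, and why an empty list keeps it.

```ruby
# Toy illustration only: this is NOT Ferret's tokenizer, just a sketch of
# how a stop-word filter behaves at indexing time.
ENGLISH_STOP_WORDS = %w[
  a an and are as at be but by for if in into is it no not of on or s such
  t that the their then there these they this to was will with
].freeze

# Lowercase the text, split it into runs of letters, and drop stop words.
# `toy_analyze` is a hypothetical helper, not part of any library.
def toy_analyze(text, stop_words = ENGLISH_STOP_WORDS)
  text.downcase.scan(/[a-z]+/).reject { |token| stop_words.include?(token) }
end

with_stops    = toy_analyze("Institute for medicine")
# => ["institute", "medicine"]
without_stops = toy_analyze("Institute for medicine", [])
# => ["institute", "for", "medicine"]
```

With the default list, the term "for" is simply absent from the index, so no query form (exact, wildcard, or fuzzy) can match it.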

	Erik
Cameron H. (Guest)
on 2006-08-24 00:59
Erik H. wrote:

> 	:analyzer => StandardAnalyzer.new([])
>


I am having a similar problem, and I've tried implementing your
suggestion like this:

  acts_as_ferret :analyzer => Ferret::Analysis::StandardAnalyzer.new([])

in my indexed classes.

I have rebuilt the indexes, but still can't get some of these short
words to return results.

I've also found that words which have hyphens in them don't work.

Is there something else necessary in order to get this working in an
active record class?

Thanks,

Cam
David B. (Guest)
on 2006-08-24 03:54
(Received via mailing list)
On 8/24/06, Cameron H. <removed_email_address@domain.invalid> wrote:
> Ferret::Analysis::StandardAnalyzer.new([])
>
> Thanks,
>
> Cam

Hi Cam,

Get Ferret 0.9.6. This was a bug which should now be fixed. Better
yet, wait until acts_as_ferret works with Ferret 0.10. Jens K. is
already working on it and I have a few bugs to work out. As for hyphens
not working, it sounds like the same analyzer is not being used for
the queries, but you'd have to check with one of the acts_as_ferret
developers that you are using it correctly.
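
As a rough illustration of why an analyzer mismatch would break hyphenated terms, the pure-Ruby sketch below uses two hypothetical tokenizers (`split_on_hyphens` and `keep_hyphens`, both invented here and not taken from Ferret or acts_as_ferret). If the index is built with one and the query is parsed with the other, the query term never lines up with an indexed term.

```ruby
# Toy sketch, not Ferret code: two hypothetical tokenizers that treat
# hyphens differently.
def split_on_hyphens(text)
  text.downcase.scan(/[a-z0-9]+/)   # "e-mail" becomes ["e", "mail"]
end

def keep_hyphens(text)
  text.downcase.scan(/[a-z0-9-]+/)  # "e-mail" stays ["e-mail"]
end

# A query matches only if every query term exists verbatim in the index.
def matches?(indexed_terms, query_terms)
  query_terms.all? { |term| indexed_terms.include?(term) }
end

# Index time: hyphens are split.
indexed_terms = split_on_hyphens("Send an e-mail today")

# Query time with a different tokenizer: "e-mail" is not an indexed term.
mismatch   = matches?(indexed_terms, keep_hyphens("e-mail"))      # => false
# Query time with the same tokenizer: "e" and "mail" are both indexed.
consistent = matches?(indexed_terms, split_on_hyphens("e-mail"))  # => true
```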

Cheers,
Dave
Cameron H. (Guest)
on 2006-08-24 19:34
David B. wrote:
> Get Ferret 0.9.6. This was a bug which should now be fixed. Better
> yet, wait until acts_as_ferret works with Ferret 0.10. Jens K. is
> already working on it and I have a few bugs to work out. As for hyphens
> not working, it sounds like the same analyzer is not being used for
> the queries but you'd have to check with one of the acts_as_ferret
> developers that you are using it correctly.


I am currently using 0.9.6 after reading about the update somewhere
else.  I didn't have luck switching the analyzer, so I tried the
alternate solution suggested elsewhere on this forum, which was just
stripping the stop words out of the original query.  This actually seems
to work somewhat well, but doesn't solve the hyphen problem.
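
For reference, the workaround amounts to something like the sketch below. The helper name `strip_stop_words` is hypothetical, and the word list is the one quoted earlier in this thread; this is a minimal approximation of the idea, not code from Ferret or acts_as_ferret.

```ruby
# Stop words as quoted earlier in the thread.
STOP_WORDS = %w[
  a an and are as at be but by for if in into is it no not of on or s such
  t that the their then there these they this to was will with
].freeze

# Remove stop words from a whitespace-separated query before searching.
# `strip_stop_words` is a hypothetical helper written for this example.
def strip_stop_words(query, stop_words = STOP_WORDS)
  kept = query.split.reject { |word| stop_words.include?(word.downcase) }
  kept.join(" ")
end

cleaned    = strip_stop_words("Institute for medicine")  # => "Institute medicine"
only_stops = strip_stop_words("for")                     # => ""
```

Note that a query made up entirely of stop words comes back empty, so the caller has to decide what to do in that case. The sketch also ignores query operators such as `*` and `~`.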

Can you clarify what you mean about using the same analyzer for queries?
Is there any reason that hyphenated terms would not be getting indexed
and searched properly by default?

Thanks

Cam
David B. (Guest)
on 2006-08-25 18:43
(Received via mailing list)
On 8/25/06, Cameron H. <removed_email_address@domain.invalid> wrote:
> else.  I didn't have luck switching the analyzer, so I tried the
> alternate solution suggested elsewhere on this forum, which was just
> stripping the stop words out of the original query.  This actually seems
> to work somewhat well, but doesn't solve the hyphen problem.
>
> Can you clarify what you mean about using the same analyzer for queries?
> Is there any reason that hyphenated terms would not be getting indexed
> and searched properly by default?

This was in fact a bug. Thanks, Cameron. It is fixed in Subversion now.

Cheers,
Dave