Forum: Ferret Short words not indexed?

Jennyw (jennyw)
on 2005-12-29 01:16
(Received via mailing list)
I noticed that if I have a field that contains something like "Institute
for medicine" and I search using any of these queries:

for
*for*
for~

Nothing shows up. If I search for either of the other two words, though,
that term would show up in the result set. Does this indicate that short
words like "for" are not indexed?

Thanks!

Jen
Erik Hatcher (Guest)
on 2005-12-29 01:31
(Received via mailing list)
On Dec 28, 2005, at 7:13 PM, jennyw wrote:
> that term would show up in the result set. Does this indicate that
> short
> words like "for" are not indexed?

Jen - what analyzer are you using?

If you're using the default, it is the StandardAnalyzer, which
removes these stop words during tokenization:

     ENGLISH_STOP_WORDS = [
       "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
       "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such",
       "t", "that", "the", "their", "then", "there", "these",
       "they", "this", "to", "was", "will", "with"
     ]

Off the cuff, you should be able to adjust this to not remove any
stop words by using:

	:analyzer => StandardAnalyzer.new([])

if you're using the Index class Ferret provides.
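To see why "for" never matches, here is a minimal sketch of what stop-word removal does during tokenization. It is a toy illustration in plain Ruby, not Ferret's implementation: the `analyze` helper and its simple letter-based tokenizer are assumptions made up for this example. Only the stop-word list comes from the message above.

```ruby
# Toy illustration only: this is NOT Ferret's implementation.
# The stop-word list is the one quoted above.
ENGLISH_STOP_WORDS = %w[
  a an and are as at be but by for if in into is it no not of on or s such
  t that the their then there these they this to was will with
].freeze

# Hypothetical helper: lowercase the text, split it into runs of letters,
# and drop every token that appears in the stop-word list.
def analyze(text, stop_words = ENGLISH_STOP_WORDS)
  text.downcase.scan(/[a-z]+/).reject { |token| stop_words.include?(token) }
end

# With the default list, "for" never reaches the index.
puts analyze("Institute for medicine").inspect
# => ["institute", "medicine"]

# With an empty list (the idea behind StandardAnalyzer.new([])), it is kept.
puts analyze("Institute for medicine", []).inspect
# => ["institute", "for", "medicine"]
```

Since the token "for" is discarded before anything is written to the index, no query form (for, *for*, for~) can find it.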

	Erik
Cameron Hickey (cameronhickey)
on 2006-08-23 22:59
Erik Hatcher wrote:

> 	:analyzer => StandardAnalyzer.new([])
>


I am having a similar problem, and I've tried implementing your
suggestion like this:

  acts_as_ferret :analyzer => Ferret::Analysis::StandardAnalyzer.new([])

in my indexed classes.

I have rebuilt the indexes, but still can't get some of these short
words to return results.

I've also found that words which have hyphens in them don't work.

Is there something else necessary in order to get this working in an
active record class?

Thanks,

Cam
David Balmain (Guest)
on 2006-08-24 01:54
(Received via mailing list)
On 8/24/06, Cameron Hickey <rails@thepattern.net> wrote:
> Ferret::Analysis::StandardAnalyzer.new([])
>
> Thanks,
>
> Cam

Hi Cam,

Get Ferret 0.9.6. This was a bug which should now be fixed. Better
yet, wait until acts_as_ferret works with Ferret 0.10. Jens Kraemer is
already working on it and I have a few bugs to work out. As for hyphens
not working, it sounds like the same analyzer is not being used for
the queries but you'd have to check with one of the acts_as_ferret
developers that you are using it correctly.
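The point about using the same analyzer for queries can be sketched with a toy inverted index. This is plain Ruby written for illustration, not Ferret or acts_as_ferret code: the two lambdas, `build_index`, `search`, and the sample documents are all hypothetical. It assumes an index-time analyzer that splits on hyphens and a query-time analyzer that keeps them.

```ruby
# Toy illustration only; hypothetical analyzers, not Ferret classes.
SPLIT_ON_HYPHEN = ->(text) { text.downcase.scan(/[a-z0-9]+/) }
KEEP_HYPHEN     = ->(text) { text.downcase.scan(/[a-z0-9-]+/) }

# Build a tiny inverted index: term => list of document ids.
def build_index(docs, analyzer)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    analyzer.call(text).uniq.each { |term| index[term] << id }
  end
  index
end

# A document matches only if it contains every term the query analyzer emits.
def search(index, query, analyzer)
  terms = analyzer.call(query)
  return [] if terms.empty?
  terms.map { |term| index.fetch(term, []) }.reduce(:&)
end

DOCS = { 1 => "A state-of-the-art index", 2 => "Plain text search" }.freeze
INDEX = build_index(DOCS, SPLIT_ON_HYPHEN)

# Mismatched analyzers: the query term "state-of-the-art" was never indexed.
p search(INDEX, "state-of-the-art", KEEP_HYPHEN)      # => []

# Same analyzer on both sides: the query is split the same way as the text.
p search(INDEX, "state-of-the-art", SPLIT_ON_HYPHEN)  # => [1]
```

The terms stored in the index and the terms produced from the query have to come out of the same tokenization, otherwise a lookup can miss even though the text is there.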

Cheers,
Dave
Cameron Hickey (cameronhickey)
on 2006-08-24 17:34
David Balmain wrote:
> Get Ferret 0.9.6. This was a bug which should now be fixed. Better
> yet, wait until acts_as_ferret works with Ferret 0.10. Jens Kraemer is
> already working on it and I have a few bugs to work out. As for hyphens
> not working, it sounds like the same analyzer is not being used for
> the queries but you'd have to check with one of the acts_as_ferret
> developers that you are using it correctly.


I am currently using 0.9.6 after reading about the update somewhere
else.  I didn't have luck switching the analyzer, so I tried the
alternate solution suggested elsewhere on this forum, which was just
stripping the stop words out of the original query.  This actually seems
to work somewhat well, but doesn't solve the hyphen problem.
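
A rough sketch of that workaround, for anyone trying the same thing. The `strip_stop_words` helper and the constant name are made up for this example, and it assumes a plain whitespace-separated query. It does not understand query syntax such as wildcards, fuzzy markers, or quoted phrases.

```ruby
# Hypothetical helper, not part of Ferret or acts_as_ferret.
# The word list is the one Erik posted earlier in this thread.
QUERY_STOP_WORDS = %w[
  a an and are as at be but by for if in into is it no not of on or s such
  t that the their then there these they this to was will with
].freeze

# Remove stop words from a whitespace-separated query before searching.
# Comparison is case-insensitive; the remaining words keep their case.
def strip_stop_words(query, stop_words = QUERY_STOP_WORDS)
  query.split.reject { |word| stop_words.include?(word.downcase) }.join(" ")
end

puts strip_stop_words("Institute for medicine")
# => Institute medicine
```

A query made up only of stop words comes back as an empty string, so the caller needs to handle that case before running the search.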

Can you clarify what you mean about using the same analyzer for queries?
Is there any reason that hyphenated terms would not be getting indexed
and searched properly by default?

Thanks

Cam
David Balmain (Guest)
on 2006-08-25 16:43
(Received via mailing list)
On 8/25/06, Cameron Hickey <rails@thepattern.net> wrote:
> else.  I didn't have luck switching the analyzer, so I tried the
> alternate solution suggested elsewhere on this forum, which was just
> stripping the stop words out of the original query.  This actually seems
> to work somewhat well, but doesn't solve the hyphen problem.
>
> Can you clarify what you mean about using the same analyzer for queries?
> Is there any reason that hyphenated terms would not be getting indexed
> and searched properly by default?

This was in fact a bug. Thanks Cameron. It is fixed in subversion now.

Cheers,
Dave