Forum: Ferret Wildcard trouble

Paul L. (Guest)
on 2009-01-07 02:18
(Received via mailing list)
Hi-- I just ran into an odd situation.  If I do a search including the
term:
         c*  -  I get 4 hits
         ca* - I get the same 4 documents
         co* - I get one new document, not found by c*

Does anyone know what might be going on, or have suggestions for
debugging?

Thanks,
     --Paul

--
Paul L.
Aquilent, Inc.
National Library of Medicine (Contractor)
Jens K. (Guest)
on 2009-02-10 11:27
(Received via mailing list)
Hi Paul,

On 07.01.2009, at 01:17, Paul L. wrote:

> Hi-- I just ran into an odd situation.  If I do a search including
> the term:
>         c*  -  I get 4 hits
>         ca* - I get the same 4 documents
>         co* - I get one new document, not found by c*
>
> Does anyone know what might be going on, or have suggestions for
> debugging?

What does your full query look like? Ferret has a built-in default
limit of 512 on the number of terms that wildcard queries (and other
MultiTermQueries) can be expanded to. Any further terms matching your
pattern are dropped, keeping only the 512 most relevant terms.
You can override this value by specifying a max_terms value when
constructing the query via the API:

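require 'ferret'
# Let the wildcard expand to up to 1024 terms instead of the default 512.
query = Ferret::Search::WildcardQuery.new(:field, "c*",
                                          :max_terms => 1024)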

You might also try monkey patching the
Ferret::Search::MultiTermQuery::default_max_terms method to return
your custom limit, so you don't need to use the query API to construct
your queries (e.g. with aaf (acts_as_ferret), which doesn't reliably
work with query objects due to the DRb stuff involved).
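
To make the term limit described above concrete, here is a minimal pure-Ruby sketch of how a capped wildcard expansion could produce the symptoms Paul reports. It does not use Ferret at all: the index contents, the `expand` and `search` helpers, and the cap of 4 are hypothetical, and the rule for which terms survive the cap (the first ones in sorted order) is an assumption made only for illustration. Ferret's actual selection of the "most relevant" terms may well differ.

```ruby
# Pure-Ruby illustration of a capped wildcard expansion. NOT Ferret code.

# Hypothetical inverted index: term => ids of the documents containing it.
INDEX = {
  "cab"  => [1],
  "call" => [2],
  "cap"  => [3],
  "cat"  => [4],
  "code" => [5]
}

# Expand a trailing-star pattern such as "c*" to the matching index terms,
# keeping at most max_terms of them.
# Assumption: the survivors are simply the first ones in sorted order.
def expand(pattern, max_terms)
  prefix = pattern.chomp("*")
  INDEX.keys.sort.select { |term| term.start_with?(prefix) }.first(max_terms)
end

# Ids of the documents matched by the (possibly truncated) expansion.
def search(pattern, max_terms)
  expand(pattern, max_terms).flat_map { |term| INDEX[term] }.uniq.sort
end

puts search("c*", 4).inspect    # => [1, 2, 3, 4]     ("code" was cut off)
puts search("ca*", 4).inspect   # => [1, 2, 3, 4]
puts search("co*", 4).inspect   # => [5]              (not found by "c*")
puts search("c*", 10).inspect   # => [1, 2, 3, 4, 5]  (cap no longer reached)
```

With the cap set below the number of matching terms, the broader pattern returns fewer documents than the union of its narrower variants. Raising the cap restores the missing document, at the cost of a larger expanded query.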

It *might* also be a bug in Ferret - if the above doesn't help, can
you reproduce this with a simple test case?

cheers,
Jens

--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/     - The new free film database
Paul L. (Guest)
on 2009-03-03 00:33
(Received via mailing list)
Jens,
  I think you are right.  There are 932 terms matching c* in my data
table.  (The rest of my query is simple: just one or two other terms
without wildcards.)  I tried setting the value of default_max_terms, but
it did not seem to have any effect.  (I think I was setting it correctly,
because I tried assigning a negative number and it immediately
complained.)  However, now that I know what that is doing, I'm not sure
I want to increase the value.
  Anyway, thank you very much for your help in sorting this out.
         --Paul