Hello!
I'm using the German stemming analyzer to index a database using
acts_as_ferret. I have some trouble with wildcard queries, which I use
extensively (and need) for autocompleter fields.
The problem is the following:
In this example, the indexed model contains street names. Some of these
names are:
* Alte Bürger
* Alter Fährweg
* Am Alten Vorhafen
* ...
So lots of names which Street#find_with_ferret could match. Let's try:
# Street.find_with_ferret "al*"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", ...]
Fine so far. Next:
# Street.find_with_ferret "alt*"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", ...]
Now let's add another letter:
# Street.find_with_ferret "alte*"
-> []
Whoops, nothing there. It should match the same entries as before. It
looks like this happens to every word added to the index through a stemming
analyzer. The same term without a wildcard works:
# Street.find_with_ferret "alte"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", ...]
Something similar happens with other search terms:
--> The database contains "Rasenweg" ("weg" is stripped away by the analyzer
and is also a stopword)
# Street.find_with_ferret "rasen*"
-> [] # <-- unexpected
# Street.find_with_ferret "rasen"
-> ["Rasenweg"] # <-- expected
# Street.find_with_ferret "ras*"
-> ["Rasenweg"] # <-- expected
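To make my suspicion concrete, here is a toy model in plain Ruby of what I think is going on in the "alte*" case. It does not use Ferret at all. toy_stem, build_index, term_search and prefix_search are names I made up, and the suffix stripping is only a stand-in for the real German stemmer. It also assumes that wildcard terms are compared against the stored terms without being run through the analyzer, which I have not verified for Ferret.

```ruby
# Toy model only: this is NOT Ferret's code and NOT the real German stemmer.
# toy_stem, build_index, term_search and prefix_search are made-up names.

# Strip a few common German endings so "alte", "alter", "alten" all become "alt".
def toy_stem(token)
  token.sub(/(en|er|e)\z/, "")
end

# Map each stemmed term to the street names that contain it.
def build_index(names)
  index = Hash.new { |hash, key| hash[key] = [] }
  names.each do |name|
    name.downcase.split.each { |token| index[toy_stem(token)] << name }
  end
  index
end

# Plain term query: the query text is stemmed the same way as the documents.
def term_search(index, query)
  index.fetch(toy_stem(query.downcase), [])
end

# Wildcard query (assumption): the prefix is compared with the stored terms
# as-is, without stemming.
def prefix_search(index, prefix)
  index.select { |term, _| term.start_with?(prefix.downcase) }.values.flatten.uniq
end

STREETS = ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen"]
INDEX = build_index(STREETS)

p prefix_search(INDEX, "al")   # all three streets
p prefix_search(INDEX, "alt")  # all three streets
p prefix_search(INDEX, "alte") # [] because the index only holds the stem "alt"
p term_search(INDEX, "alte")   # all three streets, the query is stemmed to "alt"
```

If this model is right, the index never contains a term starting with "alte", so the wildcard has nothing to match, while the plain query is reduced to "alt" before the lookup.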
How can I fix this or how is this usually handled? I need to do queries
like this:
# Street.find_with_ferret "(alte~ bü~)||(alte*bü*)"
and it should return "Alte Bürger" in the results. This works when I
reformulate the query to:
# Street.find_with_ferret "alte~||bü~||alte*bü*"
but this returns results that are far too inaccurate.
on 2010-12-02 17:30