Forum: Ferret Inconsistent results when using wild card queries

David W. (Guest)
on 2007-07-04 16:05
We get some unexpected results when using wild card queries. We're using
aaf and Ferret 0.11.4

For example, when searching on part of a colleague's name (kristofer)
and limiting it to a specific source_id:

Query: source_id:25 AND kri*
Result: 2 documents. Neither of them contains the word kristofer, only other
matching words such as "kring" and "kringå" (Swedish).

Query: source_id:25 AND kris*
Result: 0 documents.

Query: source_id:25 AND krist*
Result: 12 documents. Works as expected.

The index contains about 200 000 documents in total. I've tried
rebuilding and optimizing it, but the results stay the same.

Has anyone else experienced something similar? Any ideas how to fix it?

Thanks!

/David W.
unknown (Guest)
on 2007-07-05 13:05
(Received via mailing list)
David W. <removed_email_address@domain.invalid> writes:

Hi,

> Has anyone else experienced something similar? Any ideas how to fix it?

Unfortunately I've also experienced that kind of weirdness, and most of
the time it has to do with accented characters.
I'm unable to match a single é if I search for *é* (while it works
with wordwithé).

If I search for e it highlights a single e, but it doesn't for a single
a...

Sorry to say that, but at the moment I'm considering using another
search engine (since I also have very weird unresolved issues with
highlighting).
I'm looking at Xapian at the moment.
Jens K. (Guest)
on 2007-07-05 13:43
(Received via mailing list)
On Thu, Jul 05, 2007 at 11:05:36AM +0200, removed_email_address@domain.invalid 
wrote:
> David W. <removed_email_address@domain.invalid> writes:
>
> Hi,
>
> > Has anyone else experienced something similar? Any ideas how to fix it?
>
> Unfortunately I've also experienced that kind of weirdness, and most of
> the time it has to do with accented characters.
> I'm unable to match a single é if I search for *é* (while it works
> with wordwithé).

I don't know if this is acceptable for you in terms of result exactness,
but you might consider replacing accented characters with their
ASCII counterparts during analysis.
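
To illustrate the idea, here is a minimal sketch in plain Ruby. It is not Ferret's API: fold_accents is a hypothetical helper, and wiring it into an analyzer is left out. The same folding would have to be applied to the query text as well, otherwise kringå in a query would no longer match the folded kringa in the index.

```ruby
# Hypothetical helper, not part of Ferret: fold accented letters
# to their unaccented base letters.
def fold_accents(text)
  # NFD decomposition splits "é" into "e" plus a combining accent;
  # \p{Mn} matches those combining marks so they can be dropped.
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts fold_accents('kringå')     # => kringa
puts fold_accents('wordwithé')  # => wordwithe
```

Letters without a canonical decomposition, such as ß, pass through unchanged, so a real setup may need an explicit mapping table on top of this.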

> If I search for e it highlights a single e, but it doesn't for a single
> a...

Wild guess: maybe this is because a is a stopword and e isn't?
In general, highlighting 'e' works, as does highlighting 'a', as long as
you use an analyzer with an empty stopword list:

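require 'ferret'
include Ferret
# An empty stopword list keeps single-letter words such as 'a' in the index.
i = I.new :analyzer => Analysis::StandardAnalyzer.new([])
# A plain string is indexed into the default input field (:id here).
i << 'A tree in the woods'
i << 'Some sentence with e'
# Arguments: query, document number, options.
i.highlight 'a', 0, :field => :id
# => ["<b>A</b> tree in the woods"]
i.highlight 'e', 1, :field => :id
# => ["Some sentence with <b>e</b>"]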


Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
removed_email_address@domain.invalid | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
unknown (Guest)
on 2007-07-05 14:20
(Received via mailing list)
Jens K. <removed_email_address@domain.invalid> writes:

>> > Has anyone else experienced something similar? Any ideas how to fix it?
>>
>> Unfortunately I've also experienced that kind of weirdness, and most of
>> the time it has to do with accented characters.
>> I'm unable to match a single é if I search for *é* (while it works
>> with wordwithé).
>
> I don't know if this is acceptable for you in terms of result exactness,
> but you might consider replacing accented characters with their
> ASCII counterparts during analysis.

Thanks for your quick answers, Jens.
It could be acceptable, but the highlighting problems I've discovered
are stopping me from doing any further development.
Unfortunately I don't have time to fix them myself and Dave seems very
busy. :(

Sorry if it sounds like whinging :)

Cheers
Jens K. (Guest)
on 2007-07-05 14:32
(Received via mailing list)
On Thu, Jul 05, 2007 at 12:19:52PM +0200, removed_email_address@domain.invalid 
wrote:
> > ASCII counterparts during analysis.
>
> Thanks for your quick answers, Jens.
> It could be acceptable, but the highlighting problems I've discovered
> are stopping me from doing any further development.
> Unfortunately I don't have time to fix them myself and Dave seems very
> busy. :(

If you really want to switch, have you considered acts_as_solr? Its API is
much like aaf's.


Jens

unknown (Guest)
on 2007-07-05 18:56
(Received via mailing list)
Jens K. <removed_email_address@domain.invalid> writes:

> If you really want to switch, have you considered acts_as_solr? Its API is
> much like aaf's.

I certainly would if I were OK with using Java :) (but I'm not).
At the moment, I'm considering Hyper Estraier and Xapian.
If there were a Python API + Rails plugin (with as many features
as Ferret), that would be perfect :)
I haven't really looked or tested yet :)
Jens K. (Guest)
on 2007-07-05 19:02
(Received via mailing list)
On Thu, Jul 05, 2007 at 04:55:59PM +0200, removed_email_address@domain.invalid 
wrote:
> Jens K. <removed_email_address@domain.invalid> writes:
>
> > If you really want to switch, have you considered acts_as_solr? Its API is
> > much like aaf's.
>
> I certainly would if I were OK with using Java :) (but I'm not).

AFAIR you need no Java skills to get Solr running; however, you'll need
some spare server resources, that's for sure ;-)

> At the moment, I'm considering Hyper Estraier and Xapian.
> If there were a Python API + Rails plugin (with as many features
> as Ferret), that would be perfect :)

Solr has an HTTP interface, so talking to it from Python would be no big
deal.

Otherwise you could, possibly as the first user of Xapian in a
Rails app, start your very own acts_as_xapian ;-)
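
As a rough sketch of what talking to Solr over HTTP could look like (in Ruby here, to match the rest of the thread): the host, port, path and the source_id field are assumptions, so adjust them to your own installation. solr_select_uri is a hypothetical helper that only builds the request URL; the actual GET is left commented out.

```ruby
require 'uri'
require 'net/http'

# Hypothetical helper: build the URL for a Solr select request.
# Host, port and path are assumed values, not taken from this thread.
def solr_select_uri(query, host: 'localhost', port: 8983, path: '/solr/select')
  # encode_www_form escapes characters such as ":" and spaces in the query.
  params = URI.encode_www_form(q: query, wt: 'json')
  URI::HTTP.build(host: host, port: port, path: path, query: params)
end

uri = solr_select_uri('source_id:25 AND krist*')
puts uri

# With a Solr server running, the request itself would be:
# body = Net::HTTP.get(uri)
```

Any language with an HTTP client can do the same, which is why no Java is needed on the application side.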


Jens

