How to implement full-text search with OR, just like Google?


#1

The current full-text search returns AND results. For example, if we
call Article.search(“aa bb”), only the articles that include both “aa”
and “bb” in the fields are returned. How can I efficiently return the
articles that include “aa” OR “bb”? A crude method is to run two
separate queries and merge the results, removing the duplicates.
However, the paginator then becomes very difficult to use, because it is
hard to work out the offset when fetching the next page.
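
One way to avoid merging two result sets is to send a single query that already contains the OR, so that de-duplication and offsets stay with the search backend. The helper below is only a sketch: or_query is a hypothetical name, and it assumes the backend treats the uppercase word OR as a boolean operator in its query string. It does no escaping of special characters.

```ruby
# Hypothetical helper: rewrite free-form user input as one explicit OR query.
# Assumption: the search backend understands "term OR term" in a query string.
def or_query(input)
  terms = input.to_s.split.uniq   # split on whitespace, drop repeated words
  terms.join(" OR ")
end

puts or_query("aa bb")
```

Because only one query is issued, the paginator can pass a plain offset and limit to it instead of tracking positions in two separate result lists.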

How does Google do that?
If we enter aa bb in the search box, the results are related by OR
rather than AND, and the paginator still works.

Can anybody help me with that?

Thanks a lot


#2

Hey George,

A little more information, please… What are you using to provide your
full-text feature:
http://poocs.net/articles/2006/04/06/introducing-acts_as_searchable or
http://projects.jkraemer.net/acts_as_ferret/wiki or something completely
different?

By the way: Google returns “AND” results as well, at least for the
highest-ranked results. This is what users expect when they type in
more than one word. IMHO it would be most user-friendly to return “AND”
results if there are any, and “OR” results if not (and tell the user
that there are no results for the “AND” query). You might also give the
user the option to search more loosely (by “OR”) when “AND” results
were already returned…
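
The fallback described above could be sketched roughly like this. Everything here is illustrative: and_then_or, DOCS and TOY_SEARCH are hypothetical names, and the toy lambda only stands in for a real search call that accepts AND / OR in its query string.

```ruby
# Hypothetical wrapper: `search` is any callable that takes a query string
# and returns an array of hits. Try the strict AND query first and fall
# back to OR only when AND finds nothing.
def and_then_or(input, search)
  terms = input.to_s.split.uniq
  strict = search.call(terms.join(" AND "))
  # With fewer than two terms, AND and OR are the same query.
  return { :mode => :and, :results => strict } if !strict.empty? || terms.size < 2
  { :mode => :or, :results => search.call(terms.join(" OR ")) }
end

# Toy in-memory "index", only to make the sketch runnable.
DOCS = ["aa only", "bb only", "cc dd"]
TOY_SEARCH = lambda do |query|
  if query.include?(" AND ")
    words = query.split(" AND ")
    DOCS.select { |d| words.all? { |w| d.split.include?(w) } }
  else
    words = query.split(" OR ")
    DOCS.select { |d| words.any? { |w| d.split.include?(w) } }
  end
end

p and_then_or("aa bb", TOY_SEARCH)
p and_then_or("cc dd", TOY_SEARCH)
```

The returned :mode lets the view tell the user whether the list comes from the strict query or from the looser fallback.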

Regards
Jan


#3

Thanks, I will try acts_as_ferret.

But I cannot start the demo app:
F:\Download\acts_as_ferret_trunk\demo>ruby script/server
=> Booting WEBrick…
F:\Download\acts_as_ferret_trunk\demo>

WEBrick aborts abnormally. If I install the plugin into my own app, the
same abort happens. Can you tell me the reason?


#4

Have you installed ferret beforehand?

gem install ferret

Regards
Jan


#5

It has got to be create database ferret_demo; create database
ferret_test; of course…

Jan


#6

Hi, Jan
I have tested acts_as_ferret, and it performs well. However, there are
two difficulties that prevent me from using it:

  1. It does not support searching UTF-8 text. If I search in another
    Unicode language, it always returns nothing.
  2. It is difficult to combine with the paginator. How do I get the
    total number of results that the paginator needs?
    @results =
    Tutorial.find_by_contents(@query,:first_doc=>10,:num_docs=>10)
    can be used to search with an offset and a limit, but the total is
    hard to get.

Can you help with that? Thanks very much!
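
On the second point, the paginator arithmetic can at least be sketched independently of the search library. The helpers below are hypothetical (page_window and page_count are made-up names); they only map a page number onto the :first_doc and :num_docs options used in the call above. Where the total hit count comes from is left open here, so check the acts_as_ferret documentation for how to obtain it.

```ruby
# Hypothetical helper: map a 1-based page number onto the :first_doc and
# :num_docs options passed to find_by_contents.
def page_window(page, per_page)
  page = [page.to_i, 1].max   # treat anything below page 1 as page 1
  { :first_doc => (page - 1) * per_page, :num_docs => per_page }
end

# Number of pages for a given total. The total itself must be supplied by
# the search library; how to get it is an open question in this sketch.
def page_count(total_hits, per_page)
  (total_hits + per_page - 1) / per_page   # integer ceiling division
end

p page_window(2, 10)
p page_count(95, 10)
```

With 10 results per page, page 2 maps to the same :first_doc=>10, :num_docs=>10 pair as in the call quoted above.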

George


#7

The README of the demo isn’t exactly detailed. I’ve tried it myself on
Windows, since your path indicates that you’re on a Windows box. The steps:

  1. gem install ferret
  2. cd to the demo\db directory
  3. mysql -uroot -p
  4. create ferret_demo; create ferret_test;
  5. use ferret_demo; source schema.sql; - do the same for ferret_test
  6. cd to demo\config and put in values that satisfy your environment
  7. cd to demo root
  8. rake >> are there any failures? rule them out…
  9. ruby script/server
  10. http://localhost:3000/content - add some content
  11. http://localhost:3000/search - enjoy ferret full-text search and
    learn about its query language on http://ferret.davebalmain.com; it
    should satisfy your needs
  12. maybe / hopefully: post a patch for a better demo README to the
    acts_as_ferret trac and participate…

Regards
Jan


#8

Hi, George,

May I first ask which languages you want to index?

I think you are on Windows, is that right? IMHO there is still a problem
with building the native extension (cFerret, the superfast C library of
ferret) on Windows. I think that because of this, stemming files like
stem_UTF_8_portuguese.h or stem_ISO_8859_1_italian.h won’t be built on
your system. As I understand it, cFerret solved many - if not all -
UTF-8 problems, but as I’ve said, you might be bound to the pure Ruby
version on Windows until the problems of building cFerret on Windows
are sorted out.

Have you got a Linux box available? If you’ve got some time (but don’t
hold your breath ;-)) you might wait for the problems of ferret on
Windows to be solved, or provide patches yourself:
http://ferret.davebalmain.com . Have you had a look at
acts_as_searchable and hyperestraier, to see whether they do what you
need? Personally I prefer ferret and acts_as_ferret because of their
Lucene roots and great community, but hyperestraier is a superb library
too, and acts_as_searchable (which is based on hyperestraier) might be
the right tool for your job…

Regards
Jan P.


#9

Hi, Jan
I am developing the web app under Windows and plan to deploy it on
Linux, so at first all the work has to be done on win32.
I plan to search multiple languages, of which the two most important
are English and Chinese. I get nothing if the input text is Chinese. I
will try both ferret and acts_as_searchable and compare them…
