Can you give me an example of how to add an analyzer to Ferret for Asian languages?
My web application will have to support multi-language search, which
means that, for example, both Chinese and English text will need to be searchable.
Currently, I have decided to use a simple tokenization scheme in which
every Chinese character is a token. This does not work well in some
cases, but the database columns to be full-text searched contain at most
a few dozen UTF-8 characters, so I think it will work well enough.
Thanks a lot!
David B. wrote:
On 7/5/06, Charlie wrote:
Is there any full-text search scheme that supports UTF-8, especially
for Asian languages such as Chinese, Japanese, etc.?
Ferret/acts_as_ferret does not work when keywords in these languages are
searched, and it is also difficult to implement pagination, which needs
both the count of search results and an offset.
Ferret will work fine on Asian languages. You just need to write your
own Analyzer which matches tokens correctly for the language you are
interested in. Have a look at the RegExpAnalyzer in Ferret. You can
look at test/unit/analysis/ctc_analyzer.rb to see how it works.
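
For what it's worth, here is a minimal sketch in plain Ruby of the one-character-per-token idea discussed above. It does not call Ferret at all. TOKEN_PATTERN and tokenize are hypothetical names made up for this illustration, and the sketch assumes a Ruby version whose regexps support \p{Han} (1.9 or later) and UTF-8 input. The pattern only covers Chinese ideographs and ASCII words; Japanese kana or other scripts would need extra alternatives. A pattern along these lines is the kind of thing one would hand to a regexp-based analyzer, but check the Ferret documentation for the exact constructor arguments.

```ruby
# Hypothetical pattern: each Han ideograph is a token on its own,
# and each run of ASCII letters or digits is one token.
TOKEN_PATTERN = /\p{Han}|[A-Za-z0-9]+/

# Plain-Ruby stand-in for what a regexp-based analyzer does with the
# pattern: collect every match in order and lowercase it.
def tokenize(text)
  text.scan(TOKEN_PATTERN).map { |token| token.downcase }
end

tokens = tokenize("Ferret 支持中文 search")
puts tokens.inspect
```

Because every ideograph becomes its own term, a query for a multi-character word has to be matched as a sequence of adjacent single-character terms (a phrase), which is the main weakness of this scheme on longer texts.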