I need to decide whether our site will go with Java or Ruby on Rails. The major factor is whether Ferret supports Lucene's ChineseAnalyzer or CJKAnalyzer. Can anybody shed some light on Ferret's Chinese search support? I'd really appreciate it.
on 2006-02-23 01:32
on 2006-02-23 08:55
Hi Jerry, Basically you'll have to write an analyzer that matches Chinese tokens (words). If you can write a regular expression in Ruby that matches Chinese tokens, then it's very simple to write an Analyzer for Ferret. I haven't looked at the CJKAnalyzer in Lucene, but I can't imagine it would be too hard to port to Ruby. Cheers, Dave
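To make the regular-expression idea concrete, here is a minimal sketch in plain Ruby. It assumes a Ruby version whose regexp engine understands the Unicode script property `\p{Han}` (1.9 or later). The constant `CHINESE_TOKEN` and the method `regexp_tokens` are hypothetical names for illustration and are not part of Ferret; how the resulting tokens are handed to Ferret's analyzer interface is deliberately left out.

```ruby
# Hypothetical helper, not part of Ferret: splits text into tokens with a
# single regular expression. A run of Han characters becomes one token, and
# a run of ASCII letters or digits becomes one token.
CHINESE_TOKEN = /\p{Han}+|[A-Za-z0-9]+/

def regexp_tokens(text)
  # String#scan returns every non-overlapping match, in order of appearance.
  text.scan(CHINESE_TOKEN)
end
```

Note that this treats an unbroken run of Chinese characters as a single token, so it does no real word segmentation. It only shows that a regexp can pick Chinese text out of mixed input.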
on 2006-02-23 10:20
There is nothing fancy about the CJKAnalyzer... it chunks characters into overlapping pairs. So a phrase such as 你好吗 would be tokenized into two tokens: [你好] [好吗]. Erik
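The pairing described above can be sketched in a few lines of plain Ruby. This is an illustration of overlapping two-character tokens only, not a port of Lucene's CJKAnalyzer. The method name `cjk_bigrams` is hypothetical, the sketch handles only Han characters (via `\p{Han}`, Ruby 1.9 or later), and emitting a lone character as its own token is an assumption rather than something stated in this thread.

```ruby
# Hypothetical helper, not Lucene's or Ferret's code: turns each run of Han
# characters into overlapping two-character tokens (bigrams).
def cjk_bigrams(text)
  text.scan(/\p{Han}+/).flat_map do |run|
    chars = run.chars
    # Assumption: a lone character has no neighbour to pair with, so it is
    # emitted as a single-character token.
    next [run] if chars.length < 2
    # each_cons(2) slides a window of two characters along the run.
    chars.each_cons(2).map(&:join)
  end
end

p cjk_bigrams("你好吗")
```

Applying the same pairing to query strings at search time means a two-character query will match wherever that pair occurs in the indexed text, with no dictionary needed.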