[OT] Unicode tokenization for Ferret


#1

I wonder, do we (by any chance) have a working Ruby implementation of this:

http://www.unicode.org/reports/tr29/

This might come in bloody useful, not only for Ferret but for the
“excerpt” helper as well.

Julian ‘Julik’ Tarkhanov
me at julik.nl