Ferret w/ acts_as_ferret on Windows

hi all!

after hours of trying to find contents with german umlauts i stumbled
upon a post where someone said ferret won’t work with utf-8 on
windows???

is that really true?

do i really have to iconv everything to iso-8859-15 before indexing and
do the same with the query to get it working?

i’m running ruby 1.8.5, ferret 0.10.9-mswin32, and rails 1.2.2 and just
reinstalled aaf from svn yesterday (can’t find any version info, and
forgot to remember the svn revision)

*bangingheadagainstwall*

On 2/26/07, neongrau wrote:

> hi all!
>
> after hours of trying to find contents with german umlauts i stumbled
> upon a post where someone said ferret won’t work with utf-8 on
> windows???
>
> is that really true?
>
> do i really have to iconv everything to iso-8859-15 before indexing and
> do the same with the query to get it working?

The StandardAnalyzer uses your current locale settings to determine
what a letter is when tokenizing your data. As far as I was able to
determine, Windows doesn’t have support for UTF-8 locales in C and the
win32 libraries. (I’d love for someone to correct me on this.) What
you can do is write a custom analyzer and UTF-8 should be fine. There
has been plenty of discussion on creating your own analyzer in the
past:

http://www.ruby-forum.com/search?query=ferret+analyzer&submit=Search

You can also look in the unit tests.
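
To give an idea of what the tokenizing part of such a custom analyzer has to do, here is a rough sketch in plain Ruby. It is not Ferret code: utf8_tokens is a made-up helper name, and hooking it into an analyzer's token stream is left out because that depends on the Ferret version you have installed. It also assumes Ruby 1.9 or later with UTF-8 strings, since the \p{L} character class is not available on 1.8.5. The point is only that a letter is decided by the regexp and not by the C locale.

```ruby
# Hypothetical helper, not part of Ferret: split UTF-8 text into
# lowercase word tokens without consulting the C locale.
def utf8_tokens(text)
  # \p{L} matches any Unicode letter and \p{N} any Unicode digit,
  # so umlauts and the sharp s stay inside their words.
  text.scan(/[\p{L}\p{N}]+/).map { |word| word.downcase }
end

p utf8_tokens("Grüße aus Köln")
```

Note that String#downcase only lowercases non-ASCII letters such as "Ü" on Ruby 2.4 and newer. On older versions you would have to handle those yourself, and you need to run the query text through the same function so both sides match.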