[Ferret] QueryParser memory leak bug (Joyent/OpenSolaris)

QueryParser fails badly, trying to allocate an enormous amount of memory
when processing query strings with special/accented characters. See:

irb(main):002:0> require 'rubygems'
irb(main):003:0> require 'ferret'
irb(main):004:0> include Ferret
irb(main):005:0> index = Index::Index.new
irb(main):008:0> index << "something"

Now the error occurs while processing "bolô":

irb(main):009:0> query = index.process_query("bo\303\264")
/opt/csw/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:749:in `parse': failed to allocate memory (NoMemoryError)
        from /opt/csw/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:749:in `do_process_query'
        from /opt/csw/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:676:in `process_query'
        from /opt/csw/lib/ruby/1.8/monitor.rb:229:in `synchronize'
        from /opt/csw/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:674:in `process_query'
        from (irb):9:in `irb_binding'
        from /opt/csw/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding'
        from /opt/csw/lib/ruby/1.8/irb/workspace.rb:52
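
For anyone unsure what the octal escapes in the query stand for: \303\264 is the two-byte UTF-8 encoding of "ô" (U+00F4). A small sketch in plain Ruby, not using Ferret at all, that shows the raw bytes and the decoded code point:

```ruby
# "\303\264" uses octal escapes for the bytes 0xC3 0xB4.
accented = "\303\264"

# 'C*' reads the string as raw unsigned bytes.
bytes = accented.unpack('C*')

# 'U*' decodes the same bytes as UTF-8 code points.
codepoints = accented.unpack('U*')

puts bytes.inspect       # [195, 180]
puts codepoints.inspect  # [244], i.e. U+00F4
```

So the parser is being handed a valid UTF-8 multi-byte sequence; how those bytes are interpreted depends on the character encoding the process is running under.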

Works fine on my box; I get "id:bolô".

Tried it with Ferret 0.11.4 and 0.11.3, with and without setting
KCODE='u'.
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.8.2]

Hi Phillip, you did that on Mac OS X, right?

yes.
Darwin Phil4.local 8.9.1 Darwin Kernel Version 8.9.1: Thu Feb 22
20:55:00 PST 2007; root:xnu-792.18.15~1/RELEASE_I386 i386 i386

Nice. This issue is specific to OpenSolaris: it is caused by not having
a UTF-8 locale in place, and also by a bug in the OpenSolaris locale
libraries.
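
For anyone hitting the same thing, here is a rough way to check from Ruby whether the process would inherit a UTF-8 locale. This is only a sketch: the helper names are made up for illustration and are not part of Ferret, and it assumes the locale is configured through the usual POSIX environment variables (LC_ALL takes precedence over LC_CTYPE, which takes precedence over LANG). It does not detect the OpenSolaris library bug itself.

```ruby
# Hypothetical helpers, not part of Ferret.
# Returns the character-type locale name the process would inherit,
# following POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
def effective_ctype_locale(env = ENV)
  %w[LC_ALL LC_CTYPE LANG].each do |name|
    value = env[name]
    return value unless value.nil? || value.empty?
  end
  'C' # POSIX default when none of the variables is set
end

# True when the effective locale name mentions UTF-8 (or utf8).
def utf8_locale?(env = ENV)
  !!(effective_ctype_locale(env) =~ /utf-?8/i)
end

puts "effective locale: #{effective_ctype_locale}"
puts "UTF-8 locale in place: #{utf8_locale?}"
```

If the second line prints false, it may be worth exporting a UTF-8 locale (for example LANG=en_US.UTF-8, provided that locale is installed on the machine) before starting Ruby and retrying the query.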

Thanks for the help.
