NoMemoryError in ActiveSupport::Multibyte - Normalize

Hello all,

I am importing large Word documents. I have a test file which is 30 MB
in size. I run this through a series of processes to clean it up
and extract the data I need, then inject bits of it into a database.

The 30 MB file runs fine on my Mac, but on the FreeBSD server it dies
with a NoMemoryError at one of the first steps, which is normalizing
the file.

Basically, the sequence is: the user clicks import, which calls
MiddleMan and starts a BackgrounDRb task to import the file. The file
is scanned with Hpricot to get an idea of its size and is then
sent to normalize to sanitize the Unicode data.
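
For illustration, here is a rough sketch of how the normalize step could be fed one line at a time, so that only a single line is expanded into codepoints at once instead of the whole document. This is not the code from the import itself. `normalize_in_chunks` is a hypothetical helper, and `String#unicode_normalize` (standard library, Ruby 2.2 and later, so not available on Ruby 1.8) stands in for the ActiveSupport normalizer. It also assumes the text contains line breaks and that splitting right after a newline is safe, since a newline does not combine with the characters that follow it.

```ruby
require "stringio"

# Normalize an IO one line at a time and append each result to `output`.
# `output` can be anything that responds to << (a String, a File, ...).
# NOTE: unicode_normalize is a stand-in for the normalizer in the post;
# the form (:nfc here) is an assumption, pass whichever form you need.
def normalize_in_chunks(input, output, form = :nfc)
  input.each_line do |line|
    output << line.unicode_normalize(form)
  end
  output
end

# Small demonstration with decomposed accents (e + U+0301, i + U+0308).
source = StringIO.new("Cafe\u0301 au lait\nnai\u0308ve\n")
result = normalize_in_chunks(source, String.new(encoding: Encoding::UTF_8))
puts result
```

With a real import, `input` would be the opened file and `output` a second file, so neither the raw nor the normalized text has to sit in memory in full.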

The Ruby process running this is at about 500 MB at this point…

And at this point it crashes out with the stack trace below. Any
ideas? As I said, it works without a flaw on my MacBook Pro. Also, the
process works 100% OK on a file that is about 3-4 MB.
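
Since the same import succeeds on one machine and fails on the other at roughly 500 MB, one thing that may be worth comparing is the per-process memory limit each OS applies to the worker. This is only a guess at the cause, not a diagnosis. A minimal sketch using Ruby's standard `Process.getrlimit`; `format_limit` is a hypothetical helper added for readability:

```ruby
# Render a limit in MB, or "unlimited" when the OS reports no cap.
def format_limit(bytes)
  return "unlimited" if bytes == Process::RLIM_INFINITY
  "#{bytes / (1024 * 1024)} MB"
end

# Soft and hard caps on the data segment of the current process.
soft, hard = Process.getrlimit(Process::RLIMIT_DATA)
puts "data segment limit: soft=#{format_limit(soft)} hard=#{format_limit(hard)}"
```

Running this inside the worker on both machines would show whether the limits differ.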

Document 10, worker job_key 1182240116-VMS10 = Book import started at
Tue Jun 19 18:01:56 +1000 2007.
failed to allocate memory - (NoMemoryError)
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.2/lib/active_support/multibyte/handlers/utf8_handler.rb:299:in `unpack'
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.2/lib/active_support/multibyte/handlers/utf8_handler.rb:299:in `u_unpack'
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.2/lib/active_support/multibyte/handlers/utf8_handler.rb:207:in `normalize'
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.2/lib/active_support/multibyte/chars.rb:81:in `send'
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.2/lib/active_support/multibyte/chars.rb:81:in `method_missing'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/../../config/../lib/extractor.rb:22:in `character_and_paragraph_styles_from_word'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/…/…/config/…/lib/workers/book_import_worker.rb:53:in `do_work'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/../../config/../vendor/plugins/backgroundrb/backgroundrb_rails.rb:36:in `start_process'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/…/…/config/…/vendor/plugins/backgroundrb/backgroundrb_rails.rb:32:in `initialize'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/../../config/../vendor/plugins/backgroundrb/backgroundrb_rails.rb:32:in `new'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/…/…/config/…/vendor/plugins/backgroundrb/backgroundrb_rails.rb:32:in `start_process'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/../../config/../vendor/plugins/backgroundrb/backgroundrb.rb:57:in `new_worker'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/…/…/config/…/vendor/plugins/backgroundrb/backgroundrb.rb:49:in `synchronize'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/../../config/../vendor/plugins/backgroundrb/backgroundrb.rb:49:in `new_worker'
/usr/local/lib/ruby/1.8/drb/drb.rb:1555:in `__send__'
/usr/local/lib/ruby/1.8/drb/drb.rb:1555:in `perform_without_block'
/usr/local/lib/ruby/1.8/drb/drb.rb:1515:in `perform'
/usr/local/lib/ruby/1.8/drb/drb.rb:1589:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1585:in `loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1585:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1581:in `start'
/usr/local/lib/ruby/1.8/drb/drb.rb:1581:in `main_loop'
/usr/local/lib/ruby/1.8/drb/drb.rb:1430:in `run'
/usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `start'
/usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `run'
/usr/local/lib/ruby/1.8/drb/drb.rb:1347:in `initialize'
/usr/local/lib/ruby/1.8/drb/drb.rb:1627:in `new'
/usr/local/lib/ruby/1.8/drb/drb.rb:1627:in `start_service'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/start:98
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/start:94:in `fork'
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/start:94
/data/rails/universal_translator/releases/20070619100151/config/../script/backgroundrb/start:86:in `fork'
/data/rails/universal_translator/releases/20070619100151/config/…/script/backgroundrb/start:86