How to compile with large file support?

Hi,

I’m trying to figure out how to compile ferret with large file support,
but none of the topics that discuss this actually say how it is done.
Can someone please provide the info?

thanks.
-m

my exact problem:
http://www.ruby-forum.com/topic/94143#191630

this topic also discusses the issue:
http://www.ruby-forum.com/topic/84237#151791

this topic says that the FAQ should have the answer, which it doesn’t:
http://www.ruby-forum.com/topic/84205#151312

Mike,
How are you coming on this? I just built an index that tops out at
just above 2 GB, and I installed Ferret with the standard gem install
ferret routine.

du -h indexes/final/
2.0G indexes/final/

I’m curious why I didn’t encounter the same issue you did. I just
combined a 1.7GB index with four indexes of approximately 100MB each.
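
Since du -h rounds to one decimal place, 2.0G doesn't show whether anything actually crossed 2**31 bytes (2,147,483,648). Here is a minimal Ruby sketch that prints exact byte counts per file and flags any at or over that boundary. It assumes the limit applies to individual files rather than to the directory total, which is only a guess. The helper names are made up for illustration and are not part of Ferret.

```ruby
# Hypothetical helper, not part of Ferret: list the files in an index
# directory with their exact sizes, largest first.
LIMIT = 2**31 # 2_147_483_648 bytes, the signed 32-bit boundary

def index_file_sizes(dir)
  Dir.glob(File.join(dir, '*'))
     .select { |path| File.file?(path) }
     .map { |path| [File.basename(path), File.size(path)] }
     .sort_by { |_name, size| -size }
end

# Print one line per file, flagging anything at or above LIMIT,
# followed by the directory total.
def report(dir)
  sizes = index_file_sizes(dir)
  sizes.each do |name, size|
    flag = size >= LIMIT ? '  <-- at or over 2**31' : ''
    puts format('%12d  %s%s', size, name, flag)
  end
  puts format('%12d  total', sizes.sum { |_name, size| size })
end

# Only run when called directly with a directory argument.
report(ARGV[0]) if __FILE__ == $PROGRAM_NAME && ARGV[0]
```

Saved under any file name, it would be run as: ruby that_file.rb indexes/final/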

Erik

First, the limit is 2**31 bytes, so it’s a little bit more than 2.14e9
bytes. Second, large file support is compiled in by default. There are
just some stray ints that should be off_t, particularly in the storage
code. I’m going to submit the patch this weekend, after I clean out
some extra debug code… If you don’t store fields, you’ll probably be
fine.
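
To illustrate why a stray int matters, here is a small Ruby sketch that emulates storing a file offset in a signed 32-bit integer. This is an illustration only, assuming a typical two's-complement platform; it is not Ferret's code, and as_int32 is a made-up name. Ruby integers never overflow, so the wrap-around is computed by hand.

```ruby
# Illustration only: what a file offset becomes if it is squeezed into a
# signed 32-bit int (two's-complement wrap-around assumed).
INT32_MAX = 2**31 - 1

def as_int32(offset)
  ((offset + 2**31) % 2**32) - 2**31
end

[INT32_MAX, 2**31, 3_500_000_000].each do |offset|
  puts "#{offset} -> #{as_int32(offset)}"
end
# 2147483647 -> 2147483647
# 2147483648 -> -2147483648
# 3500000000 -> -794967296
```

Any offset at or past 2**31 comes out negative, which is why those variables would need a wider type such as off_t.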


Kyle M.
Software Engineer
CastTV, Inc

I’m attaching a patch against ferret trunk rev 770. It’s got a little
cruft, but it fixes large file support in Ferret.

perhaps your index is just a few bytes under the max… my usage is at
3.5G. i haven’t done anything special, just using ferret and AAF gems:

— MODEL CODE
class MyModel < ActiveRecord::Base
  # think of body/title in terms of an average blog
  acts_as_ferret :fields => { 'body' => {}, 'title' => { :boost => 2 } }
end

— INDEX CODE

# new index from scratch
index = Ferret::Index::Index.new(
  MyModel.aaf_configuration[:ferret].dup.update(
    :auto_flush => false,
    :field_infos => MyModel.aaf_index.field_infos,
    :create => true))

n = 0
BATCH_SIZE = 1000

while true
  records = MyModel.find(:all, :limit => BATCH_SIZE, :offset => n,
    :select => "id,#{MyModel.aaf_configuration[:ferret_fields].keys.join(',')}")
  break if (!records || records.length == 0)

  records.each do |record|
    index << record.to_doc # aaf method
  end

  n += BATCH_SIZE
end

index.flush
index.optimize # 30+ minutes =(
index.close

— CONFIG

gem list | grep ferret
acts_as_ferret (0.4.0)
ferret (0.11.4)

uname -a
Linux gentoo 2.6.20-hardened #3 SMP Fri Mar 30 19:27:10 UTC 2007 x86_64
Intel® Pentium® D CPU 3.00GHz GenuineIntel GNU/Linux

large file support is compiled in by default. There’s
just some stray ints that should be off_t, particularly in the storage
code. I’m going to submit the patch this weekend, after I clean out
some extra debug code… If you don’t store fields, you’ll prolly be
fine.

cool, i hope this fixes the problem. i’ll wait for the next gem version
and see what happens. =)

thanks
-m