In memory IndexReader bug?


#1

Hi All,

Hope all is going well.

I’m having trouble with the following code, which creates an in-memory
index reader: it seems to be trying to read from a file regardless.
Here’s the simple code:

require 'rubygems'
require 'ferret'

a = Ferret::Index::Index.new
r = Ferret::Index::IndexReader.new(nil)

Running the code on my OS X machine gives:

marcus-crafters-powerbook-g4-17:/tmp crafterm$ ruby t.rb
t.rb:5:in `initialize': : Error occured at <fs_store.c>:318 (Exception)
Error: exception 2 not handled: Couldn't open the file to read
from t.rb:5

The IndexReader API says pass nil in for an in memory directory, so I’m
not sure what’s wrong.

Is this a bug - any ideas at all? This is ferret 0.9.3 for reference.

Cheers,

Marcus


#2

On 6/14/06, Marcus C. removed_email_address@domain.invalid wrote:

The IndexReader API says pass nil in for an in memory directory, so I’m
not sure what’s wrong.

Is this a bug - any ideas at all? This is ferret 0.9.3 for reference.

Hi Marcus,

Sorry, this is a mistake in the docs. It doesn’t make sense to open an
IndexReader with an anonymous RAMDirectory as it obviously won’t
contain any index yet. The problem is that the IndexReader is trying
to read the segments file which it expects to be there but, since no
index has been written, there is no segments file. If you pass a
RAMDirectory that actually contains an index written by an IndexWriter
or Index class then it should work.
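
To make the point concrete, here is a toy model in plain Ruby. It is only a sketch of the idea and not Ferret's code: ToyRAMDirectory, ToyWriter, ToyReader and try_open are made-up names, and the only thing borrowed from Ferret is the notion that a reader looks for a "segments" file which exists only after something has been written.

```ruby
# Toy model only: these classes are invented for illustration
# and are not part of Ferret's API.
class ToyRAMDirectory
  def initialize
    @files = {}   # file name => contents
  end

  def exists?(name)
    @files.key?(name)
  end

  def write(name, data)
    @files[name] = data
  end

  def read(name)
    raise IOError, "No file #{name}" unless exists?(name)
    @files[name]
  end
end

class ToyWriter
  def initialize(dir)
    @dir = dir
    @docs = []
  end

  def <<(doc)
    @docs << doc
    self
  end

  # Writing the "segments" file is what turns the directory into a readable index.
  def close
    @dir.write("segments", @docs.dup)
  end
end

class ToyReader
  attr_reader :docs

  def initialize(dir)
    @docs = dir.read("segments")   # raises IOError on an empty directory
  end
end

# Returns the document count, or the error message if the reader cannot open.
def try_open(dir)
  ToyReader.new(dir).docs.size
rescue IOError => e
  e.message
end

dir = ToyRAMDirectory.new
before = try_open(dir)   # nothing has been written yet

writer = ToyWriter.new(dir)
writer << "first doc" << "second doc"
writer.close

after = try_open(dir)    # the segments file now exists
puts "before: #{before.inspect}, after: #{after.inspect}"
```

In this model, opening the reader before the writer has closed fails with "No file segments", and opening it afterwards sees both documents.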

Cheers,
Dave


#3

David B. wrote:

On 6/14/06, Marcus C. removed_email_address@domain.invalid wrote:
Hi Marcus,

Sorry, this is a mistake in the docs. It doesn’t make sense to open an
IndexReader with an anonymous RAMDirectory as it obviously won’t
contain any index yet. The problem is that the IndexReader is trying
to read the segments file which it expects to be there but, since no
index has been written, there is no segments file. If you pass a
RAMDirectory that actually contains an index written by an IndexWriter
or Index class then it should work.

Hi David,

Thanks mate for the information. I actually get the same problem when
attempting to use a writer:

@ramDir = RAMDirectory.new
@writer = IndexWriter.new(@ramDir)

That gives me:

IOError: No file segments
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/store/ram_store.rb:79:in `open_input'
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/index/segment_infos.rb:70:in `read'
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/index/index_writer.rb:108:in `initialize'
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/store/directory.rb:135:in `while_locked'
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/index/index_writer.rb:103:in `initialize'
/sw/lib/ruby/1.8/monitor.rb:229:in `synchronize'
/sw/lib/ruby/gems/1.8/gems/ferret-0.9.3/lib/ferret/index/index_writer.rb:102:in `initialize'

I didn’t think that was expected. Essentially, I’m trying to write the
Ferret equivalent of the following Lucene code:

protected void setUp() throws Exception
{
    ramDir = new RAMDirectory();
    IndexWriter writer = new IndexWriter(ramDir, new StandardAnalyzer(), true);
    for (int i = 0; i < texts.length; i++)
    {
        addDoc(writer, texts[i]);
    }

    writer.optimize();
    writer.close();
    reader = IndexReader.open(ramDir);
    numHighlights = 0;
}

The only way I’ve been able to get it to work is by using an on-disk
index rather than an in-memory one.

Any thoughts?

Cheers,

Marcus


#4

On 6/21/06, Marcus C. removed_email_address@domain.invalid wrote:

The only way I’ve been able to get it to work is by using an on-disk
index rather than an in-memory one.

Any thoughts?

You need to set the :create option to true. Alternatively, you could use
the :create_if_missing option; in this case it doesn’t really make any
difference. Personally, I’d just use the Index class. You should only need
to fall back to the IndexWriter and IndexReader classes if you are doing
more advanced things with the index, like writing your own filters. Anyway,
here is the code for your example:

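def setup
    # Assumes Ferret::Store and Ferret::Index have been included.
    ram_dir = RAMDirectory.new
    # Pass ram_dir (not nil) so the reader below opens the same directory.
    writer = IndexWriter.new(ram_dir, :create => true)
    texts.each {|text| writer << text}
    writer.optimize
    writer.close
    reader = IndexReader.open(ram_dir)
end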
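
For a brand-new RAMDirectory the two options end up doing the same thing. As a rough illustration of where they would differ, here is a toy model in plain Ruby. It assumes that :create always starts a fresh index while :create_if_missing reuses an existing one; that is my reading of the option names and should be checked against the Ferret docs. ToyIndexWriter and doc_count_after are invented names, and a plain Hash stands in for the directory.

```ruby
# Toy model only: invented names, not Ferret's implementation.
class ToyIndexWriter
  attr_reader :docs

  # store is a plain Hash standing in for a directory.
  def initialize(store, options = {})
    @store = store
    if options[:create]
      @docs = []                       # always start from scratch
    elsif @store.key?("segments")
      @docs = @store["segments"].dup   # reuse the existing index
    elsif options[:create_if_missing]
      @docs = []                       # nothing there yet, so create one
    else
      raise IOError, "No file segments"
    end
  end

  def <<(doc)
    @docs << doc
    self
  end

  def close
    @store["segments"] = @docs
  end
end

# Opens a writer on the store, adds one document, and returns the final count.
def doc_count_after(store, options)
  writer = ToyIndexWriter.new(store, options)
  writer << "new doc"
  writer.close
  store["segments"].size
end

# Empty store: both options behave the same.
empty_a = doc_count_after({}, :create => true)
empty_b = doc_count_after({}, :create_if_missing => true)

# Store that already holds one document: the options differ.
existing_a = doc_count_after({ "segments" => ["old doc"] }, :create => true)
existing_b = doc_count_after({ "segments" => ["old doc"] }, :create_if_missing => true)

# Neither option on an empty store: the same failure as in the trace above.
no_option = begin
  ToyIndexWriter.new({})
  :ok
rescue IOError => e
  e.message
end

puts [empty_a, empty_b, existing_a, existing_b, no_option].inspect
```

In the toy, the only case where the choice matters is a store that already contains documents: :create discards them, :create_if_missing keeps them.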

If you are having trouble debugging something, you can try require
'rferret' instead of 'ferret'. This will use the pure Ruby version of
Ferret and you should be able to find the problem more easily.

Hope that helps,
Dave