Forum: Ferret Custom highlighter/match vector access?

Posted by Andrew S. Townley (Guest)
on 2011-02-23 17:16
(Received via mailing list)
Hi everyone,

I know from the archives things have kinda slowed down on ferret and 
there's an effort ongoing with lucy, but I was wondering if anyone had 
discovered a way to enumerate the matches of a particular field in the 
document and get the offsets?

With what I'm trying to do, ferret will be indexing large portions of 
structured information, but I really don't want to store it all in the 
ferret index just to have highlighting.  My understanding (I'm still new 
at this) is that if you index and store the match offsets, you can do 
this without storing the full text of the field.

Ideally, what I'd like is to expose the contents of the C MatchRange 
structure as an array of Ruby hash objects so that I could then use 
those offsets in the actual data store to create my own highlighted 
extracts (or something along those lines).
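
As a rough sketch of that idea, assuming the offsets arrived as an 
array of hashes: the :start and :end keys, the highlight_excerpt 
helper, and the exclusive end offset are all hypothetical here, and 
nothing below is an existing Ferret call. It also assumes the offsets 
index characters of the text as held in the external data store.

```ruby
# Hypothetical helper: build a highlighted extract from text held outside
# the index, given match offsets shaped like { :start => 4, :end => 9 }.
# The hash keys and the exclusive :end are assumptions, not Ferret's API.
def highlight_excerpt(text, matches, context = 20, pre = "<b>", post = "</b>")
  return "" if matches.empty?

  sorted = matches.sort_by { |m| m[:start] }

  # Clamp the extract window to the bounds of the text.
  from = [sorted.first[:start] - context, 0].max
  to   = [matches.map { |m| m[:end] }.max + context, text.length].min

  out = String.new
  pos = from
  sorted.each do |m|
    next if m[:start] < pos # skip matches overlapping one already emitted
    out << text[pos...m[:start]] << pre << text[m[:start]...m[:end]] << post
    pos = m[:end]
  end
  out << text[pos...to]
  out
end

text    = "the quick brown fox jumps"
matches = [{ :start => 4, :end => 9 }, { :start => 16, :end => 19 }]
puts highlight_excerpt(text, matches, 4)
```

The point of the hash form is that the highlighting itself would need 
nothing from the index except the offsets.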

Short of adding a hacked version of searcher_highlight to the C API to 
do this and creating a corresponding wrapped Ruby version, is there any 
way to get to this information right now from the Ruby API?

Alternatively, is there another/better way to do this besides storing 
the whole field values and using the built-in highlighter?

Any advice or pointers would be really appreciated.

Cheers,

ast
--
Andrew S. Townley <ast@atownley.org>
http://atownley.org