Hi!
I want to index HTML files, but without the tags, so I was thinking I could
either strip the tags before indexing (expensive) or set up a RegExpAnalyzer.
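To make the first option concrete, here is a rough sketch of what I mean by stripping the tags before indexing. It is plain Ruby with a naive regexp, not a real HTML parser, so it would not cope with scripts, comments, or a ">" inside an attribute value. The helper name strip_tags is made up for this example and is not part of Ferret.

```ruby
# Naive tag stripper: a sketch only, not a real HTML parser.
# strip_tags is a hypothetical helper name, not a Ferret method.
def strip_tags(html)
  # Replace every <...> run with a space so neighbouring words stay separated.
  text = html.gsub(/<[^>]*>/, ' ')
  # Collapse the whitespace left behind and trim both ends.
  text.gsub(/\s+/, ' ').strip
end

puts strip_tags('<html><body><h1>Programming Ruby</h1><p>some thing abc</p></body></html>')
```

The result would then be passed as the :content value instead of the raw HTML.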
BTW, when using an analyzer, does that mean that everything it rejects
(i.e. whatever the RegExpAnalyzer doesn’t match) won’t be put into the
index files, so it doesn’t blow them up?
I came up with a simple test, which didn’t work in acts_as_ferret, and it
doesn’t work in pure Ferret either. With the code below, I expected only
“abc” to be indexed, since only it matches the regexp.
What’s wrong?
require 'ferret'

@index = Ferret::Index::Index.new(
  :path => 'c:/projects/peter/lib/ferretidx',
  :analyzer => Ferret::Analysis::RegExpAnalyzer.new(/[a-f]/))
@index << {:id => "15", :title => "Programming Ruby",
           :content => "some thing abc"}
@index.search_each('content:"some"') do |id, score|
  puts "Document #{id} found with a score of #{score}"
end
Thanks a lot,
hawe.