Hey Dave,
I just contributed $100 to the ferret donation box. My project is
earning no money yet (but hopefully will); for now I hope this helps you
out and covers me for asking stupid questions ;).
To get a distance sorted output, I am passing an array of the id field
from a ferret search through to mysql in a custom select statement.
SELECT … id IN (#{ids.join(",")})
This has been working fine through ferret 0.9. I moved to 0.10 this week
and it has seemed OK, but I may simply not have been triggering the
error. It happens on both 0.10.6 and 0.10.7.
Today the SQL statement was invalid on a certain query. This turned out
to be because one or more of the ids passed into the IN clause were not
numbers but some sort of weird character sequence like \240\236D\010 or
\350\240\227\010.
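As a stopgap, the ids could be checked before they reach the SQL. This is only a minimal sketch, assuming the ids arrive as an array of strings; partition_numeric_ids is a hypothetical helper, not part of Ferret or Rails, and it only hides the symptom rather than fixing the index.

```ruby
# Hypothetical helper (not part of Ferret or Rails): split the ids returned
# by a search into those that are plain digit strings and those that are not.
def partition_numeric_ids(ids)
  # String#b gives a binary copy, so the regex match cannot fail on bytes
  # that are invalid in the string's declared encoding.
  ids.map { |id| id.to_s }.partition { |id| id.b =~ /\A\d+\z/ }
end

# Sample values taken from the console output in this post.
good, bad = partition_numeric_ids(["542", "0+\010", "2294", "\000\000\000\000"])

# Only the clean ids are joined into the IN (...) list.
in_list = good.join(",")
warn "skipping #{bad.size} non-numeric ids" unless bad.empty?
```

Logging the rejected values, rather than silently dropping them, keeps the corruption visible while the underlying cause is tracked down.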
I’ve tried deleting the index and rebuilding it. It keeps happening,
although on different items in the index on each rebuild. This happens
on 2 different machines, both running Debian sarge. Below is a little
console script with output showing the oddness.
The relevant model code is at the bottom of this post, please let me
know if there’s anything else I can supply.
Sam
--------ruby script/console
Entry.create_ferret_index
index = Ferret::Index::Index.new(FerretConfig::INDEXOPTIONS)
# an arbitrary query to return all results from index
index.search_each("*", {:limit => 6000}) do |doc, score|
  docindex = index[doc][:id]
  if docindex !~ /^\d+$/ then # show me ids that aren’t numeric
    p doc.to_s + " " + docindex
  end
end
OUTPUT FROM THE ABOVE 1st TIME
"542 \2102\032"
"2294 0\3075\010"
"4186 \250* \010"
OUTPUT FROM THE ABOVE 2nd TIME
"1762 \260\020\036\010"
"2617 \000\000\000\000"
"2719 0+\010"
"3176 p 0\010"
---------------from entry.rb
def self.create_ferret_index()
  field_infos = Ferret::Index::FieldInfos.new(:store => :no, :index => :yes,
    :term_vector => :no, :boost => 1.0)
  field_infos.add_field(:name, :store => :no, :index => :yes,
    :term_vector => :with_positions_offsets, :boost => 10.0)
  field_infos.add_field(:address, :store => :no, :index => :yes,
    :term_vector => :with_positions_offsets, :boost => 1.0)
  field_infos.add_field(:tags, :store => :no, :index => :yes,
    :term_vector => :with_positions_offsets, :boost => 5.0)
  field_infos.add_field(:id, :store => :yes, :index => :untokenized,
    :term_vector => :no)
  field_infos.create_index(FerretConfig::INDEXPATH)
  index = Ferret::Index::Index.new(FerretConfig::INDEXOPTIONS)
  batch_size = 1000
  Entry.transaction do
    0.step(Entry.count, batch_size) do |i|
      Entry.find(:all, :limit => batch_size, :offset => i).each do |rec|
        index << rec.make_entry_ferret_doc
      end
    end
  end
  index.flush
  index.optimize
  index.close
end

def make_entry_ferret_doc
  doc = Ferret::Document.new
  doc[:id] = self.id
  doc[:name] = self.name
  doc[:address] = self.physical_address
  doc[:tags] = self.tags
  doc
end