Hey all,
I’m getting some really weird results when searching documents. It
seems to be somehow related to the document format I’m using.
I wrote a small script to replicate it:
################
#!/usr/bin/ruby
require 'rubygems'
require 'ferret'
include Ferret
index = Index::Index.new(:path => '/tmp/fooindex', :key => :id)
# dummy data
index << {:visibility=>"private", :type=>"media", :title=>"example title",
          :owner=>"user/3003", :author=>"user/3003",
          :description=>"description example", :id=>"user/3003/media/1"}
index << {:visibility=>"private", :type=>"media", :title=>"a new title",
          :owner=>"user/3003", :author=>"user/3003",
          :description=>"more foo desc", :id=>"user/3003/media/2"}
index << {:visibility=>"private", :type=>"media", :title=>"random title",
          :owner=>"user/3003", :author=>"user/3003",
          :description=>"random description", :id=>"user/3003/media/4"}
index << {:visibility=>"private", :type=>"media", :title=>"random title",
          :owner=>"user/3003", :author=>"user/3003",
          :description=>"random description", :id=>"user/3003/media/5"}
# run the query given on the command line and print each matching document
index.search_each(ARGV.shift) { |doc, score|
  puts index[doc].load.inspect
}
################
The following queries are returning all the results currently in the
index:
$ ruby script.rb "title:me"
{:author=>"user/3003", :description=>"description example",
:visibility=>"private", :id=>"user/3003/media/1",
:title=>"example title", :type=>"media", :owner=>"user/3003"}
… (remaining results)
$ ruby script.rb "title:my"
(same as above)
And, weirdly enough, the following
$ ruby script.rb "title:mo"
won’t return anything. There are more variants of this, but I think you
get the idea.
The following works properly:
$ ruby script.rb "title:random"
(returns the two results that contain "random" in the title, which is
what I’d expect)
Is there something I’m missing? It doesn’t seem to make sense to me
that those queries above should return all the results in the index,
especially considering they don’t actually match anything.
Any help is appreciated. Thanks.