On 9/1/06, Ian Z. wrote:
I’m not sure why sorting by :id is so slow. It takes like 60 seconds or
more to return a query sorted by id, and only like 0.5 seconds when not
sorted. Weird.
Hi Ian,
Try optimizing the index.
Sorting results by a field will naturally take a little longer than
sorting by relevance, because a sort index needs to be built
for that field. Once the sort index is built, it is cached for the
IndexReader, so future sorts should be almost as fast as getting unsorted
results.
To build the sort index, Ferret needs to iterate through all the terms in
the index. This takes significantly longer for unoptimized indexes.
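As a rough illustration of the build-once-then-cache behaviour described above, here is a toy model in plain Ruby. It is not Ferret's actual implementation: ToyReader, sort_cache_for, and the segment layout are made-up names for this sketch, and it only assumes that the per-field lookup is built by walking every segment once and then reused for as long as the reader lives.

```ruby
# Toy model only: NOT Ferret's implementation. ToyReader, sort_cache_for,
# and the segment layout are hypothetical names used for illustration.
class ToyReader
  attr_reader :builds

  # segments: array of hashes mapping doc_id => { field => value }
  def initialize(segments)
    @segments = segments
    @cache = {}   # field name => { doc_id => value }
    @builds = 0   # how many times a sort cache was built from scratch
  end

  # Build the per-field lookup once by walking every segment, then reuse it.
  def sort_cache_for(field)
    @cache[field] ||= begin
      @builds += 1
      @segments.each_with_object({}) do |segment, values|
        segment.each { |doc_id, fields| values[doc_id] = fields[field] }
      end
    end
  end

  # Return the given doc ids ordered by the cached field value.
  def sort(doc_ids, field)
    cache = sort_cache_for(field)
    doc_ids.sort_by { |doc_id| cache[doc_id] }
  end
end

reader = ToyReader.new([
  { 0 => { :id => 42 }, 1 => { :id => 7 } },  # first segment
  { 2 => { :id => 19 } }                      # second segment
])

first  = reader.sort([0, 1, 2], :id)  # builds the cache, then sorts
second = reader.sort([0, 2], :id)     # reuses the cache built above
```

In this sketch the expensive walk happens only on the first call, which mirrors why the first sorted search is slow and later ones are fast. Throwing away the reader throws away the cache too.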
Here is a quick benchmark you can try running:
require 'ferret'
include Ferret
words = %w{one two three four five six seven eight nine ten}
i = I.new
start_time = Time.now
100000.times { i << {:id => rand(1000000), :content => words[rand(10)]} }
puts "Building index took #{Time.new - start_time} seconds"
start_time = Time.now
i.search("one", :sort => :id)
puts "Sort by integer took #{Time.new - start_time} seconds the first time"
start_time = Time.now
i.search("one", :sort => :id)
puts "Sort by integer took #{Time.new - start_time} seconds the second time"
i.__send__(:ensure_writer_open) # get rid of sort cache
start_time = Time.now
i.search("one", :sort => [Ferret::Search::SortField.new(:id, :type => :byte)])
puts "Sort by bytes took #{Time.new - start_time} seconds the first time"
start_time = Time.now
i.search("one", :sort => [Ferret::Search::SortField.new(:id, :type => :byte)])
puts "Sort by bytes took #{Time.new - start_time} seconds the second time"
puts "\nOPTIMIZING THE INDEX\n"
start_time = Time.now
i.optimize
puts "Optimizing the index took #{Time.new - start_time} seconds"
start_time = Time.now
i.search("one", :sort => :id)
puts "Sort by integer took #{Time.new - start_time} seconds the first time"
start_time = Time.now
i.search("one", :sort => :id)
puts "Sort by integer took #{Time.new - start_time} seconds the second time"
i.__send__(:ensure_writer_open) # get rid of sort cache
start_time = Time.now
i.search("one", :sort => [Ferret::Search::SortField.new(:id, :type => :byte)])
puts "Sort by bytes took #{Time.new - start_time} seconds the first time"
start_time = Time.now
i.search("one", :sort => [Ferret::Search::SortField.new(:id, :type => :byte)])
puts "Sort by bytes took #{Time.new - start_time} seconds the second time"
And here are the results on my system:
Building index took 36.131648 seconds
Sort by integer took 15.39588 seconds the first time
Sort by integer took 0.002627 seconds the second time
Sort by bytes took 15.889957 seconds the first time
Sort by bytes took 0.001914 seconds the second time
OPTIMIZING THE INDEX
Optimizing the index took 0.639831 seconds
Sort by integer took 0.170887 seconds the first time
Sort by integer took 0.001423 seconds the second time
Sort by bytes took 0.029054 seconds the first time
Sort by bytes took 0.001424 seconds the second time
So optimizing the index before sorting should help a lot.
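To put a number on "a lot", the snippet below divides the first-sort timings from the results above (unoptimized versus optimized). The figures are copied from that one run on my system, so treat the ratios as a rough indication only; the variable names are just for this sketch.

```ruby
# First-sort timings in seconds, copied from the results above.
timings = {
  "integer" => { :unoptimized => 15.39588,  :optimized => 0.170887 },
  "bytes"   => { :unoptimized => 15.889957, :optimized => 0.029054 }
}

# Ratio of unoptimized to optimized time for each sort type.
speedups = timings.map do |type, t|
  [type, t[:unoptimized] / t[:optimized]]
end

speedups.each do |type, factor|
  puts format("First sort by %s: about %.0fx faster after optimizing", type, factor)
end
```

On these numbers the first sort comes out roughly 90 times faster for integers and roughly 547 times faster for bytes once the index is optimized. The second, cached sort is fast either way.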
Cheers,
Dave