How to sort query results by relevance?


#1

The first three lines of my file (below) are representative of the
problem I face. For the query on line 3 (butter+oil+anhydrous),
Wikipedia returns Ultralight backpacking first, where I think Ghee is
the optimal result. I had to build this list because I had been getting
things like pork products for cactus. The file also comes up about 300
records short, but I will deal with that later. For now, I need solid
advice on how to sort the results.

My first idea was to make a regexp from the second field (the query
token), e.g. butter+salted => /[butersald]/, and then take the shortest
term from the results. But there are plenty of cases where this will
not reach the optimal term, such as turtle+raw => Turtle soup. Your help
is greatly appreciated.
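
To make that first idea concrete, here is a minimal Python sketch. The function names are hypothetical, and so is the way the regexp gets applied: I am assuming it acts as a filter (keep any result containing at least one of the query's letters, ignoring case) before picking the shortest. Because nearly every title passes such a filter, it mostly reduces to "pick the shortest title", which may be why it misses cases like turtle+raw.

```python
import re

def char_class(query):
    """Build a character class from the distinct letters of a query token.

    Example: "butter+salted" becomes "[butersald]".
    """
    seen = []
    for ch in query.lower():
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "[" + "".join(seen) + "]"

def shortest_match(query, results):
    """Return the shortest result that matches the query's character class.

    Assumption: the regexp is used as a case-insensitive filter via
    re.search. Ties go to the result that appears first in the list.
    """
    pattern = re.compile(char_class(query), re.IGNORECASE)
    matching = [r for r in results if pattern.search(r)]
    return min(matching, key=len) if matching else None

row1 = ["Butter salt", "Butter", "Túrós csusza", "Margarine", "Potato"]
print(char_class("butter+salted"))            # [butersald]
print(shortest_match("butter+salted", row1))  # Butter
```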

1|butter+salted|Butter salt|Butter|Túrós csusza|Margarine|Potato
2|butter+whipped|Shea butter|Butter|Cream bun|Butter cream|Butter
3|butter+oil+anhydrous|Ultralight backpacking|Odell’s|Ghee|Fragrance extraction|Perfume
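
For anyone who wants to experiment with these rows, a minimal sketch of reading one line into its parts follows. The layout is inferred from the three sample lines only (record number, then the query joined with "+", then the returned titles in order), and the names Record and parse_line are hypothetical.

```python
from typing import NamedTuple

class Record(NamedTuple):
    row_id: int    # first field: record number
    terms: list    # second field: query terms, split on "+"
    results: list  # remaining fields: returned titles, in order

def parse_line(line):
    """Split one pipe-delimited line into a Record (layout assumed from the sample)."""
    fields = line.rstrip("\n").split("|")
    return Record(int(fields[0]), fields[1].split("+"), fields[2:])

rec = parse_line("2|butter+whipped|Shea butter|Butter|Cream bun|Butter cream|Butter")
print(rec.terms)       # ['butter', 'whipped']
print(rec.results[0])  # Shea butter
```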