Investigate use of "MoreLikeThis" query

Open rhsimplex opened this issue 9 years ago • 0 comments

Per this suggestion:

You might want to look at MoreLikeThis queries to boost performance. I worked on a proprietary version of this, and at the time Lucene performance dropped off nearly linearly with the number of query terms. We used MoreLikeThis to reduce our query term count to the 30-40 most statistically interesting terms. The one hiccup was an issue in Lucene [1] where the term cache wasn't operating properly. We just added our own image query term cache and a custom MLT query to leverage it, which gave us a 10x speed bump over any other method we tried. The interestingness of the terms is assessed on a per-term basis, though, so you might see a relevance drop for some types of image if you set MoreLikeThis to use too few terms.

[1] https://issues.apache.org/jira/browse/LUCENE-1690

Look into MoreLikeThis instead of BoolQuery
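
As a starting point for the comparison, here is a rough sketch of the two request bodies in the Elasticsearch query DSL, built as plain Python dicts so nothing needs a running cluster. The field name `signature_words`, the token format, and the helper names are all hypothetical, and the sketch assumes the signature words are indexed as whitespace-separated tokens in a single text field, which is not necessarily how image-match stores them today. The `more_like_this` parameters used (`fields`, `like`, `min_term_freq`, `min_doc_freq`, `max_query_terms`) are standard options of that query; the value 40 simply mirrors the 30-40 terms mentioned in the suggestion.

```python
# Sketch only: field name, token format, and helper names are assumptions.
FIELD = "signature_words"  # hypothetical field holding all signature words


def bool_query(words):
    """Bool-style query: one `term` clause per word, OR-ed via `should`."""
    return {
        "query": {
            "bool": {
                "should": [{"term": {FIELD: w}} for w in words]
            }
        }
    }


def more_like_this_query(words, max_query_terms=40):
    """MoreLikeThis query: Elasticsearch selects up to `max_query_terms`
    of the supplied terms, ranked by tf-idf, instead of using all of them."""
    return {
        "query": {
            "more_like_this": {
                "fields": [FIELD],
                "like": " ".join(words),
                # Each signature word occurs once per image, so the default
                # minimum term frequency of 2 would discard every term.
                "min_term_freq": 1,
                "min_doc_freq": 1,
                "max_query_terms": max_query_terms,
            }
        }
    }


# Hypothetical tokens standing in for the words of one image signature.
words = ["w{}_{}".format(i, i * 7919 % 1000) for i in range(63)]

bool_body = bool_query(words)
mlt_body = more_like_this_query(words)
```

Either body could then be passed to the search call of an Elasticsearch client. Lowering `max_query_terms` should reduce query cost, but per the caveat above it may also reduce relevance for some images, so it would need to be benchmarked against the existing query.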

rhsimplex, Mar 10 '16 15:03