LUCENE-10207: Add "slow" term-in-set query support to SortedDocValuesField / SortedSetDocValuesField

Open gsmiller opened this issue 4 years ago • 0 comments

Description

This change introduces "slow" term-in-set query support, exposed through SortedDocValuesField (SDV) and SortedSetDocValuesField (SSDV). These "slow" queries can be combined with a standard TermInSetQuery in an IndexOrDocValuesQuery for efficient term-in-set querying when both postings and doc values exist for a given string field.
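To illustrate the idea for reviewers, here is a toy model in plain Java. It does not use the Lucene API: the `TermInSetSketch` class, its method names, and the cost comparison are all hypothetical stand-ins, and the real IndexOrDocValuesQuery heuristic is more involved. The sketch only shows why having both a postings-based and a doc-values-based evaluation is useful: the first is cheap when the query terms match few documents, the second is cheap when another clause has already narrowed the candidates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: plain collections stand in for Lucene postings and doc values.
final class TermInSetSketch {

    /** Postings-style evaluation: union the doc-ID lists of every query term. */
    static List<Integer> viaPostings(Map<String, List<Integer>> postings, Set<String> queryTerms) {
        TreeSet<Integer> hits = new TreeSet<>();
        for (String term : queryTerms) {
            hits.addAll(postings.getOrDefault(term, List.of()));
        }
        return new ArrayList<>(hits);
    }

    /** Doc-values-style ("slow") evaluation: look up each candidate doc's value and test set membership. */
    static List<Integer> viaDocValues(String[] docValues, Set<String> queryTerms, List<Integer> candidates) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : candidates) {
            String value = docValues[doc];
            if (value != null && queryTerms.contains(value)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /**
     * Picks a strategy with a deliberately simplified cost comparison (an assumption of this
     * sketch): if the lead clause yields fewer docs than the postings would, verify per doc.
     */
    static List<Integer> search(Map<String, List<Integer>> postings, String[] docValues,
                                Set<String> queryTerms, List<Integer> leadDocs) {
        long postingsCost = 0;
        for (String term : queryTerms) {
            postingsCost += postings.getOrDefault(term, List.of()).size();
        }
        if (leadDocs.size() < postingsCost) {
            return viaDocValues(docValues, queryTerms, leadDocs);
        }
        List<Integer> hits = viaPostings(postings, queryTerms);
        hits.retainAll(leadDocs); // intersect with the lead clause
        return hits;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = Map.of(
            "a", List.of(0, 3), "b", List.of(1), "c", List.of(2), "d", List.of(4));
        String[] docValues = {"a", "b", "c", "a", "d"};
        Set<String> queryTerms = Set.of("a", "c");
        // Selective lead clause: the doc-values path is taken.
        System.out.println(search(postings, docValues, queryTerms, List.of(3)));
        // Broad lead clause: the postings path is taken.
        System.out.println(search(postings, docValues, queryTerms, List.of(0, 1, 2, 3, 4)));
    }
}
```

Both paths return the same matches for the same candidates; only the amount of work differs, which is the property the combined query relies on.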

Solution

TermInSetQuery now extends MultiTermQuery, which allows it to be rewritten as a "doc values query" using DocValuesRewriteMethod. scorerSupplier support was added to MultiTermQueryConstantScoreWrapper and DocValuesRewriteMethod, allowing up-front cost estimation without doing the "heavy lifting" of intersecting the query terms with the indexed terms. While the "doc values" based approach doesn't require score estimation, IndexOrDocValuesQuery creates ScorerSuppliers for both delegate queries up-front, so we want to avoid paying the term intersection cost at that point.
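The deferral described above can be sketched in plain Java. This is a simplified illustration and not the code in this PR: `LazyScorerSketch`, its fields, and the use of a document count as the cost bound are hypothetical choices made for the example. The point is only that `cost()` can answer without touching the terms, while the intersection runs when `get()` is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Toy stand-in for a scorer supplier: cheap cost() up front, expensive get() on demand.
final class LazyScorerSketch {
    int intersections = 0; // counts how many times the expensive step actually ran

    private final SortedSet<String> indexedTerms;
    private final List<String> queryTerms;
    private final long docCount;

    LazyScorerSketch(SortedSet<String> indexedTerms, List<String> queryTerms, long docCount) {
        this.indexedTerms = indexedTerms;
        this.queryTerms = queryTerms;
        this.docCount = docCount;
    }

    /** Cheap upper bound (assumed here to be the doc count); performs no term intersection. */
    long cost() {
        return docCount;
    }

    /** The "heavy lifting": intersect query terms with indexed terms, only when requested. */
    List<String> get() {
        intersections++;
        List<String> matched = new ArrayList<>();
        for (String term : queryTerms) {
            if (indexedTerms.contains(term)) {
                matched.add(term);
            }
        }
        return matched;
    }
}
```

A caller that asks both delegates for `cost()` and then calls `get()` on only one of them never triggers the intersection for the other, which is the saving this change is after.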

Tests

Added new unit test coverage for the updated functionality. Note that I also needed to update TestPresearcherMatchCollector since it relies on toString(). Previously, the TermInSetQuery was rewritten directly to a BooleanQuery. Now it is rewritten to a MultiTermQueryConstantScoreWrapper, which handles the BooleanQuery rewriting on demand when a Scorer is requested. Because these two queries have different toString() representations, the test started failing even though the behavior is functionally equivalent.

Checklist

Please review the following and check all that apply:

  • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [x] I have created a Jira issue and added the issue ID to my pull request title.
  • [x] I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • [x] I have developed this patch against the main branch.
  • [x] I have run ./gradlew check.
  • [x] I have added tests for my changes.

gsmiller · Nov 11 '21 23:11