pinot icon indicating copy to clipboard operation
pinot copied to clipboard

Fix TextMatchFilterOptimizer `or not` behavior

Open itschrispeck opened this issue 1 year ago • 1 comments

Lucene's query language has a constraint where NOT operator cannot be used with just one term, since it relies on the difference of sets. Therefore, the pinot query x OR NOT y and the lucene query x OR NOT y are not equivalent. Rather, x OR NOT y in Lucene is equivalent to simply x.

This patch skips optimizing these cases to ensure query correctness.

From the Lucene docs:

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:
NOT "jakarta apache"

itschrispeck avatar Jul 03 '24 18:07 itschrispeck

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 62.08%. Comparing base (59551e4) to head (766397d). Report is 2193 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13533      +/-   ##
============================================
+ Coverage     61.75%   62.08%   +0.33%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2557     +121     
  Lines        133233   140892    +7659     
  Branches      20636    21856    +1220     
============================================
+ Hits          82274    87479    +5205     
- Misses        44911    46788    +1877     
- Partials       6048     6625     +577     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration <0.01% <0.00%> (-0.01%) :arrow_down:
integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration2 0.00% <0.00%> (ø)
java-11 62.05% <100.00%> (+0.34%) :arrow_up:
java-21 61.95% <100.00%> (+0.33%) :arrow_up:
skip-bytebuffers-false 62.08% <100.00%> (+0.33%) :arrow_up:
skip-bytebuffers-true 61.93% <100.00%> (+34.20%) :arrow_up:
temurin 62.08% <100.00%> (+0.33%) :arrow_up:
unittests 62.08% <100.00%> (+0.33%) :arrow_up:
unittests1 46.66% <100.00%> (-0.23%) :arrow_down:
unittests2 27.66% <0.00%> (-0.08%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov-commenter avatar Jul 03 '24 19:07 codecov-commenter

@jasperjiaguo - Can you help take a quick look at this for our use ?

siddharthteotia avatar Jul 11 '24 00:07 siddharthteotia

nit: I think x OR NOT y is equal to x NOT y (from my testing)

jasperjiaguo avatar Jul 17 '24 07:07 jasperjiaguo