[CALCITE-4465] Estimate the number of distinct values by predicates

Open liyafan82 opened this issue 5 years ago • 0 comments

In the current implementation (RelMdDistinctRowCount), estimating the number of distinct values (NDV) does not make good use of the filter condition. It simply forwards the call to the input operator with the filter condition attached.

In fact, more information can be obtained for some special but commonly used conditions. For example, given the condition x = 'a', we can deduce that NDV(x) <= 1. Given the condition x in ('a', 'b'), we can deduce that NDV(x) <= 2. More generally, given x in ('a', 'b') AND y in ('c', 'd', 'e'), we have NDV(x, y) <= 2 * 3 = 6, because each column can contribute at most as many distinct values as its value list contains.
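The bound described above could be sketched roughly as follows. This is a standalone illustration, not Calcite code: the class `NdvBoundSketch`, its methods, and the representation of the predicate as a map from column name to its allowed values are all assumptions made for the example. It assumes the filter is a conjunction of equality or IN-list conditions, one per column.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in Calcite. */
final class NdvBoundSketch {
  private NdvBoundSketch() {}

  /**
   * Upper bound on NDV(groupColumns) implied by a conjunction of
   * "col = v" / "col IN (v1, ..., vn)" conditions.
   *
   * @param groupColumns  columns whose combined NDV is being estimated
   * @param allowedValues for each constrained column, the distinct values
   *                      the condition permits (x = 'a' becomes {"a"})
   * @return the product of the value-list sizes, or positive infinity if
   *         some column is unconstrained and no bound can be derived
   */
  static double upperBound(Set<String> groupColumns,
      Map<String, Set<String>> allowedValues) {
    double bound = 1.0;
    for (String column : groupColumns) {
      Set<String> values = allowedValues.get(column);
      if (values == null) {
        // No equality/IN condition on this column: nothing to deduce.
        return Double.POSITIVE_INFINITY;
      }
      bound *= values.size();
    }
    return bound;
  }

  /** Caps an existing estimate (e.g. the one from the input) by the bound. */
  static double refine(double inputEstimate, Set<String> groupColumns,
      Map<String, Set<String>> allowedValues) {
    return Math.min(inputEstimate, upperBound(groupColumns, allowedValues));
  }
}
```

With `x` mapped to {a, b} and `y` mapped to {c, d, e}, `upperBound` over {x, y} returns 6.0, matching the example above. Using `refine` keeps the existing estimate whenever it is already tighter than the bound, so the predicate can only lower the estimate, never raise it.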

liyafan82, Jan 19 '21 01:01