feat: continue visit children expr of short-circuit expr
Which issue does this PR close?
Closes #.
Rationale for this change
Recently. I find that some common subexpr doesn't be eliminated via CommonSubexprEliminate optimizer. for example
below logical_plan don't reuse expr rest.a + test.b
Filter: test.a + test.b = Int32(2) OR test.a + test.b = Int32(3) OR test.c > Int32(4)
Projection: test.a, test.b, test.c
TableScan: test
I notice that test.a + test.b is child expr of short-circuit expr test.a + test.b = Int32(2) OR test.a + test.b = Int32(3). according to new datafusion code in common_subexpr_eliminate.rs:
https://github.com/apache/datafusion/blob/1155b0b15e6ce3a8d5d28e5ecaebf4706448c548/datafusion/optimizer/src/common_subexpr_eliminate.rs#L856-L859
optimizer will jump short-circuit expr and don't visit children expr recursively, that it will lose the possibility of discovering that children expressions are common.
I think it should be mainly to check whether it needs to count its expression rather than to avoid exploring child expressions recursively. so, I think need to remove short-circuit expr checking from func f_down and add checking in below code on func f_up:
https://github.com/apache/datafusion/blob/1155b0b15e6ce3a8d5d28e5ecaebf4706448c548/datafusion/optimizer/src/common_subexpr_eliminate.rs#L878-L882
After modification. we can't get new optimized plan:
Projection: test.a, test.b, test.c
Filter: __common_expr_1 = Int32(2) OR __common_expr_1 = Int32(3) OR test.c > Int32(4)
Projection: test.a + test.b AS __common_expr_1, test.a, test.b, test.c
Projection: test.a, test.b, test.c
TableScan: test
What changes are included in this PR?
allow continue visit children expr of short-circuit expr and avoid count short-circuit expr itself.
Are these changes tested?
yes, and I add test case test_filter_expressions for reused child expr of short-circuit expr.
Are there any user-facing changes?
no.
This test isn't passing CI, so marking it as draft while that is resolved. Let us know if you need help @zhuliquan