[SPARK-40355][SQL] Improve pushdown for orc & parquet when cast scenario
What changes were proposed in this pull request?
Table Schema:
| a | b | part |
|---|---|---|
| int | string | string |
SQL:
select * from table where b = 1 and part = 1 (both literals are integers)
A. Before this PR, the filter on partition column 'part' is pushed down, but the filter on column 'b' is not.
B. After this PR, the filters on both partition column 'part' and column 'b' are pushed down.
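To illustrate the difference, here is a minimal, self-contained sketch in plain Python (not Spark code). It assumes, as a simplification, that a data source filter can only be produced when a comparison has a bare column on one side and a literal on the other, so a predicate that wraps the column in a cast is not translated. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Cast:
    child: Column
    to_type: str


@dataclass(frozen=True)
class EqualTo:
    left: Union[Column, Cast]
    right: Literal


def translate_to_source_filter(pred: EqualTo) -> Optional[Tuple[str, str, object]]:
    """Return an (op, column, value) filter, or None if not pushable (toy rule)."""
    if isinstance(pred.left, Column):
        return ("=", pred.left.name, pred.right.value)
    return None  # the column is hidden behind a cast, so nothing is pushed


# `b = 1` with b: string is modeled here as cast(b as int) = 1
pred_b = EqualTo(Cast(Column("b"), "int"), Literal(1))
pushed_b = translate_to_source_filter(pred_b)

# `a = 1` with a: int needs no cast and translates directly
pred_a = EqualTo(Column("a"), Literal(1))
pushed_a = translate_to_source_filter(pred_a)
```

In this toy model, `pushed_b` is `None` while `pushed_a` is a usable filter, which mirrors case A above for the data column.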
Why are the changes needed?
A. Improve query performance and reduce I/O reads.
B. Keep pushedFilters consistent with partitionFilters.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Passed GitHub Actions (GA) and added new unit tests (UT).
cc @cloud-fan @huaxingao @wangyum FYI
Can one of the admins verify this patch?
We have an optimizer rule UnwrapCastInBinaryComparison, does it solve your problem? And can you give some code pointers to explain why we have inconsistency between data and partition filters today?
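For context, the idea behind unwrapping a cast in a binary comparison can be sketched as follows. This is a simplified Python model, not Spark's implementation: it assumes a cast from a narrower integral type to a wider one, handles only equality, and ignores null semantics. The names `unwrap_cast_eq` and `RANGES` are hypothetical.

```python
# Value ranges of the narrower integral types (toy model).
RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
}


def unwrap_cast_eq(column: str, from_type: str, literal: int):
    """Rewrite `cast(column as wider_type) = literal` without the cast.

    Returns ("=", column, literal) when the literal fits in from_type,
    otherwise False, since no value of from_type can equal it.
    """
    lo, hi = RANGES[from_type]
    if lo <= literal <= hi:
        return ("=", column, literal)
    return False


in_range = unwrap_cast_eq("c", "short", 1)
out_of_range = unwrap_cast_eq("c", "short", 100000)
```

Once the cast is moved off the column, the comparison has the bare-column form that filter pushdown can typically translate. Whether this covers the string-to-integer case in this PR is exactly the open question in the comment above.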
I will try!
Gentle ping, @panbingkun .
I will try with UnwrapCastInBinaryComparison
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!