[SPARK-40355][SQL] Improve pushdown for orc & parquet when cast scenario

Open panbingkun opened this issue 3 years ago • 4 comments

What changes were proposed in this pull request?

Table Schema:

Column  Type
a       int
b       string
part    string

SQL:

select * from table where b = 1(integer) and part = 1(integer)

A. Before this PR, the partition column 'part' is pushed down, but the data column 'b' is not.

image

B. After applying this PR, both the partition column 'part' and the column 'b' are pushed down.

image
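The difference between the two columns can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions, not Spark code: the classes `Attr`, `Literal`, `Cast`, `EqualTo` and the functions `evaluate` and `translate_filter` are hypothetical stand-ins. It assumes that comparing a string column with an integer literal wraps the column in a cast, that partition predicates are evaluated by the engine against partition values, and that data predicates are only handed to the file format when they have the plain `column = literal` shape.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Hypothetical, minimal expression tree (not Spark's Catalyst classes).
@dataclass(frozen=True)
class Attr:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Cast:
    child: "Expr"
    to_type: str  # only "int" is modeled in this sketch


@dataclass(frozen=True)
class EqualTo:
    left: "Expr"
    right: "Expr"


Expr = Union[Attr, Literal, Cast, EqualTo]


def evaluate(expr, row):
    """Engine-side evaluation of an expression against a dict of column values."""
    if isinstance(expr, Attr):
        return row[expr.name]
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, Cast):
        value = evaluate(expr.child, row)
        try:
            return int(value)
        except (TypeError, ValueError):
            return None  # models a failed cast producing NULL
    if isinstance(expr, EqualTo):
        left, right = evaluate(expr.left, row), evaluate(expr.right, row)
        return None if left is None or right is None else left == right
    raise TypeError(f"unsupported expression: {expr!r}")


def translate_filter(expr) -> Optional[tuple]:
    """Return a source-level filter only for the plain `column = literal` shape."""
    if (
        isinstance(expr, EqualTo)
        and isinstance(expr.left, Attr)
        and isinstance(expr.right, Literal)
    ):
        return ("EqualTo", expr.left.name, expr.right.value)
    return None  # anything else (e.g. a cast around the column) is not pushed


# Both predicates get a cast because a string column is compared with an int.
data_pred = EqualTo(Cast(Attr("b"), "int"), Literal(1))
part_pred = EqualTo(Cast(Attr("part"), "int"), Literal(1))

# Partition pruning: the engine evaluates the predicate on partition values.
partitions = [{"part": "1"}, {"part": "2"}]
kept = [p for p in partitions if evaluate(part_pred, p)]

# Data filter pushdown: the cast hides the column, so nothing is translated.
pushed = translate_filter(data_pred)
```

In this model `kept` contains only the partition with `part = "1"`, while `pushed` is `None`. One caveat a rewrite has to respect: `evaluate(data_pred, {"b": "01"})` is also true, so replacing the cast predicate with the string comparison `b = '1'` would not return the same rows.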

Why are the changes needed?

A. Improve query performance and reduce IO reads.

B. Keep pushedFilters consistent with partitionFilters.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GitHub Actions (GA) and add new unit tests (UT).

panbingkun avatar Sep 06 '22 11:09 panbingkun

cc @cloud-fan @huaxingao @wangyum FYI

LuciferYang avatar Sep 06 '22 14:09 LuciferYang

Can one of the admins verify this patch?

AmplabJenkins avatar Sep 07 '22 10:09 AmplabJenkins

We have an optimizer rule UnwrapCastInBinaryComparison, does it solve your problem? And can you give some code pointers to explain why we have inconsistency between data and partition filters today?

cloud-fan avatar Sep 07 '22 12:09 cloud-fan
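For context, the general idea behind unwrapping a cast in a binary comparison can be sketched as follows. This is a simplified illustration and not Spark's implementation of `UnwrapCastInBinaryComparison`: the function `unwrap_cast_equality` and the `RANGES` table are hypothetical, and only equality on a column that was widened from a smaller integral type is modeled. Whether the rule covers the string-to-integer case from this PR is the open question in the thread.

```python
from typing import Tuple

# Value ranges of a few signed integral types (8, 16 and 32 bits).
RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
}


def unwrap_cast_equality(column: str, column_type: str, literal: int) -> Tuple:
    """Rewrite `cast(column AS wider_type) = literal` into a cast-free filter.

    If the literal fits in the column's own type, compare the bare column with
    the literal. Otherwise no value of the column can match, which this sketch
    represents as a filter that never keeps a row.
    """
    low, high = RANGES[column_type]
    if low <= literal <= high:
        return ("EqualTo", column, literal)
    return ("AlwaysFalse",)


in_range = unwrap_cast_equality("s", "short", 1)
out_of_range = unwrap_cast_equality("s", "short", 100000)
```

Once the column is no longer wrapped in a cast, the predicate has the plain `column = literal` shape that a file-format filter can accept. The rewrite is safe here because widening an integral value is one-to-one; a string-to-integer cast is not, since several different strings can cast to the same integer.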

> We have an optimizer rule UnwrapCastInBinaryComparison, does it solve your problem? And can you give some code pointers to explain why we have inconsistency between data and partition filters today?

I will try!

panbingkun avatar Sep 07 '22 13:09 panbingkun

Gentle ping, @panbingkun .

dongjoon-hyun avatar Oct 06 '22 07:10 dongjoon-hyun

> UnwrapCastInBinaryComparison

I will try with UnwrapCastInBinaryComparison.

panbingkun avatar Oct 06 '22 08:10 panbingkun

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot] avatar Jan 15 '23 00:01 github-actions[bot]