[feature-wip](vectorized) Support block aggregate in scanner
Proposed changes
Issue Number: close #10083
In the scanner thread, adjacent rows with the same key are aggregated, which reduces data transmission and improves the parallelism of aggregation.
Note: The current ScanNode does not support column pruning, so predicate columns are included when key columns are compared, which results in a poor aggregation effect. This will be optimized once column pruning is supported.
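To illustrate the idea described above, here is a minimal sketch of merging adjacent rows that share a key. It is not the code in this PR: the `Row` struct, the single SUM-style value column, and the `aggregate_adjacent` function are hypothetical names chosen for the example, and a real block would be columnar rather than a vector of rows.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical row type: one key column and one value column
// that is assumed to use SUM aggregation.
struct Row {
    int64_t key;
    int64_t value;
};

// Merge each run of adjacent rows with the same key into a single row.
// Rows with equal keys that are NOT adjacent stay separate, so a later
// aggregation step would still have to combine them.
std::vector<Row> aggregate_adjacent(const std::vector<Row>& block) {
    std::vector<Row> out;
    for (const Row& row : block) {
        if (!out.empty() && out.back().key == row.key) {
            out.back().value += row.value;  // same key as previous row: fold in
        } else {
            out.push_back(row);             // new key: start a new run
        }
    }
    return out;
}
```

In this sketch, any extra column that takes part in the key comparison shortens the runs of equal keys, which is consistent with the note above about predicate columns hurting the aggregation effect.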
Problem Summary:
Describe the overview of changes.
Checklist(Required)
- Does it affect the original behavior: (Yes/No/I Don't know)
- Has unit tests been added: (Yes/No/No Need)
- Has document been added or modified: (Yes/No/No Need)
- Does it need to update dependencies: (Yes/No)
- Are there any changes that cannot be rolled back: (Yes/No)
Further comments
If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...
Hi, I got a compile failure with clang:
segment_iterator.cpp:893:43: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
} else if (0xffffffffffffffff == mask) {
Maybe we can use UINT_MAX here.
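As a standalone sketch of the suggestion (not the actual code in segment_iterator.cpp): the 64-bit constant can never equal a 32-bit `mask`, so the comparison should use a constant of the same width. The helper name `all_bits_set` is hypothetical, and `mask` is assumed to be a 32-bit unsigned value, as the error message indicates. `UINT_MAX` gives the same result wherever `unsigned int` is 32 bits; `std::numeric_limits` just ties the constant to the type explicitly.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when every bit of a 32-bit mask is set.
// Comparing against the maximum of the mask's own type avoids the
// -Wtautological-constant-out-of-range-compare error that a 64-bit
// constant such as 0xffffffffffffffff triggers.
inline bool all_bits_set(uint32_t mask) {
    return mask == std::numeric_limits<uint32_t>::max();
}
```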
Fixed
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
I think we could merge it to 1.1-lts first. On master, there are many issues related to parallelism. For example, there is only one node for the bitmap_union function; we could improve its performance by using multiple threads. We may also be able to reduce hash table build or probe time with multithreading. I think we should consider a pipeline mechanism to unify all parallelism-related work.
We could talk about pipeline next week @liutang123 @wangbo @zenoyang
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and feel free to ask a maintainer to remove the Stale tag!