[feature-wip](vectorized) Support block aggregate in scanner

Open zenoyang opened this issue 3 years ago • 2 comments

Proposed changes

Issue Number: close #10083

In the scanner thread, adjacent rows with the same key are aggregated, which reduces data transmission and improves the parallelism of aggregation.

Note: The current ScanNode does not support column pruning, so predicate columns are included when the key columns are compared, which makes the aggregation less effective. This will be optimized once column pruning is supported.
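
To illustrate the idea of aggregating adjacent rows that share a key, here is a minimal sketch. It is not the code from this PR: the `Row` struct, the single SUM value column, and the `aggregate_adjacent` function are hypothetical names chosen for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical row layout: one key column and one SUM-aggregated value column.
struct Row {
    int64_t key;
    int64_t value;
};

// Merge each run of adjacent rows that share a key by summing their values.
// Rows with equal keys that are NOT adjacent stay separate, so a later
// aggregation step is still needed to produce the final result.
std::vector<Row> aggregate_adjacent(const std::vector<Row>& block) {
    std::vector<Row> out;
    for (const Row& row : block) {
        if (!out.empty() && out.back().key == row.key) {
            out.back().value += row.value;  // same key as the previous row: fold in
        } else {
            out.push_back(row);             // new key: start a new run
        }
    }
    return out;
}
```

In this sketch, if an extra column (such as a predicate column) were part of the comparison, rows that agree on the real key but differ in that column would start new runs, which is the weaker aggregation effect the note above describes.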

Problem Summary:

Describe the overview of changes.

Checklist(Required)

  1. Does it affect the original behavior: (Yes/No/I Don't know)
  2. Has unit tests been added: (Yes/No/No Need)
  3. Has document been added or modified: (Yes/No/No Need)
  4. Does it need to update dependencies: (Yes/No)
  5. Are there any changes that cannot be rolled back: (Yes/No)

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

zenoyang avatar Jun 13 '22 03:06 zenoyang

Hi, I got a compile failure with clang:

segment_iterator.cpp:893:43: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
            } else if (0xffffffffffffffff == mask) {

Maybe we can use UINT_MAX here.
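
A minimal sketch of that suggestion, assuming `mask` is an `unsigned int` as the compiler message states. The helper name `all_bits_set` is hypothetical and is not taken from segment_iterator.cpp:

```cpp
#include <climits>

// Hypothetical helper: true when every bit of the mask is set.
// UINT_MAX has the same type and width as `mask`, so the comparison is
// meaningful. A 64-bit constant such as 0xffffffffffffffff can never equal
// a 32-bit unsigned int, which is what clang reports with
// -Wtautological-constant-out-of-range-compare.
inline bool all_bits_set(unsigned int mask) {
    return mask == UINT_MAX;
}
```

If the mask were declared with a fixed-width type instead, `std::numeric_limits<T>::max()` for that type would be another way to keep the constant and the variable the same width.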

BiteTheDDDDt avatar Jun 21 '22 08:06 BiteTheDDDDt

Fixed

zenoyang avatar Jun 24 '22 11:06 zenoyang

PR approved by at least one committer and no changes requested.

github-actions[bot] avatar Sep 16 '22 03:09 github-actions[bot]

PR approved by anyone and no changes requested.

github-actions[bot] avatar Sep 16 '22 03:09 github-actions[bot]

I think we could merge it into 1.1-lts first. For master, there are many issues related to parallelism. For example, there is only one node for the bitmap_union function; we could improve performance by using multiple threads. We may also reduce hash table build or probe time by using multiple threads. I think we should consider a pipeline mechanism to unify all parallelism-related work.

We could talk about pipeline next week @liutang123 @wangbo @zenoyang

yiguolei avatar Oct 26 '22 02:10 yiguolei

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and feel free to ask a maintainer to remove the Stale tag!

github-actions[bot] avatar Apr 25 '23 00:04 github-actions[bot]