[enhancement] Direct dispatch with parallel.

Open avamingli opened this issue 2 years ago • 0 comments

Cloudberry Database version

No response

What happened

Direct dispatch in parallel mode may bring no improvement over non-parallel direct dispatch.

Non-parallel direct dispatch:

gpadmin=# explain select a, count(*) from dd_part_singlecol where a=1 group by a;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..2160.85 rows=467 width=12)
   ->  GroupAggregate  (cost=0.00..2154.62 rows=156 width=12)
         Group Key: dd_part_singlecol.a
         ->  Append  (cost=0.00..2152.28 rows=156 width=4)
               ->  Seq Scan on dd_part_singlecol_1_prt_2 dd_part_singlecol_1  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
               ->  Seq Scan on dd_part_singlecol_1_prt_3 dd_part_singlecol_2  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
               ->  Seq Scan on dd_part_singlecol_1_prt_4 dd_part_singlecol_3  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
               ->  Seq Scan on dd_part_singlecol_1_prt_5 dd_part_singlecol_4  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
               ->  Seq Scan on dd_part_singlecol_1_prt_6 dd_part_singlecol_5  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
               ->  Seq Scan on dd_part_singlecol_1_prt_extra dd_part_singlecol_6  (cost=0.00..358.58 rows=26 width=4)
                     Filter: (a = 1)
 Optimizer: Postgres query optimizer
(17 rows)

gpadmin=# select a, count(*) from dd_part_singlecol where a=1 group by a;
 a | count
---+-------
 1 |     1
(1 row)

Time: 3.517 ms

Parallel direct dispatch:

gpadmin=# explain select a, count(*) from dd_part_singlecol where a=1 group by a;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=1079.26..1091.72 rows=935 width=12)
   ->  Finalize HashAggregate  (cost=1079.26..1080.81 rows=156 width=12)
         Group Key: dd_part_singlecol.a
         ->  Redistribute Motion 2:6  (slice2; segments: 2)  (cost=0.00..1078.87 rows=78 width=12)
               Hash Key: dd_part_singlecol.a
               Hash Module: 3
               ->  Partial GroupAggregate  (cost=0.00..1077.31 rows=78 width=12)
                     Group Key: dd_part_singlecol.a
                     ->  Parallel Append  (cost=0.00..1076.14 rows=78 width=4)
                           ->  Seq Scan on dd_part_singlecol_1_prt_2 dd_part_singlecol_1  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
                           ->  Seq Scan on dd_part_singlecol_1_prt_3 dd_part_singlecol_2  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
                           ->  Seq Scan on dd_part_singlecol_1_prt_4 dd_part_singlecol_3  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
                           ->  Seq Scan on dd_part_singlecol_1_prt_5 dd_part_singlecol_4  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
                           ->  Seq Scan on dd_part_singlecol_1_prt_6 dd_part_singlecol_5  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
                           ->  Seq Scan on dd_part_singlecol_1_prt_extra dd_part_singlecol_6  (cost=0.00..358.58 rows=26 width=4)
                                 Filter: (a = 1)
 Optimizer: Postgres query optimizer
(22 rows)

gpadmin=# select a, count(*) from dd_part_singlecol where a=1 group by a;
 a | count
---+-------
 1 |     1
(1 row)

Time: 4.156 ms

Slice2's original Motion(6:6) is reduced to Motion(2:6) by direct dispatch.

It still has to Gather(6:1), because a parallel plan is used.

We should reconsider how a direct-dispatchable plan is handled in parallel mode.
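To make the comparison concrete, here is a minimal sketch that tallies the Motion nodes in the two plans above. It only parses the `senders:receivers` annotations from the EXPLAIN text. The helper names (`motion_summary`, `total_senders`) are hypothetical, and treating the sender count as a proxy for overhead is an assumption for illustration, not how the planner costs these plans.

```python
import re

# Matches Motion nodes in EXPLAIN output, e.g. "Gather Motion 6:1"
# or "Redistribute Motion 2:6".
MOTION_RE = re.compile(r"(\w+) Motion (\d+):(\d+)")


def motion_summary(plan_text):
    """Return (kind, senders, receivers) for each Motion node in the plan text."""
    return [(kind, int(s), int(r)) for kind, s, r in MOTION_RE.findall(plan_text)]


def total_senders(plan_text):
    """Sum the sending processes over all Motion nodes (illustrative proxy only)."""
    return sum(senders for _, senders, _ in motion_summary(plan_text))


# Motion lines copied from the two plans in this issue.
non_parallel = "Gather Motion 1:1  (slice1; segments: 1)"
parallel = """
Gather Motion 6:1  (slice1; segments: 6)
   ->  Redistribute Motion 2:6  (slice2; segments: 2)
"""

print(motion_summary(non_parallel))  # [('Gather', 1, 1)]
print(motion_summary(parallel))      # [('Gather', 6, 1), ('Redistribute', 2, 6)]
print(total_senders(non_parallel), total_senders(parallel))  # 1 8
```

The non-parallel plan has a single 1:1 Gather. The parallel plan adds a Redistribute Motion and widens the Gather to 6:1, which is consistent with the slightly longer timing reported above (4.156 ms vs 3.517 ms) for a query that returns one row.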

What you think should happen instead

We should reconsider direct dispatch in parallel mode; it may not be better than a single process.

How to reproduce

Use the dd_part_singlecol table from the regression tests.

Operating System

Ubuntu

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes, I am willing to submit a PR!

Code of Conduct

avamingli · Jul 24 '23 02:07