[improve](partial update) sort the rids to read for alignments to reduce the number of random accesses

Open bobhan1 opened this issue 2 years ago • 6 comments

Proposed changes

This PR sorts the row IDs (rids) that must be read for alignment during a partial update. Reading them in sorted order reduces the number of random accesses and improves performance.
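The idea can be illustrated with a minimal sketch. This is not the code from this PR: `FakeColumn` and `read_sorted` are hypothetical names, and the in-memory vector only stands in for a column whose reads are cheaper when rids are visited in ascending order. The sketch sorts the positions by rid, reads in that order, and writes each value back to the slot the caller asked for, so the result keeps the original order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an on-disk column (an assumption for illustration).
struct FakeColumn {
    std::vector<int64_t> values;
    std::size_t backward_seeks = 0;  // number of reads that jumped to a smaller rid
    std::size_t last_rid = 0;

    int64_t read(std::size_t rid) {
        if (rid < last_rid) ++backward_seeks;
        last_rid = rid;
        return values[rid];
    }
};

// Visit `rids` in ascending order, but return the values in the caller's order.
std::vector<int64_t> read_sorted(FakeColumn& col, const std::vector<std::size_t>& rids) {
    // order[i] is an index into `rids`; sorting it by rid gives the read order.
    std::vector<std::size_t> order(rids.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&rids](std::size_t a, std::size_t b) { return rids[a] < rids[b]; });

    std::vector<int64_t> out(rids.size());
    for (std::size_t pos : order) {
        out[pos] = col.read(rids[pos]);  // place the value at its original position
    }
    return out;
}
```

Sorting an index array rather than the rids themselves keeps the mapping back to the requested order, at the cost of one extra vector of the same length.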

bobhan1 • Nov 24 '23 06:11

run buildall

bobhan1 • Nov 24 '23 06:11

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] • Nov 24 '23 06:11

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] • Nov 24 '23 06:11

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.7 seconds
stream load tsv: 576 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.6 seconds inserted 10000000 Rows, about 337K ops/s
storage size: 17099463712 Bytes

doris-robot • Nov 24 '23 08:11

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 5f70575bd909d27606ff4ee00d7ebb0cccd37e06, data reload: false

run tpch-sf100 query with default conf and session variables (columns: query, cold run, hot run 1, hot run 2, best hot run; all times in ms)
q1	4914	4636	4639	4636
q2	364	143	159	143
q3	2036	1928	1830	1830
q4	1403	1262	1221	1221
q5	3949	3931	3987	3931
q6	252	130	130	130
q7	1450	895	867	867
q8	2815	2795	2791	2791
q9	9691	9838	9469	9469
q10	3450	3488	3501	3488
q11	375	254	250	250
q12	438	289	289	289
q13	4561	3769	3807	3769
q14	316	278	302	278
q15	573	518	520	518
q16	671	596	581	581
q17	1148	957	949	949
q18	7929	7570	7612	7570
q19	1670	1675	1696	1675
q20	526	308	305	305
q21	4453	4009	4085	4009
q22	477	375	360	360
Total cold run time: 53461 ms
Total hot run time: 49059 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off (columns: query, cold run, hot run 1, hot run 2, best hot run; all times in ms)
q1	4624	4572	4594	4572
q2	343	223	254	223
q3	4030	4000	4010	4000
q4	2731	2727	2718	2718
q5	9781	9780	9666	9666
q6	248	120	125	120
q7	3067	2494	2488	2488
q8	4426	4433	4414	4414
q9	12984	12871	12948	12871
q10	4039	4123	4151	4123
q11	873	729	656	656
q12	978	830	820	820
q13	4319	3551	3553	3551
q14	382	340	353	340
q15	567	510	520	510
q16	723	672	658	658
q17	3806	3865	3858	3858
q18	9764	9310	9236	9236
q19	1802	1782	1806	1782
q20	2411	2086	2071	2071
q21	8827	8585	8504	8504
q22	922	835	855	835
Total cold run time: 81647 ms
Total hot run time: 78016 ms

doris-robot • Nov 24 '23 08:11

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and feel free to ask a maintainer to remove the Stale tag!

github-actions[bot] • May 23 '24 00:05