
[Improvement] Add workload condition for query used memory

Open wangbo opened this issue 1 year ago • 4 comments

Proposed changes

Add a workload policy condition on the memory used by a query, so that queries can be killed based on their memory usage.

1 Create a policy that cancels any query whose memory usage exceeds 100 MB (104857600 bytes).

create workload policy memory_used_policy conditions(query_be_memory_bytes > 104857600) actions(cancel_query);

2 Submit a query; it is cancelled once it exceeds the limit.

mysql [hits]>insert into hits2 select * from hits;
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.8)[INTERNAL_ERROR]query 81c39e2b3ecf461c-bc78ad6d9b6173d2 cancelled by workload policy memory_used_policy, id:29028
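
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `mib_to_bytes`, `memory_policy_sql` and `parse_cancel_message` are hypothetical helper names, the statement text is copied from step 1, and the regular expression is fitted to the single error message shown in step 2, so other messages may differ. The sketch formats and parses text only and does not connect to Doris.

```python
import re


def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


def memory_policy_sql(name: str, limit_mib: int) -> str:
    """Render a policy statement in the same shape as the one in step 1."""
    limit = mib_to_bytes(limit_mib)
    return (
        f"create workload policy {name} "
        f"conditions(query_be_memory_bytes > {limit}) "
        f"actions(cancel_query);"
    )


# Pattern fitted to the one error message quoted in step 2 (an assumption).
# The source does not say what the trailing "id" identifies, so it is kept as-is.
CANCEL_RE = re.compile(
    r"query (?P<query_id>[0-9a-f]+-[0-9a-f]+) "
    r"cancelled by workload policy (?P<policy>\w+), id:(?P<id>\d+)"
)


def parse_cancel_message(message: str):
    """Return the query id, policy name and id from a cancellation message, or None."""
    match = CANCEL_RE.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(memory_policy_sql("memory_used_policy", 100))
    error = (
        "ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.8)"
        "[INTERNAL_ERROR]query 81c39e2b3ecf461c-bc78ad6d9b6173d2 "
        "cancelled by workload policy memory_used_policy, id:29028"
    )
    print(parse_cancel_message(error))
```

Generating the threshold from a MiB value avoids typing the byte count by hand, and matching the policy name in the error text is one way a client could tell a policy cancellation apart from other internal errors.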

wangbo avatar May 30 '24 09:05 wangbo

Thank you for your contribution to Apache Doris. Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website. See Doris Document.

doris-robot avatar May 30 '24 09:05 doris-robot

run buildall

wangbo avatar May 30 '24 09:05 wangbo

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar May 30 '24 10:05 github-actions[bot]

TeamCity be ut coverage result: Function Coverage: 36.26% (9229/25449) Line Coverage: 27.62% (75682/274025) Region Coverage: 26.83% (39165/145958) Branch Coverage: 23.60% (19888/84280) Coverage Report: http://coverage.selectdb-in.cc/coverage/e1141a84ceecf9e60b00e5d9ca8d060dd7c88877_e1141a84ceecf9e60b00e5d9ca8d060dd7c88877/report/index.html

doris-robot avatar May 30 '24 11:05 doris-robot

The metric name should be consistent with the other metrics in BE, for example:

be_scan_rows: the number of rows scanned by a SQL statement within a single BE process. If the statement runs with multiple concurrent instances, this is the cumulative value across them.
be_scan_bytes: the number of bytes scanned by a SQL statement within a single BE process. If the statement runs with multiple concurrent instances, this is the cumulative value across them.
query_time: the running time of a SQL statement on a single BE process, measured in milliseconds.

So I think the name should be be_memory_used_bytes.

yiguolei avatar Jun 03 '24 02:06 yiguolei

PR approved by at least one committer and no changes requested.

github-actions[bot] avatar Jun 03 '24 03:06 github-actions[bot]

PR approved by anyone and no changes requested.

github-actions[bot] avatar Jun 03 '24 03:06 github-actions[bot]

run buildall

yiguolei avatar Jun 03 '24 06:06 yiguolei

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Jun 03 '24 06:06 github-actions[bot]

run buildall

wangbo avatar Jun 03 '24 07:06 wangbo

clang-tidy review says "All clean, LGTM! :+1:"

github-actions[bot] avatar Jun 03 '24 07:06 github-actions[bot]

TeamCity be ut coverage result: Function Coverage: 36.29% (9240/25461) Line Coverage: 27.65% (75841/274282) Region Coverage: 26.87% (39273/146167) Branch Coverage: 23.60% (19911/84356) Coverage Report: http://coverage.selectdb-in.cc/coverage/803cc8292bfc458c204d386434d35b64e60cdead_803cc8292bfc458c204d386434d35b64e60cdead/report/index.html

doris-robot avatar Jun 03 '24 08:06 doris-robot

TPC-H: Total hot run time: 40410 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 803cc8292bfc458c204d386434d35b64e60cdead, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17944	4825	4452	4452
q2	2611	196	199	196
q3	11725	1233	1183	1183
q4	10593	880	817	817
q5	7550	2730	2687	2687
q6	228	136	135	135
q7	976	615	604	604
q8	9272	2104	2069	2069
q9	9118	6545	6557	6545
q10	8989	3759	3705	3705
q11	452	248	239	239
q12	420	221	223	221
q13	18903	2958	2971	2958
q14	269	225	216	216
q15	516	470	463	463
q16	526	397	391	391
q17	993	703	702	702
q18	7998	7465	7399	7399
q19	6074	1505	1572	1505
q20	660	331	312	312
q21	4954	3273	4011	3273
q22	398	342	338	338
Total cold run time: 121169 ms
Total hot run time: 40410 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4412	4265	4268	4265
q2	369	278	285	278
q3	3018	2766	2766	2766
q4	1819	1600	1630	1600
q5	5271	5253	5294	5253
q6	216	128	127	127
q7	2153	1770	1753	1753
q8	3218	3336	3324	3324
q9	8366	8298	8324	8298
q10	3855	3647	3643	3643
q11	602	503	515	503
q12	760	599	607	599
q13	16577	2955	2966	2955
q14	300	258	258	258
q15	519	473	477	473
q16	474	432	415	415
q17	1773	1508	1467	1467
q18	7825	7622	7397	7397
q19	1685	1636	1661	1636
q20	1990	1789	1803	1789
q21	9114	4727	4798	4727
q22	616	534	514	514
Total cold run time: 74932 ms
Total hot run time: 54040 ms

doris-robot avatar Jun 03 '24 08:06 doris-robot

TPC-DS: Total hot run time: 172146 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 803cc8292bfc458c204d386434d35b64e60cdead, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	921	381	384	381
query2	6459	2455	2377	2377
query3	6655	211	212	211
query4	19961	17316	17294	17294
query5	4155	452	448	448
query6	253	161	153	153
query7	4589	299	293	293
query8	325	290	284	284
query9	8674	2510	2462	2462
query10	429	296	289	289
query11	10505	10017	9945	9945
query12	141	89	86	86
query13	1657	361	384	361
query14	10067	6933	7671	6933
query15	233	190	189	189
query16	7794	275	276	275
query17	1807	521	511	511
query18	1901	273	266	266
query19	196	145	144	144
query20	90	86	89	86
query21	209	130	137	130
query22	4323	4116	3894	3894
query23	33582	33205	32857	32857
query24	12078	2786	2872	2786
query25	663	354	362	354
query26	1822	153	160	153
query27	3016	323	318	318
query28	7736	2118	2134	2118
query29	1128	607	618	607
query30	297	148	153	148
query31	940	735	706	706
query32	86	51	52	51
query33	761	287	269	269
query34	1020	486	471	471
query35	742	597	591	591
query36	1088	957	923	923
query37	280	67	76	67
query38	2803	2776	2717	2717
query39	867	798	784	784
query40	281	125	124	124
query41	54	50	55	50
query42	121	101	97	97
query43	586	543	561	543
query44	1253	728	743	728
query45	196	166	166	166
query46	1082	760	736	736
query47	1881	1779	1776	1776
query48	377	291	298	291
query49	1202	397	413	397
query50	789	396	385	385
query51	6940	6828	6721	6721
query52	110	93	96	93
query53	394	291	286	286
query54	976	451	449	449
query55	72	71	74	71
query56	279	260	259	259
query57	1153	1030	1050	1030
query58	259	239	235	235
query59	3380	3154	3177	3154
query60	290	267	273	267
query61	106	91	89	89
query62	666	466	456	456
query63	314	289	288	288
query64	9899	2200	1779	1779
query65	3180	3103	3143	3103
query66	1351	329	323	323
query67	15373	14877	14906	14877
query68	4442	586	554	554
query69	462	306	306	306
query70	1083	1110	1117	1110
query71	394	280	283	280
query72	7201	5215	5127	5127
query73	752	331	329	329
query74	6002	5538	5442	5442
query75	3387	2690	2661	2661
query76	2698	943	928	928
query77	444	293	294	293
query78	10248	9939	9729	9729
query79	2487	509	508	508
query80	1206	469	465	465
query81	583	220	218	218
query82	737	103	104	103
query83	237	167	165	165
query84	244	85	86	85
query85	1899	274	266	266
query86	516	323	321	321
query87	3268	3108	3058	3058
query88	3796	2464	2473	2464
query89	473	384	378	378
query90	1805	198	187	187
query91	133	108	172	108
query92	69	48	56	48
query93	1892	524	503	503
query94	1233	191	187	187
query95	404	308	313	308
query96	593	277	273	273
query97	3204	3049	3013	3013
query98	237	217	208	208
query99	1129	825	816	816
Total cold run time: 276075 ms
Total hot run time: 172146 ms

doris-robot avatar Jun 03 '24 08:06 doris-robot

ClickBench: Total hot run time: 30.84 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 803cc8292bfc458c204d386434d35b64e60cdead, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.04
query4	1.69	0.07	0.07
query5	0.51	0.49	0.50
query6	1.13	0.72	0.72
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.54	0.47	0.49
query10	0.55	0.54	0.55
query11	0.15	0.11	0.11
query12	0.15	0.13	0.12
query13	0.59	0.58	0.60
query14	0.78	0.78	0.77
query15	0.82	0.80	0.81
query16	0.36	0.35	0.36
query17	1.02	1.03	0.97
query18	0.20	0.26	0.23
query19	1.80	1.66	1.72
query20	0.02	0.01	0.01
query21	15.71	0.68	0.66
query22	3.92	6.94	2.46
query23	18.30	1.34	1.22
query24	1.57	0.32	0.20
query25	0.14	0.08	0.08
query26	0.28	0.17	0.17
query27	0.08	0.08	0.08
query28	13.35	1.01	0.99
query29	13.21	3.36	3.34
query30	0.24	0.06	0.06
query31	2.86	0.39	0.37
query32	3.30	0.48	0.47
query33	2.89	2.83	2.90
query34	17.25	4.43	4.36
query35	4.50	4.52	4.47
query36	0.65	0.44	0.46
query37	0.18	0.15	0.16
query38	0.15	0.15	0.14
query39	0.05	0.04	0.04
query40	0.17	0.14	0.15
query41	0.09	0.04	0.05
query42	0.05	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.71 s
Total hot run time: 30.84 s

doris-robot avatar Jun 03 '24 08:06 doris-robot

PR approved by at least one committer and no changes requested.

github-actions[bot] avatar Jun 04 '24 02:06 github-actions[bot]