[Performance]: rollup and time window function. aggregate before sort.

Open fengttt opened this issue 1 year ago • 1 comments

Is there an existing issue for performance?

  • [X] I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93):
- Hardware parameters:
- OS type:
- Others:

Details of Performance

We need to implement efficient processing of rollup and time window functions. In both cases, we should run the aggregate first (call it phase 1), then sort, and process the sorted aggregates in phase 2. While we could hash-aggregate in phase 2 too, in general sorting once is easier and better for these two cases, and sorted output is usually what users expect anyway.
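To make the two-phase idea concrete for rollup, here is a minimal Python sketch, not MatrixOne's actual implementation. The function names (`phase1_hash_agg`, `rollup_sum`), the two-column grouping key `(a, b)`, and the `SUM` aggregate are all assumptions chosen for illustration. Phase 1 hash-aggregates raw rows on the full grouping key; phase 2 sorts the (usually much smaller) aggregated result once and streams over it to emit subtotals.

```python
from collections import defaultdict


def phase1_hash_agg(rows):
    """Phase 1: hash-aggregate raw (a, b, value) rows on the full key (a, b)."""
    partial = defaultdict(int)
    for a, b, value in rows:
        partial[(a, b)] += value
    return partial


def rollup_sum(rows):
    """Phase 2: sort the aggregated groups once, then stream over them,
    emitting a subtotal row each time the leading key `a` changes.

    None marks a rolled-up column; keys are assumed sortable and non-None.
    """
    grouped = sorted(phase1_hash_agg(rows).items())
    out = []
    grand_total = 0
    current_a, subtotal = None, 0
    for (a, b), total in grouped:
        if current_a is not None and a != current_a:
            out.append((current_a, None, subtotal))  # subtotal for previous a
            subtotal = 0
        current_a = a
        out.append((a, b, total))
        subtotal += total
        grand_total += total
    if current_a is not None:
        out.append((current_a, None, subtotal))  # subtotal for the last a
    out.append((None, None, grand_total))  # grand total row
    return out


rows = [("x", 1, 10), ("y", 1, 5), ("x", 2, 7), ("x", 1, 3)]
result = rollup_sum(rows)
for row in result:
    print(row)
```

Because the sort runs over aggregated groups rather than raw rows, its cost depends on the number of distinct keys, and the output comes back already ordered by the grouping columns.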

Additional information

For example: Have you compared MatrixOne with other databases? If yes, what's their difference?

fengttt avatar Aug 20 '24 05:08 fengttt

to be tested

aronchanisme avatar Oct 23 '24 09:10 aronchanisme

Performance will be recorded in this thread: https://github.com/matrixorigin/MO-Cloud/issues/4307

aronchanisme avatar Oct 24 '24 03:10 aronchanisme

https://github.com/matrixorigin/MO-Cloud/issues/4307#issuecomment-2461199214

aronchanisme avatar Nov 25 '24 03:11 aronchanisme

rollup already aggregates before sorting, so no change is needed there. Performance comparison for time window after the optimization, with 10 million rows inserted:

ddl: CREATE TABLE IF NOT EXISTS tb_test ( ts TIMESTAMP primary key, temperature FLOAT, humidity INT );

sql: select _wstart, _wend, max(temperature), min(humidity), sum(humidity) from tb_test interval(ts, 100, second);

commit: ae8f3607ba76c95a4828a8ec94d1ff91dd354ccb

Image

commit: 4cfb6bc5701a798989efdec7877829947cb307e4

Image

After the optimization, performance improves by an order of magnitude when the data volume is large.
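For readers unfamiliar with the aggregate-before-sort approach for time windows, the following Python sketch mirrors the shape of the test query (max temperature, min humidity, sum humidity per 100-second window). It is an illustration only, not the code in either commit. It assumes timestamps are integer epoch seconds and that windows are aligned to multiples of the width; the function names are hypothetical, and MatrixOne's real window alignment may differ.

```python
def phase1_bucket_agg(rows, width):
    """Phase 1: hash-aggregate (ts, temperature, humidity) rows by window start.

    Each bucket holds [max_temperature, min_humidity, sum_humidity].
    Assumption: windows are aligned to multiples of `width` seconds.
    """
    buckets = {}
    for ts, temperature, humidity in rows:
        start = ts - ts % width
        agg = buckets.get(start)
        if agg is None:
            buckets[start] = [temperature, humidity, humidity]
        else:
            agg[0] = max(agg[0], temperature)
            agg[1] = min(agg[1], humidity)
            agg[2] += humidity
    return buckets


def interval_agg(rows, width=100):
    """Phase 2: sort only the window starts, then emit one row per window as
    (wstart, wend, max_temperature, min_humidity, sum_humidity)."""
    buckets = phase1_bucket_agg(rows, width)
    return [(start, start + width, *buckets[start]) for start in sorted(buckets)]


# Unsorted input: no sort over raw rows is needed.
rows = [
    (250, 21.5, 40),
    (10, 20.0, 55),
    (199, 22.0, 50),
    (120, 19.0, 60),
    (90, 23.5, 45),
]
windows = interval_agg(rows)
for w in windows:
    print(w)
```

With 10 million input rows and 100-second windows, the sort in phase 2 touches only one entry per window instead of every row, which is consistent with the gain growing as the data volume grows.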

iamlinjunhong avatar May 22 '25 08:05 iamlinjunhong

Confirmed, closed.

heni02 avatar Jun 13 '25 02:06 heni02