Implement partition compaction grouper

Open alexqyle opened this issue 1 year ago • 0 comments

What this PR does:

This PR implements the partition compaction grouper.

Introduced new files for partition compaction:

  • partitioned_group_info: This file acts as a compaction plan. It records how the source blocks in a compaction time range are assigned to partitions for compaction. The partitionedGroupID in the file is unique for a particular time range.
  • partition_visit_marker: A visit marker file for each partition under compaction. It prevents multiple compactors from working on the same partition at the same time, similar to the block visit marker.
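
To make the two files more concrete, here is a minimal Go sketch of what they might hold. The struct names, field names, JSON keys, the hash-based `groupID` helper, and the `heldByOther` check are all illustrative assumptions, not the definitions from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// partition lists the source blocks assigned to one partition (hypothetical shape).
type partition struct {
	PartitionID int      `json:"partitionID"`
	Blocks      []string `json:"blocks"` // block IDs kept as plain strings here
}

// partitionedGroupInfo is a hypothetical compaction plan for one time range.
type partitionedGroupInfo struct {
	PartitionedGroupID uint32      `json:"partitionedGroupID"`
	RangeStart         int64       `json:"rangeStart"`
	RangeEnd           int64       `json:"rangeEnd"`
	Partitions         []partition `json:"partitions"`
}

// partitionVisitMarker records which compactor is working on a partition
// (hypothetical shape).
type partitionVisitMarker struct {
	CompactorID        string `json:"compactorID"`
	PartitionedGroupID uint32 `json:"partitionedGroupID"`
	PartitionID        int    `json:"partitionID"`
	VisitTime          int64  `json:"visitTime"` // Unix seconds of the last update
}

// groupID derives an ID from the tenant and time range, so the same range
// always maps to the same ID. This scheme is an assumption; a 32-bit hash is
// only unique with high probability.
func groupID(userID string, rangeStart, rangeEnd int64) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s/%d-%d", userID, rangeStart, rangeEnd)
	return h.Sum32()
}

// heldByOther reports whether a different compactor holds an unexpired marker.
func (m partitionVisitMarker) heldByOther(self string, nowUnix, timeoutSec int64) bool {
	expired := nowUnix-m.VisitTime > timeoutSec
	return m.CompactorID != self && !expired
}

func main() {
	info := partitionedGroupInfo{
		PartitionedGroupID: groupID("tenant-1", 0, 7200000),
		RangeStart:         0,
		RangeEnd:           7200000,
		Partitions: []partition{
			{PartitionID: 0, Blocks: []string{"block-a", "block-b"}},
		},
	}
	out, err := json.MarshalIndent(info, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))

	m := partitionVisitMarker{
		CompactorID:        "compactor-1",
		PartitionedGroupID: info.PartitionedGroupID,
		VisitTime:          1000,
	}
	// A second compactor checks the marker 60 seconds later with a 300-second timeout.
	fmt.Println(m.heldByOther("compactor-2", 1060, 300)) // prints true
}
```

In this sketch a compactor would skip any partition for which `heldByOther` returns true and would refresh `VisitTime` while it works, so that a crashed compactor's marker eventually expires.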

Here is the high-level algorithm of the partition compaction grouper:

  1. Group blocks by time range
  2. Load existing partitioned_group_info files
  3. Gather information about each time range and determine which time ranges the grouper can take compaction jobs from
  4. Create partitioned groups from grouped blocks
  5. Sanitize partitions from each partitioned group
  6. Return the ready-to-compact partitioned groups to Thanos for compaction
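
The first steps of the list above can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not the grouper from this PR: the `block`, `timeRange`, and `plan` types, the alignment of ranges to multiples of `rangeSize`, and the rule "reuse an existing plan for a range, otherwise create a new one" are all hypothetical, and sanitizing and visit-marker checks are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// block is a simplified stand-in for a TSDB block covering [MinTime, MaxTime).
type block struct {
	ID      string
	MinTime int64
	MaxTime int64
}

// timeRange is an aligned compaction window [Start, End).
type timeRange struct{ Start, End int64 }

// plan is a hypothetical stand-in for a loaded or newly created
// partitioned_group_info entry.
type plan struct {
	Range    timeRange
	BlockIDs []string
	Reused   bool // true when the plan was loaded rather than created
}

// groupByRange buckets blocks into windows aligned to rangeSize (step 1).
// Blocks that cross a window boundary are skipped. Times are assumed
// non-negative.
func groupByRange(blocks []block, rangeSize int64) map[timeRange][]block {
	groups := make(map[timeRange][]block)
	for _, b := range blocks {
		start := (b.MinTime / rangeSize) * rangeSize
		tr := timeRange{Start: start, End: start + rangeSize}
		if b.MaxTime > tr.End {
			continue
		}
		groups[tr] = append(groups[tr], b)
	}
	return groups
}

// plansFor returns one plan per range, ordered by start time. It reuses an
// existing plan when one was loaded (step 2) and creates a new one from the
// grouped blocks otherwise (step 4).
func plansFor(groups map[timeRange][]block, existing map[timeRange]plan) []plan {
	var out []plan
	for tr, blocks := range groups {
		if p, ok := existing[tr]; ok {
			p.Reused = true
			out = append(out, p)
			continue
		}
		p := plan{Range: tr}
		for _, b := range blocks {
			p.BlockIDs = append(p.BlockIDs, b.ID)
		}
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Range.Start < out[j].Range.Start })
	return out
}

func main() {
	blocks := []block{
		{ID: "a", MinTime: 0, MaxTime: 5},
		{ID: "b", MinTime: 5, MaxTime: 10},
		{ID: "c", MinTime: 10, MaxTime: 20},
		{ID: "d", MinTime: 8, MaxTime: 12}, // crosses a boundary, skipped
	}
	existing := map[timeRange]plan{
		{Start: 0, End: 10}: {Range: timeRange{Start: 0, End: 10}, BlockIDs: []string{"a", "b"}},
	}
	for _, p := range plansFor(groupByRange(blocks, 10), existing) {
		fmt.Printf("%d-%d reused=%v blocks=%v\n", p.Range.Start, p.Range.End, p.Reused, p.BlockIDs)
	}
}
```

Reusing a stored plan rather than recomputing it is what would let several compactors agree on the same partition assignment for a time range, which is the role the description gives to partitioned_group_info.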

Introduced meta_extensions to save the partition information of the result block in meta.json. This information can be used to assign the block to the proper partition in the next round of compaction.
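
As an illustration of why saved partition information helps, the Go sketch below narrows down which partitions of the next round a block can belong to. It assumes series are sharded by `hash % partitionCount` in both rounds; the `partitionInfo` type, its JSON keys, and `candidatePartitions` are hypothetical and not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// partitionInfo is a hypothetical shape for the partition data stored with a
// result block.
type partitionInfo struct {
	PartitionedGroupID uint32 `json:"partitionedGroupID"`
	PartitionCount     int    `json:"partitionCount"`
	PartitionID        int    `json:"partitionID"`
}

// candidatePartitions returns the partitions of a round with newCount
// partitions that can contain series from a block described by info.
// It assumes info.PartitionID is in [0, info.PartitionCount).
func candidatePartitions(info *partitionInfo, newCount int) []int {
	if newCount <= 0 {
		return nil
	}
	var out []int
	known := info != nil && info.PartitionCount > 0
	switch {
	case known && newCount%info.PartitionCount == 0:
		// hash % oldCount == id implies hash % newCount is id, id+oldCount, ...
		for p := info.PartitionID; p < newCount; p += info.PartitionCount {
			out = append(out, p)
		}
	case known && info.PartitionCount%newCount == 0:
		// hash % oldCount == id implies hash % newCount == id % newCount.
		out = append(out, info.PartitionID%newCount)
	default:
		// No usable information: the block may hold series for any partition.
		for p := 0; p < newCount; p++ {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	raw := []byte(`{"partitionedGroupID": 42, "partitionCount": 2, "partitionID": 1}`)
	var info partitionInfo
	if err := json.Unmarshal(raw, &info); err != nil {
		panic(err)
	}
	fmt.Println(candidatePartitions(&info, 4)) // prints [1 3]
	fmt.Println(candidatePartitions(nil, 2))   // prints [0 1]
}
```

Without the stored information, every source block would have to be read by every partition; with it, a block produced as partition 1 of 2 only needs to be considered for partitions 1 and 3 of 4 under the assumed sharding rule.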

Which issue(s) this PR fixes: NA

Checklist

  • [x] Tests updated
  • [x] Documentation added
  • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

alexqyle · Aug 20 '24 00:08