HIVE-26443: Add priority queueing to compaction (DRAFT)

Open veghlaci05 opened this issue 3 years ago • 0 comments

What changes were proposed in this pull request?

Compaction queue items can now be labeled, and compaction workers can be assigned to these labels, so the queue can be segmented and its entries processed in parallel.

Why are the changes needed?

A single compaction queue, from which workers pick up requests in order of arrival, does not fit a multi-tenant environment. For example, compaction requests from one use case may go to a YARN queue that is shared and/or has limited resources, and these block other compaction requests whose workers would find plenty of resources in other, dedicated queues. It should be possible to put compaction requests into different compaction pools, each with its own set of assigned workers. High-priority or urgent requests can then use a dedicated compaction pool and avoid being blocked by other (long-running) compaction requests.
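
The pool idea can be illustrated with a minimal, self-contained sketch. This is not the code in this patch: the class `PooledCompactionQueue`, its methods, the `"default"` pool name, and the plain-string request format are all hypothetical and chosen only for illustration. It assumes that a request without a label falls into a default pool and that a worker only takes requests from the pool it is assigned to.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;

// Hypothetical illustration only; these are not Hive's actual classes.
class PooledCompactionQueue {
    // Assumed name for the pool that receives unlabeled requests.
    static final String DEFAULT_POOL = "default";

    // One FIFO queue per pool label.
    private final Map<String, Queue<String>> pools = new HashMap<>();

    // Adds a request to the given pool; null or empty labels go to the default pool.
    synchronized void enqueue(String pool, String request) {
        String key = (pool == null || pool.isEmpty()) ? DEFAULT_POOL : pool;
        pools.computeIfAbsent(key, k -> new ArrayDeque<>()).add(request);
    }

    // Returns the oldest request of the worker's own pool, or empty if there is none.
    synchronized Optional<String> poll(String workerPool) {
        Queue<String> queue = pools.get(workerPool);
        return queue == null ? Optional.empty() : Optional.ofNullable(queue.poll());
    }
}
```

In this sketch a worker assigned to an `urgent` pool receives its request immediately, even if a long-running request was enqueued earlier in the default pool, which is the blocking scenario described above.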

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually and through unit tests

veghlaci05, Aug 10 '22 10:08