Prioritizing Helix tasks in Helix state model

Open jackjlli opened this issue 4 years ago • 7 comments

Is your feature request related to a problem? Please describe.

Currently, Helix tasks are picked up by the participant based on their enqueue time. However, in some scenarios tasks that were queued later need to be picked up first (due to constraints such as disk usage).

Describe the solution you'd like

Helix should provide the ability to prioritize tasks based on some weighted value.

Additional context

E.g. there is a customized Helix state model called "SegmentOnlineOfflineStateModel" in Pinot: https://github.com/apache/pinot/blob/a5f3dc507e6441baca35dae5bbfad122356683b6/pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentOnlineOfflineStateModelFactory.java

The state transition "OFFLINE->ONLINE" downloads a new segment to local disk, and "OFFLINE->DROPPED" deletes a segment from local disk.

However, we notice that "OFFLINE->ONLINE" transitions always come before "OFFLINE->DROPPED" ones, which keeps the Pinot server busy downloading new segments until the disk fills up.

For example, there are 20 threads in the pool, but those threads appear to pick up tasks from the queue in FIFO order:

[Running, pool size = 20, active threads = 20, queued tasks = 558, completed tasks = 1538]
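To illustrate the kind of participant-side ordering being requested, here is a minimal sketch using only `java.util.concurrent`. It is not Helix code: `PrioritizedTransitionDemo`, `TransitionTask`, `priorityOf`, and the segment names are hypothetical, and the policy (deletions before downloads) is an assumption taken from the disk-usage scenario above. The pool is backed by a `PriorityBlockingQueue` instead of a FIFO queue, so queued "OFFLINE->DROPPED" work is dequeued ahead of queued "OFFLINE->ONLINE" work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrioritizedTransitionDemo {

  // Hypothetical wrapper: a lower priority value runs first; seq keeps FIFO order among equals.
  static final class TransitionTask implements Runnable, Comparable<TransitionTask> {
    final int priority;
    final long seq;
    final Runnable body;

    TransitionTask(int priority, long seq, Runnable body) {
      this.priority = priority;
      this.seq = seq;
      this.body = body;
    }

    @Override
    public void run() {
      body.run();
    }

    @Override
    public int compareTo(TransitionTask other) {
      int byPriority = Integer.compare(priority, other.priority);
      return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
    }
  }

  // Assumed policy: deletions free disk space, so they outrank downloads.
  static int priorityOf(String transition) {
    return "OFFLINE->DROPPED".equals(transition) ? 0 : 1;
  }

  static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Enqueues transitions in arrival order and returns the order in which they actually ran.
  static List<String> runDemo() {
    List<String> executionOrder = Collections.synchronizedList(new ArrayList<>());
    // One worker thread makes the dequeue order observable; the queue orders by compareTo.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());
    CountDownLatch gate = new CountDownLatch(1);
    long seq = 0;
    try {
      // Occupy the only worker so that everything submitted next has to wait in the queue.
      pool.execute(new TransitionTask(0, seq++, () -> awaitQuietly(gate)));
      String[][] arrivals = {
          {"OFFLINE->ONLINE", "seg1"},
          {"OFFLINE->ONLINE", "seg2"},
          {"OFFLINE->DROPPED", "seg0"},
          {"OFFLINE->ONLINE", "seg3"},
          {"OFFLINE->DROPPED", "seg9"},
      };
      for (String[] arrival : arrivals) {
        String label = arrival[0] + " " + arrival[1];
        // execute(), not submit(): submit() wraps tasks in a non-Comparable FutureTask.
        pool.execute(new TransitionTask(priorityOf(arrival[0]), seq++,
            () -> executionOrder.add(label)));
      }
      gate.countDown(); // release the worker; it now drains the queue by priority
      pool.shutdown();
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return new ArrayList<>(executionOrder);
  }
}
```

Calling `PrioritizedTransitionDemo.runDemo()` returns the two DROPPED entries (seg0, seg9) first, followed by the ONLINE entries in their arrival order (seg1, seg2, seg3). Note that a priority queue only reorders work that is still waiting; transitions already running on a thread are not preempted.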

jackjlli avatar Oct 04 '21 22:10 jackjlli

@junkaixue and @jiajunwang which release of helix can we possibly get this in? What may be the time frame? thanks.

mcvsubbu avatar Oct 04 '21 23:10 mcvsubbu

Let me clarify first. The Helix tasks mentioned here are the state transition tasks instead of the Task Framework tasks, right?

jiajunwang avatar Oct 08 '21 02:10 jiajunwang

@jiajunwang Correct, this only refers to the state transition tasks, not the Task Framework tasks.

jackjlli avatar Oct 13 '21 22:10 jackjlli

Hello, Picking up this thread.

Helix understands 2 kinds of messages that result in state transitions:

  • recovery rebalance and regular load-balance

Users can prioritize which transition takes precedence, e.g.:

    "listFields": {
      "STATE_PRIORITY_LIST": [ "LEADER", "STANDBY", "OFFLINE", "DROPPED" ],
      "STATE_TRANSITION_PRIORITYLIST": [
        "LEADER-STANDBY",
        "STANDBY-LEADER",
        "OFFLINE-STANDBY",
        "STANDBY-OFFLINE",
        "OFFLINE-DROPPED"
      ]
    },

Please look at the above and see if that works for the Pinot use case.

desaikomal avatar Aug 24 '22 18:08 desaikomal

@desaikomal AFAIK, the original request is about Participant-side priority. Currently, once multiple tasks are assigned to a Participant, they are queued up depending on local thread pool availability, and there is no customizable mechanism to control which task is picked first from the queue. The config you mentioned controls which message is sent first from the controller side.

jiajunwang avatar Aug 24 '22 21:08 jiajunwang

good to hear from you @jiajunwang :-) thanks for the context. will update the thread once we have some design in-place

desaikomal avatar Aug 25 '22 03:08 desaikomal

Sure, my pleasure : ) Thanks for working on this request.

jiajunwang avatar Aug 25 '22 19:08 jiajunwang

@mgao0 implemented the feature where users can specify a customized thread pool for different ST (state transition) messages: https://github.com/apache/helix/pull/2390

Hope the change addresses this requirement.

@jackjlli - can you please mark this issue as FIXED? As I remember, Molly worked with you all to get this closed.

thanks, komal

desaikomal avatar May 22 '23 17:05 desaikomal

We have confirmed that overriding the getExecutorService() method in StateModelFactory resolves this issue. Please close this issue as FIXED once you are satisfied with the request.
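As a rough illustration of the idea (not the actual Helix API), the sketch below shows the kind of per-transition pool selection that such an override could delegate to. `TransitionExecutorRouter`, `dedicate`, and `executorFor` are hypothetical names, and the pool sizes are arbitrary; the real `getExecutorService()` signature should be taken from the Helix source and the PR linked above. Giving "OFFLINE->DROPPED" its own pool means deletions no longer wait behind queued downloads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in, independent of Helix: maps a state transition to the pool that runs it.
class TransitionExecutorRouter {
  private final ExecutorService defaultPool;
  private final Map<String, ExecutorService> dedicatedPools = new ConcurrentHashMap<>();

  TransitionExecutorRouter(int defaultThreads) {
    this.defaultPool = Executors.newFixedThreadPool(defaultThreads);
  }

  // Reserve a separate pool for one transition type, e.g. OFFLINE -> DROPPED.
  void dedicate(String fromState, String toState, int threads) {
    dedicatedPools.computeIfAbsent(
        key(fromState, toState), k -> Executors.newFixedThreadPool(threads));
  }

  // A getExecutorService() override could return this value for the given transition.
  ExecutorService executorFor(String fromState, String toState) {
    return dedicatedPools.getOrDefault(key(fromState, toState), defaultPool);
  }

  void shutdown() {
    defaultPool.shutdown();
    dedicatedPools.values().forEach(ExecutorService::shutdown);
  }

  private static String key(String fromState, String toState) {
    return fromState + "->" + toState;
  }
}
```

With `router.dedicate("OFFLINE", "DROPPED", 1)`, `executorFor("OFFLINE", "DROPPED")` returns the dedicated pool, while every other transition falls back to the shared default pool. Unlike a priority queue, this approach isolates transition types rather than reordering them within one queue.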

desaikomal avatar May 23 '23 17:05 desaikomal