fix(taskframework): job is not started in resource idle node when deployed multi-nodes
What type of PR is this?
type-bug
What this PR does / why we need it:
A job is not started even though other nodes in a multi-node deployment have enough resources. When the JobStore acquires a trigger but resources are not available on that node, we should release the trigger and give other nodes a chance to acquire it.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Tested with two nodes: tasks are started evenly across both nodes.
Additional documentation e.g., usage docs, etc.:
Please add more information so the reviewer knows how this issue happens and how you solve it. I can get the issue and solution from your code @krihy
I can not accept your solution:
- StartPreparingJob is a daemon job on every node, we'd better not disable it on a specific node
- we have already got rateLimiter MonitorProcessRateLimiter, why does it work?
- Will other nodes be affected if you releaseAcquiredTrigger in ResourceDetectJobStore?
In cluster mode, all Quartz nodes compete for a trigger, and it is fired exactly once. If a node with no resources acquires the trigger and fires it, the job will not be started by StartPreparingJob in the QuartzSchedulerThread while loop, and the other nodes that do have enough resources will not acquire the trigger.
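
To make the scenario concrete, here is a minimal sketch of the behavior, not the actual Quartz or PR code. SharedTriggerStore, Node, and TriggerReleaseSketch are hypothetical names used only for illustration; the real store would be a clustered Quartz JobStore, and the boolean hasResources stands in for whatever resource check the node performs.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a clustered job store: one trigger, at most one owner.
final class SharedTriggerStore {
    private String owner; // null means the trigger is free to acquire

    synchronized boolean acquire(String nodeId) {
        if (owner != null) {
            return false; // some other node already holds the trigger
        }
        owner = nodeId;
        return true;
    }

    synchronized void release(String nodeId) {
        if (nodeId.equals(owner)) {
            owner = null; // trigger becomes available to other nodes again
        }
    }
}

// Hypothetical node: hasResources is an assumed, simplified resource check.
final class Node {
    final String id;
    final boolean hasResources;

    Node(String id, boolean hasResources) {
        this.id = id;
        this.hasResources = hasResources;
    }
}

final class TriggerReleaseSketch {
    // Returns the id of the node that started the job, or empty if no node did.
    static Optional<String> schedule(List<Node> nodes, boolean releaseWhenBusy) {
        SharedTriggerStore store = new SharedTriggerStore();
        for (Node node : nodes) {
            if (!store.acquire(node.id)) {
                continue; // trigger is held by a node that cannot run the job
            }
            if (node.hasResources) {
                return Optional.of(node.id); // this node starts the job
            }
            if (releaseWhenBusy) {
                store.release(node.id); // give other nodes a chance
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(new Node("node-a", false), new Node("node-b", true));
        System.out.println(schedule(nodes, false)); // Optional.empty
        System.out.println(schedule(nodes, true));  // Optional[node-b]
    }
}
```

Without the release, node-a keeps the trigger and node-b never gets it, so the job is not started. With the release, node-b acquires the trigger and starts the job.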
As we discussed before, in cluster mode each node should run StartPreparingJob.
So why don't we rely on MonitorProcessRateLimiter instead of disabling StartPreparingJob?