Arjun Singh Bora
The quota check for every job is left unchanged; that can be done in the other PR. Handling of parallel flows is also left unchanged for the other PR.
Also, be aware that this will break any job that needs more than 15 minutes.
Yes, let's use both the configs and mark one as deprecated.
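One way to honor both configs while deprecating one is to read the new key first and fall back to the old one with a warning. The sketch below is only an illustration of that idea: the class name `ConfigFallback`, the key names `example.timeout.new` and `example.timeout.old`, and the default value are all hypothetical and are not taken from the PR.

```java
import java.util.Properties;

// Minimal sketch: prefer the new config key, fall back to the deprecated one.
// All names here are hypothetical placeholders, not the PR's actual keys.
final class ConfigFallback {
    static final String NEW_KEY = "example.timeout.new";

    @Deprecated
    static final String OLD_KEY = "example.timeout.old";

    static String resolve(Properties props, String defaultValue) {
        // The new key always wins when it is set.
        String value = props.getProperty(NEW_KEY);
        if (value != null) {
            return value;
        }
        // Otherwise accept the deprecated key, but tell the user to migrate.
        value = props.getProperty(OLD_KEY);
        if (value != null) {
            System.err.println("Config '" + OLD_KEY + "' is deprecated; use '" + NEW_KEY + "' instead.");
            return value;
        }
        // Neither key is set: use the caller-supplied default.
        return defaultValue;
    }
}
```

With this shape, existing jobs that still set the old key keep working, and the old key can be removed in a later release once the warning has been visible for a while.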
Thanks for the review, but I actually no longer need this PR. I may need to add some new logs.
I found a workaround if timing out is the issue: one can increase fork.record.queue.timeout. If that does not help, I will ask for review on this PR. Thanks!
Can you add a 'Description' to the PR? I did not understand why you are trying to make a HelixTask return Failed when it is cancelled.
I see. Just keep in mind that sometimes we do not want to reschedule it, e.g. when the user cancelled the job in GaaS. Is there a way to do...
I think we should not set the status to "Failed" when the last execution is running. We should instead emit a new "SKIPPED" event. With this any further execution should be...
Should a) isFailedDag be ONLY within DagNode, always and everywhere, with an up-to-date value? b) isFailedDag be within DagNode, but also be in the MySQL table as a cache; still be always in...