flinkk8soperator

Failed deployment doesn't change the state in the Flink operator

Open lydian opened this issue 3 years ago • 0 comments

According to the state machine documentation, in dual mode a failure in either ClusterStarting or SubmittingJob should lead to the RollingBackJob state. However, when I tested, I noticed that:

  1. In the ClusterStarting state, when the deployment fails (usually because the image does not exist, or because the sidecar is not injected properly and some packages are missing), the Flink app gets stuck in ClusterStarting and never moves to the RollingBackJob state.
  2. In the SubmittingJob state, badly written Beam Python code sometimes causes the app to get stuck in SubmittingJob as well. I can see that the operator keeps trying to resubmit the job (and logs errors in the Flink operator) instead of changing to the RollingBackJob state as mentioned in the doc.
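
To make the difference concrete, here is a minimal sketch of the transitions I expected versus what I observed. This is not the operator's code: the `Phase` enum, `expected_next_phase`, and `observed_next_phase` are hypothetical names made up for illustration, and the real state machine has more phases than shown here.

```python
from enum import Enum


class Phase(Enum):
    # Only the phases mentioned in this issue, plus Running; names mirror the
    # phase strings, but this enum itself is hypothetical.
    CLUSTER_STARTING = "ClusterStarting"
    SUBMITTING_JOB = "SubmittingJob"
    ROLLING_BACK_JOB = "RollingBackJob"
    RUNNING = "Running"


# Phases from which, by my reading of the docs, a failure should trigger a rollback.
ROLLBACK_ON_FAILURE = {Phase.CLUSTER_STARTING, Phase.SUBMITTING_JOB}


def expected_next_phase(current: Phase, failed: bool) -> Phase:
    """Phase I expected after one reconcile pass (assumption based on the docs)."""
    if failed and current in ROLLBACK_ON_FAILURE:
        return Phase.ROLLING_BACK_JOB
    if current is Phase.CLUSTER_STARTING:
        return Phase.SUBMITTING_JOB
    if current is Phase.SUBMITTING_JOB:
        return Phase.RUNNING
    return current


def observed_next_phase(current: Phase, failed: bool) -> Phase:
    """Phase I actually observed: on failure the app stays where it is."""
    if failed and current in ROLLBACK_ON_FAILURE:
        return current  # stuck; the operator keeps retrying
    return expected_next_phase(current, failed)


if __name__ == "__main__":
    for phase in (Phase.CLUSTER_STARTING, Phase.SUBMITTING_JOB):
        print(
            phase.value,
            "expected:", expected_next_phase(phase, failed=True).value,
            "observed:", observed_next_phase(phase, failed=True).value,
        )
```

In both failing cases the expected and observed phases differ, which is the behaviour this issue is about.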

Wondering if I am missing some configuration which leads to this issue. Thanks!

lydian avatar Jul 11 '22 22:07 lydian