After scale-down, a pod may be created again because it is still terminating.
I found that after scaling down, a pod may be created again because it is still terminating.
Here is the code:
func (s *shardManager) Shards() ([]*shard.Shard, error) {
	// Note: this also returns pods that are terminating.
	pods, err := s.getPods(s.sts.Spec.Selector.MatchLabels)
	if err != nil {
		return nil, errors.Wrap(err, "list pod")
	}
	...
	return ret, nil
}
This is not a problem, since NotReady pods never participate in target assignment.
Yes, a NotReady pod will not participate in target assignment. However, if terminating the pod takes longer than the coordinator loop interval, the pod will be created again.
Let's look at the logic that calculates 'scale' in coordinator.go:
shards := getPod(owned by the prometheus statefulset) // includes pods whose status is terminating
scale := len(shards) // because 'shards' contains the terminating pod, scale >= sts.replicas
Focus only on tryScaleUp(shardInfo) here: because the 'changeAble' flag of a pod in the terminating state is false, the code never reaches 'scale--'.
Finally, ChangeScale creates the pods that are in the terminating state again.
🤔 That seems to be a bug.