kvass

After a scale-down, a pod may be created again because it is still terminating.

Open GoneLikeAir opened this issue 5 years ago • 3 comments

I found that after a scale-down, a pod may be created again because it is still terminating. Here is the code:

func (s *shardManager) Shards() ([]*shard.Shard, error) {
	// This call also returns pods that are terminating.
	pods, err := s.getPods(s.sts.Spec.Selector.MatchLabels)
	if err != nil {
		return nil, errors.Wrap(err, "list pod")
	}
	...
	return ret, nil
}

GoneLikeAir, Mar 12 '21 04:03

This is not a problem, since NotReady pods never participate in target assignment.

RayHuangCN, Mar 19 '21 04:03

> This is not a problem, since NotReady pods never participate in target assignment.

Yes, a NotReady pod will not participate in target assignment. However, if terminating the pod takes longer than the coordinator loop interval, the pod will be created again.

Let's look at the logic that calculates the 'scale' in coordinator.go:

shards := getPod(owned by the Prometheus StatefulSet)  // includes pods whose status is terminating
scale := len(shards) // because 'shards' contains the terminating pods, scale >= sts.replicas

Focus only on tryScaleUp(shardInfo) here: because the 'changeAble' flag of a pod in the terminating state is false, the code never reaches 'scale--'.

Finally, ChangeScale recreates the pods that are in the terminating state.
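The sequence can be illustrated with a small simulation. This is a sketch of the reasoning in this comment, not the code in coordinator.go: `shardState`, its `idle` field, and `nextScale` are hypothetical, and the shrink rule (drop trailing shards that are both changeable and idle) is a simplifying assumption. What it keeps from the description is that the scale starts at the number of listed pods and that `scale--` is never reached for a shard whose 'changeAble' flag is false.

```go
package main

import "fmt"

// shardState is a hypothetical, simplified view of one shard pod.
type shardState struct {
	name        string
	terminating bool // pod deletion is in progress
	changeAble  bool // false for a terminating pod, as described above
	idle        bool // assumption: the shard could be removed if changeable
}

// nextScale models one coordinator pass under the assumptions above.
// The scale starts at the number of listed shards and is decremented
// only for trailing shards that are both changeable and idle.
func nextScale(listed []shardState) int {
	scale := len(listed)
	for i := len(listed) - 1; i >= 0; i-- {
		s := listed[i]
		if !s.changeAble || !s.idle {
			break // 'scale--' is not reached for this shard
		}
		scale--
	}
	return scale
}

// withoutTerminating drops shards whose pod is being deleted.
func withoutTerminating(listed []shardState) []shardState {
	kept := make([]shardState, 0, len(listed))
	for _, s := range listed {
		if !s.terminating {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	replicas := 2 // StatefulSet replicas after the scale-down
	listed := []shardState{
		{name: "prometheus-0", changeAble: true},
		{name: "prometheus-1", changeAble: true},
		{name: "prometheus-2", terminating: true, changeAble: false, idle: true},
	}
	fmt.Printf("replicas=%d scale=%d\n", replicas, nextScale(listed))
	fmt.Printf("replicas=%d scale=%d (terminating pods filtered)\n",
		replicas, nextScale(withoutTerminating(listed)))
}
```

In this model the first computed scale is 3 while the StatefulSet wants 2 replicas, so a call that applies the computed scale would bring the terminating pod back. With terminating pods filtered out before counting, the scale matches the replica count.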

GoneLikeAir, Apr 27 '21 10:04

🤔 That seems to be a bug.

RayHuangCN, May 18 '21 07:05