Status value stuck in Creating for a long time
Seldon Core version: 1.13.1
Questions about Seldon Core when deploying a SeldonDeployment on Kubernetes: 1. How do I set a timeout period, instead of the deployment staying in the Creating state for a long time? 2. How do I undo the last deploy, similar to kubectl rollout undo in Kubernetes?
From the Kubernetes documentation: your Deployment may get stuck trying to deploy its newest ReplicaSet without ever completing. This can occur due to some of the following factors:
- Insufficient quota
- Readiness probe failures
- Image pull errors
- Insufficient permissions
- Limit ranges
- Application runtime misconfiguration
One way you can detect this condition is to specify a deadline parameter in your Deployment spec: .spec.progressDeadlineSeconds. This denotes the number of seconds the Deployment controller waits before indicating (in the Deployment status) that the Deployment progress has stalled.
Once the deadline has been exceeded, the Deployment controller adds a DeploymentCondition with the following attributes to the Deployment's .status.conditions:
type: Progressing
status: "False"
reason: ProgressDeadlineExceeded
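For reference, on a plain Kubernetes Deployment this field sits at the top level of the spec. A minimal sketch (the name and image below are placeholders, not from this issue):
kubectl apply -f - << END
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deadline-example   # placeholder name
spec:
  progressDeadlineSeconds: 60   # report the rollout as stalled after 60s without progress
  replicas: 1
  selector:
    matchLabels:
      app: deadline-example
  template:
    metadata:
      labels:
        app: deadline-example
    spec:
      containers:
      - name: app
        image: nginx   # placeholder image
END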
In Seldon Core, the state is always Creating. If I add a progressDeadlineSeconds property to the seldondeployment.yaml file, I get an unknown field "progressDeadlineSeconds" error.
For example,
kubectl apply -f - << END
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: canary-example2
  namespace: default
spec:
  predictors:
  - componentSpecs:
    - spec:
        progressDeadlineSeconds: 60
        containers:
        - name: classifier
          resources:
            requests:
              cpu: 2000m
              memory: 512Mi
            limits:
              cpu: 2000m
              memory: 512Mi
    name: baseline1
    replicas: 1
    traffic: 100
    graph:
      name: classifier
      modelUri: gs://seldon-models/v1.14.0-dev/sklearn/iris
      implementation: SKLEARN_SERVER
END
Error message: unknown field "progressDeadlineSeconds" in io.seldon.machinelearning.v1.SeldonDeployment.spec.predictors.componentSpecs.spec
Thank you very much for your help
Hi @chijunqing,
The elements under spec.predictors.componentSpecs are pod specifications, not deployment ones.
As a result, the progressDeadlineSeconds field does not belong here -- it is not part of a pod spec.
From looking through the controller logic (here in particular), it seems that there is no handling of progressDeadlineSeconds whatsoever.
I believe a pod's activeDeadlineSeconds field would not work either, as the pod would need to be scheduled first.
This would need to be a feature request to consider how to add this deployment field into SDeps.
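(For illustration only, a snippet assuming the manifest above: a pod-level field such as activeDeadlineSeconds is accepted at that position, which confirms the spec there really is a pod spec -- though, as noted, it would not solve the timeout problem:)
spec:
  predictors:
  - componentSpecs:
    - spec:
        activeDeadlineSeconds: 60   # a pod-spec field: validates here, unlike progressDeadlineSeconds
        containers:
        - name: classifier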
Thanks for your help @agrski. Yes, you are right; in Seldon Core it seems there is no handling of progressDeadlineSeconds whatsoever.
Is there any way to solve the following problems?
Question 1: How do I set a timeout period, instead of the deployment staying in the Creating state for a long time?
kubectl describe deployment -l seldon-deployment-id=canary-example2
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: canary-example2-baseline1-0-classifier-77f4cbffdc (1/1 replicas created)
NewReplicaSet: canary-example2-baseline1-0-classifier-688fdd4f85 (1/1 replicas created)
kubectl describe sdep canary-example2
Deployment Status:
canary-example2-baseline1-0-classifier:
Available Replicas: 1
Replicas: 2
Replicas: 2
State: Creating
The Deployment Status -> State value is Creating, so I do not know the true status of the newly released service. I want to undo the last deploy based on the state value, but I can't get the real state.
The Seldon state is one of [Creating, Failed, Available]. A failed release cannot be detected from this status: if it fails, the state stays Creating, because in Kubernetes the condition is Progressing=False with reason ProgressDeadlineExceeded.
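(For reference, that condition can be read directly from the underlying Deployment rather than from the SDep state -- a sketch using jq, with the label from the example above:)
kubectl get deploy -l seldon-deployment-id=canary-example2 -o json \
  | jq -r '.items[] | .metadata.name + ": Progressing=" + (.status.conditions[] | select(.type=="Progressing") | .status + " (" + .reason + ")")'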
Question 2: How do I undo the last deploy, similar to kubectl rollout undo in Kubernetes?
Hi @chijunqing,
1 - stuck in creating state
It's possible to patch progressDeadlineSeconds into a deployment after its creation:
kubectl patch deployment/<name> -p '{"spec":{"progressDeadlineSeconds": <duration>}}'
So, it's a little bit hacky, but you could write a small script (bash, Python, Go, etc.) that waits for the deployment to exist in k8s then patches its definition to add this field. That's fine to do, and I think shouldn't even affect any pods which have already started up (as the pod spec isn't changing).
However, I'm not sure if the Seldon controller will decide something has changed and undo this; you would need to test to confirm.
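For example, a minimal sketch of such a wait-and-patch script in bash (the deployment name is taken from the example in this thread; adjust namespace and deadline as needed):
#!/usr/bin/env bash
# Wait for the SDep-managed deployment to appear, then patch in a progress deadline.
set -euo pipefail

DEPLOY=canary-example2-baseline1-0-classifier   # deployment created by the SDep above
NAMESPACE=default
DEADLINE=60

# Poll until the deployment object exists in the cluster.
until kubectl -n "$NAMESPACE" get deployment "$DEPLOY" > /dev/null 2>&1; do
  sleep 2
done

# Add the deployment-level field that the SDep spec cannot express.
kubectl -n "$NAMESPACE" patch deployment "$DEPLOY" \
  -p "{\"spec\":{\"progressDeadlineSeconds\": $DEADLINE}}"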
2 - undo last deployment
The best way to do this would be to use a GitOps approach, so you can roll back indefinitely, see who made which changes, roll forward again, etc.
However, that might be more effort to set up than is worthwhile for you, and would ignore/override any patches you manually apply (as in 1 above).
Alternatively, and more simply, there's kubectl.kubernetes.io/last-applied-configuration. This annotation applies to SDeps as well as deployments.
So, this doesn't allow you to simply kubectl rollout undo, but you can extract this from an SDep and pipe it through to kubectl apply.
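A sketch of that, using the SDep from this thread (jq here only extracts the annotation):
kubectl -n default get sdep canary-example2 -o json \
  | jq -r '.metadata.annotations["kubectl.kubernetes.io/last-applied-configuration"]' \
  | kubectl apply -f -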
Note that last-applied-configuration only works if you use kubectl apply and not kubectl create.
Again, I'm not sure how this will interact with patches to the underlying deployment -- please do report back whether it works or not.
@agrski OK, let me verify the first one; thank you.
1 - stuck in creating state
kubectl patch deployment/
The resulting Kubernetes Deployment status:
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: canary-example2-baseline1-0-classifier-77f4cbffdc (1/1 replicas created)
NewReplicaSet: canary-example2-baseline1-0-classifier-688fdd4f85 (1/1 replicas created)
SeldonDeployment status:
Deployment Status:
canary-example2-baseline1-0-classifier:
Available Replicas: 1
Replicas: 2
Replicas: 2
State: Creating
Conclusion: the Seldon state does not appear to reflect the real status of the released service. Or can it be determined by other means?
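(For reference, the state being compared here can be read programmatically from the SDep:)
kubectl get sdep canary-example2 -o jsonpath='{.status.state}'   # prints Creating / Available / Failed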
From spinning up a kind cluster and testing with the manifest you provided above, it looks like changing a deployment works for a plain deployment, but not one managed by an SDep.
SDep-managed:
$ kubectl -n seldon patch deployments/canary-example2-baseline1-0-classifier -p '{"spec": {"progressDeadlineSeconds": 60}}' \
&& k -n seldon get deploy canary-example2-baseline1-0-classifier -o json \
| jq '.spec | .progressDeadlineSeconds'
deployment.apps/canary-example2-baseline1-0-classifier patched
600
Plain deployment:
$ k -n seldon get deploy -o yaml | sed 's/canary-example2-baseline1-0-classifier/canary-2/g' | kubectl apply -f -
deployment.apps/canary-2 created
$ kubectl -n seldon patch deployment/canary-2 -p '{"spec": {"progressDeadlineSeconds": 60}}' \
&& kubectl -n seldon get deploy canary-2 -o json \
| jq '.spec | .progressDeadlineSeconds'
deployment.apps/canary-2 patched
60
With a little more thinking, this is expected for k8s controllers - the spec change will be reverted by the reconcile loop in the Seldon operator.
So, this would need to be a feature request. It shouldn't be an especially big change, but we'll need a PR.
This feature is urgent for us. Is there anything I can do to help? When do you expect to be able to create a PR?
I may have time to look at this today, although I may not finish it until next week. If it's more urgent, you're welcome to put in a PR over the weekend which either myself or someone else can review.
good