Recent changes to v1_node_condition.py causing ValueError for MCEHardwareErrors on Oracle OKE cloud platform
Changes for 23.3.0: https://github.com/kubernetes-client/python/commit/b227345fb27fdee65b3218e0d7af18748d78469d
The change includes a hard-coded set of allowed condition types, which may not cover all conditions supported by different cloud providers:
https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_node_condition.py#L217
On Oracle's OKE cloud platform, nodes report an additional condition, MCEHardwareErrors, which causes a ValueError when the 23.3.0 client is used to list nodes:
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/models/v1_node_condition.py", line 221, in type
.format(type, allowed_values)
ValueError: Invalid value for `type` (MCEHardwareErrors), must be one of ['DiskPressure', 'MemoryPressure', 'NetworkUnavailable', 'PIDPressure',
Node conditions should be handled more dynamically, based on the cloud provider in use.
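For reference, the failure can be reproduced without a cluster (a minimal sketch, assuming kubernetes==23.3.0 is installed): constructing the generated model directly runs the same `type` setter that raises while deserializing the list-nodes response.

```python
# Minimal reproduction sketch, assuming kubernetes==23.3.0.
# Constructing the generated model directly invokes the same `type` setter
# that raises during deserialization of the list_node() response.
from kubernetes.client.models.v1_node_condition import V1NodeCondition

V1NodeCondition(status="True", type="MCEHardwareErrors")
# Raises:
# ValueError: Invalid value for `type` (MCEHardwareErrors), must be one of
# ['DiskPressure', 'MemoryPressure', 'NetworkUnavailable', 'PIDPressure', 'Ready']
```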
Same for GCP/GKE:
.../lib/python3.9/site-packages/kubernetes/client/models/v1_pod_readiness_gate.py", line 78, in condition_type
raise ValueError(
ValueError: Invalid value for `condition_type` (cloud.google.com/load-balancer-neg-ready), must be one of ['ContainersReady', 'Initialized', 'PodScheduled', 'Ready']
This is about pod conditions or pod readiness gates, but it looks to me like it has the same root cause as the node condition issue reported by @koaps.
Edit: it first happened after upgrading from 21.7.0 to 23.3.0.
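The same kind of direct reproduction applies to readiness gates (a sketch, again assuming kubernetes==23.3.0), which is consistent with the shared root cause: V1PodReadinessGate rejects any condition_type outside its hard-coded list.

```python
# Equivalent sketch for pod readiness gates, assuming kubernetes==23.3.0.
from kubernetes.client.models.v1_pod_readiness_gate import V1PodReadinessGate

V1PodReadinessGate(condition_type="cloud.google.com/load-balancer-neg-ready")
# Raises:
# ValueError: Invalid value for `condition_type` (cloud.google.com/load-balancer-neg-ready),
# must be one of ['ContainersReady', 'Initialized', 'PodScheduled', 'Ready']
```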
I have the same issue:
File "/usr/lib/python3.6/site-packages/kubernetes/client/models/v1_node_condition.py", line 221, in type
[2022-03-02T09:48:00.771Z] E .format(type, allowed_values)
[2022-03-02T09:48:00.771Z] E ValueError: Invalid value for `type` (FrequentDockerRestart), must be one of ['DiskPressure', 'MemoryPressure', 'NetworkUnavailable', 'PIDPressure', 'Ready']
Piling on here: I'm getting a similar error on AWS, in combination with the AWS Load Balancer Controller, which injects a readiness gate into pods. Below is the error I see:
File "../lib/python3.9/site-packages/kubernetes/client/models/v1_pod_readiness_gate.py", line 52, in __init__
self.condition_type = condition_type
File "../lib/python3.9/site-packages/kubernetes/client/models/v1_pod_readiness_gate.py", line 78, in condition_type
raise ValueError(
ValueError: Invalid value for `condition_type` (target-health.elbv2.k8s.aws/k8s-<name>-b77a905f9d), must be one of ['ContainersReady', 'Initialized', 'PodScheduled', 'Ready']
Same for Azure/AKS
ValueError: Invalid value for type (ContainerRuntimeProblem), must be one of ['DiskPressure', 'MemoryPressure', 'NetworkUnavailable', 'PIDPressure', 'Ready']
This is being fixed upstream. We will cut a new 1.23 client to backport the fix once PR https://github.com/kubernetes/kubernetes/pull/108740 is merged.
Is there any mitigation available while a permanent fix is being developed?
We are facing the issue too and would like to know of a workaround until a new version is released.
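One interim mitigation (a sketch, not an official fix, and assuming raw JSON output is acceptable for your use case) is to skip client-side model deserialization with `_preload_content=False` and parse the response yourself, so the strict enum setters are never invoked. Pinning the client to a pre-23.3.0 release such as 21.7.0 is another option.

```python
# Interim mitigation sketch: bypass model deserialization so the strict enum
# setters on V1NodeCondition are never invoked. Assumes kubernetes==23.3.0 and
# that working with raw JSON is acceptable.
import json

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

# _preload_content=False returns the raw HTTP response instead of a V1NodeList,
# so no V1NodeCondition objects are ever constructed.
resp = v1.list_node(_preload_content=False)
nodes = json.loads(resp.data)

for node in nodes["items"]:
    for cond in node.get("status", {}).get("conditions", []):
        print(node["metadata"]["name"], cond["type"], cond["status"])
```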
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@roycaihw has this been backported?
I see the change that broke this was reverted in https://github.com/kubernetes-client/python/pull/1789.
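A quick way to check whether the installed client still enforces the enum (a sketch, reusing the condition value from the original report):

```python
# Quick check of the installed client: does it still reject provider-specific
# node condition types?
from kubernetes.client.models.v1_node_condition import V1NodeCondition

try:
    V1NodeCondition(status="True", type="MCEHardwareErrors")
    print("OK: provider-specific condition types are accepted")
except ValueError as exc:
    print(f"Still affected: {exc}")
```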
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle stale
Do you still see this issue? I haven't checked the latest release myself recently, but the last time I checked, the code was fixed.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.