[Bug] Unauthorized error when deleting a cluster
What were you trying to accomplish?
Delete a cluster which has iam service accounts.
eksctl delete cluster --name my-cluster --wait
What happened?
I get an error when eksctl tries to determine whether the corresponding service account exists in the cluster.
2022-08-26 15:37:27 [✖] checking whether serviceaccount "kube-system/aws-node" exists: Unauthorized
The next run of eksctl delete cluster cleans up the cluster successfully.
The audit log shows that eksctl tries to get the service account as an anonymous user (at least, the request is not mapped to any user/group):
{
[...]
"requestURI": "/api/v1/namespaces/kube-system/serviceaccounts/aws-node",
"verb": "get",
"user": {},
"objectRef": {
"resource": "serviceaccounts",
"namespace": "kube-system",
"name": "aws-node",
"apiVersion": "v1"
},
"responseStatus": {
"status": "Failure",
"reason": "Unauthorized",
"code": 401
},
[...]
}
(It can happen with any service account; the first one to be checked raises the error.)
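To find other requests that failed the same way, the audit log can be filtered for 401 responses with an empty `user` object. The following is only a sketch: it assumes the audit events have been exported one JSON object per line, and the function name `find_unauthenticated_requests` is made up for illustration. The field names are the ones visible in the event above.

```python
import json


def find_unauthenticated_requests(lines):
    """Yield (verb, requestURI) for audit events rejected with 401 and no user identity."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        status = event.get("responseStatus", {})
        # An empty "user" object means the request was not mapped to any
        # user or group, as in the event shown above.
        if status.get("code") == 401 and not event.get("user"):
            yield event.get("verb"), event.get("requestURI")


# Sample event built from the fields in the audit log excerpt (assumed one event per line).
sample = json.dumps({
    "requestURI": "/api/v1/namespaces/kube-system/serviceaccounts/aws-node",
    "verb": "get",
    "user": {},
    "responseStatus": {"status": "Failure", "reason": "Unauthorized", "code": 401},
})
hits = list(find_unauthenticated_requests([sample]))
print(hits)
```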
I have no issue with eksctl delete iamserviceaccount -f config.yaml --include '*/*' -w --approve.
How to reproduce it?
Create a cluster with IAM service accounts. Delete the cluster. (It is reproducible 100% of the time in our environment; I'm not sure whether it happens with a simpler one.)
Logs
https://gist.github.com/vflaux/24e3aac2aefdaaa638764e07c8dc3f79
Anything else we need to know?
Versions
$ eksctl info
eksctl version: 0.109.0
kubectl version: v1.23.8
OS: linux
For me it goes like this:
2022-09-22 13:31:33 [ℹ] deleting EKS cluster "..."
2022-09-22 13:31:35 [ℹ] will drain 1 unmanaged nodegroup(s) in cluster "..."
2022-09-22 13:31:35 [ℹ] starting parallel draining, max in-flight of 1
2022-09-22 13:31:35 [ℹ] cordon node "ip-....us-east-2.compute.internal"
2022-09-22 13:32:39 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:33:41 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:34:44 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:35:46 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:36:49 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:37:52 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:38:55 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:39:57 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:41:00 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:42:02 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:43:05 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:44:07 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:45:10 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:46:12 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:47:15 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:48:18 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:49:20 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:50:23 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:51:26 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:52:28 [!] 1 pods are unevictable from node ip-....us-east-2.compute.internal
2022-09-22 13:52:59 [!] pod eviction error ("errs: [Unauthorized]") on node ip-....us-east-2.compute.internal
2022-09-22 13:53:04 [✖] Node group drain failed: %!w(*errors.errorString=&{errs: [Unauthorized]})
Error: errs: [Unauthorized]
This happened after I created the cluster myself, so I assume I'm the owner of the cluster. Why can't the owner delete the cluster?
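For reference, the timestamps in the log above put the Unauthorized error roughly 21 minutes after draining started. A small sketch of that arithmetic, assuming every log line begins with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp as in the output above (the helper name `elapsed_seconds` is made up for illustration):

```python
from datetime import datetime


def elapsed_seconds(first_line, last_line):
    """Return the seconds between the timestamps of two eksctl log lines."""
    fmt = "%Y-%m-%d %H:%M:%S"
    # Only the leading timestamp is parsed; the rest of the line is ignored.
    start = datetime.strptime(first_line[:19], fmt)
    end = datetime.strptime(last_line[:19], fmt)
    return (end - start).total_seconds()


start_line = '2022-09-22 13:31:35 [ℹ] starting parallel draining, max in-flight of 1'
error_line = '2022-09-22 13:52:59 [!] pod eviction error ("errs: [Unauthorized]") on node ip-....us-east-2.compute.internal'
elapsed = elapsed_seconds(start_line, error_line)
print(elapsed / 60)  # minutes between the start of draining and the error
```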
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
I still encounter this error each time I delete a cluster.
I'm running into the same bug in a different path (eksctl delete nodegroup, with parallelism of 15). It happens exactly at 20 mins, and I wonder if it's just a TTL in the kubeconfig?
Fwiw, smaller nodegroups will delete just fine.
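If the TTL guess is right, the failure mode is a credential fetched once at startup and reused after it has expired. The sketch below illustrates the general refresh-before-expiry pattern only. It is not eksctl's code; `ExpiringToken`, `fetch`, and the 1200-second TTL (chosen to mirror the 20 minutes mentioned above) are hypothetical.

```python
import time


class ExpiringToken:
    """Hypothetical bearer token that is treated as valid for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable returning a fresh token string
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behaviour can be tested
        self._value = None
        self._expires_at = 0.0

    def get(self, refresh_margin=60.0):
        """Return a token, fetching a new one if the current one is missing or near expiry."""
        now = self._clock()
        if self._value is None or now >= self._expires_at - refresh_margin:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Demonstration with a fake clock, so no real waiting is needed.
fake_now = [0.0]
issued = []


def fetch():
    issued.append(fake_now[0])
    return f"token-{len(issued)}"


token = ExpiringToken(fetch, ttl_seconds=1200, clock=lambda: fake_now[0])
first = token.get()    # fetched at t=0
fake_now[0] = 600.0
second = token.get()   # still well inside the TTL, so the cached token is reused
fake_now[0] = 1150.0
third = token.get()    # within 60 s of expiry, so a new token is fetched
print(first, second, third)
```

A long-running operation that calls `get()` before each API request would keep working past the TTL, whereas one that stores the first token and reuses it would start receiving Unauthorized responses once that token expires.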
@praneshpandurangan-at, @vflaux, what version of eksctl are you using? A fix was out in 0.116 that should address this issue as well.
I was using 0.112. I just tested with 0.118 and this issue is gone. Fixed by https://github.com/weaveworks/eksctl/pull/5772 I assume. Thanks @cPu1.
Great. Thanks for updating us, @vflaux.
Fixed by #5772 I assume, Thanks @cPu1.
Correct!