[Bug] EKSCTL 0.189.0 panic: runtime error: invalid memory address or nil pointer dereference
What were you trying to accomplish?
Using the standard YAML file, I always see the following warning about IRSA deprecation:
2024-08-28 13:50:25 [!] recommended policies were found for "vpc-cni" addon, but since OIDC is disabled on the cluster, eksctl cannot configure the requested permissions; the recommended way to provide IAM permissions for "vpc-cni" addon is via pod identity associations; after addon creation is completed, add all recommended policies to the config file, under `addon.PodIdentityAssociations`, and run `eksctl update addon`
2024-08-28 13:56:59 [!] IRSA has been deprecated; the recommended way to provide IAM permissions for "aws-efs-csi-driver" addon is via pod identity associations; after addon creation is completed, run `eksctl utils migrate-to-pod-identity`
2024-08-28 13:58:31 [!] IRSA has been deprecated; the recommended way to provide IAM permissions for "aws-ebs-csi-driver" addon is via pod identity associations; after addon creation is completed, run `eksctl utils migrate-to-pod-identity`
So I'm trying to create a new cluster with the addons and the pod identity agent at the same time, so that I don't need to run migrate-to-pod-identity after the cluster has been created. However, the process crashes with a SIGSEGV when creating the vpc-cni addon.
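For reference, the first warning above asks for the policies to be listed per addon under pod identity associations. A minimal sketch of what that might look like in the config file, assuming the vpc-cni service account is aws-node (as the CloudFormation stack name in the log further down suggests) and that the AWS managed AmazonEKS_CNI_Policy is the policy wanted. Both values are illustrative assumptions, not taken from a working cluster:

```yaml
# Sketch only: explicit pod identity association for one addon.
# The service account name and policy ARN are assumptions for illustration.
addons:
  - name: vpc-cni
    version: latest
    podIdentityAssociations:
      - serviceAccountName: aws-node
        permissionPolicyARNs:
          - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
  # The agent addon must be present for pod identity associations to work.
  - name: eks-pod-identity-agent
    version: latest
```

The config in the reproduction section uses useDefaultPodIdentityAssociations instead, which lets eksctl look up the recommended policies rather than naming them by hand.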
What happened?
This is the error I got:
2024-08-28 20:55:34 [ℹ] eksctl version 0.189.0
2024-08-28 20:55:34 [ℹ] using region ap-southeast-2
2024-08-28 20:55:35 [ℹ] setting availability zones to [ap-southeast-2a ap-southeast-2c ap-southeast-2b]
2024-08-28 20:55:35 [ℹ] subnets for ap-southeast-2a - public:192.168.0.0/19 private:192.168.96.0/19
2024-08-28 20:55:35 [ℹ] subnets for ap-southeast-2c - public:192.168.32.0/19 private:192.168.128.0/19
2024-08-28 20:55:35 [ℹ] subnets for ap-southeast-2b - public:192.168.64.0/19 private:192.168.160.0/19
2024-08-28 20:55:35 [ℹ] nodegroup "nodegroup1" will use "" [AmazonLinux2023/1.30]
2024-08-28 20:55:35 [ℹ] using Kubernetes version 1.30
2024-08-28 20:55:35 [ℹ] creating EKS cluster "tseldemo" in "ap-southeast-2" region with managed nodes
2024-08-28 20:55:35 [ℹ] 1 nodegroup (nodegroup1) was included (based on the include/exclude rules)
2024-08-28 20:55:35 [ℹ] will create a CloudFormation stack for cluster itself and 0 nodegroup stack(s)
2024-08-28 20:55:35 [ℹ] will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
2024-08-28 20:55:35 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-southeast-2 --cluster=tseldemo'
2024-08-28 20:55:35 [ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "tseldemo" in "ap-southeast-2"
2024-08-28 20:55:35 [ℹ] CloudWatch logging will not be enabled for cluster "tseldemo" in "ap-southeast-2"
2024-08-28 20:55:35 [ℹ] you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=ap-southeast-2 --cluster=tseldemo'
2024-08-28 20:55:35 [ℹ]
2 sequential tasks: { create cluster control plane "tseldemo",
2 sequential sub-tasks: {
5 sequential sub-tasks: {
1 task: { create addons },
wait for control plane to become ready,
associate IAM OIDC provider,
no tasks,
update VPC CNI to use IRSA if required,
},
create managed nodegroup "nodegroup1",
}
}
2024-08-28 20:55:35 [ℹ] building cluster stack "eksctl-tseldemo-cluster"
2024-08-28 20:55:37 [ℹ] deploying stack "eksctl-tseldemo-cluster"
2024-08-28 20:56:07 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:56:38 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:57:38 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:58:39 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 20:59:39 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:00:41 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:01:41 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:02:42 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:03:43 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-cluster"
2024-08-28 21:03:48 [ℹ] creating addon
2024-08-28 21:03:48 [ℹ] successfully created addon
2024-08-28 21:03:49 [ℹ] "addonsConfig.autoApplyPodIdentityAssociations" is set to true; will lookup recommended pod identity configuration for "vpc-cni" addon
2024-08-28 21:03:49 [ℹ] deploying stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:03:49 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:04:20 [ℹ] waiting for CloudFormation stack "eksctl-tseldemo-addon-vpc-cni-podidentityrole-aws-node"
2024-08-28 21:04:20 [ℹ] creating addon
2024-08-28 21:04:22 [ℹ] successfully created addon
2024-08-28 21:04:23 [ℹ] creating addon
2024-08-28 21:04:23 [ℹ] successfully created addon
2024-08-28 21:04:24 [ℹ] creating addon
2024-08-28 21:04:25 [ℹ] successfully created addon
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x20 pc=0x105ad8d58]
goroutine 189 [running]:
github.com/weaveworks/eksctl/pkg/actions/addon.(*Manager).Update(0x14000578640, {0x10816d3a8, 0x10a8db2e0}, 0x1400033ed20, {0x0, 0x0}, 0x15d3ef79800)
github.com/weaveworks/eksctl/pkg/actions/addon/update.go:121 +0xeb8
github.com/weaveworks/eksctl/pkg/actions/addon.CreateAddonTasks.func3()
github.com/weaveworks/eksctl/pkg/actions/addon/tasks.go:111 +0x90
github.com/weaveworks/eksctl/pkg/utils/tasks.(*GenericTask).Do(0x14000807158, 0x0?)
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:31 +0x34
github.com/weaveworks/eksctl/pkg/utils/tasks.doSingleTask(0x14000186960?, {0x10811d000, 0x14000807158})
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:202 +0xc8
github.com/weaveworks/eksctl/pkg/utils/tasks.doSequentialTasks(0x0?, {0x1400050c880, 0x5, 0x0?})
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:250 +0x6c
created by github.com/weaveworks/eksctl/pkg/utils/tasks.(*TaskTree).Do in goroutine 187
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:158 +0x258
How to reproduce it?
Create the cluster using eksctl create cluster -f cluster-SYD.yaml, with the following YAML file:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: tseldemo
  region: ap-southeast-2
  version: latest
  tags:
    karpenter.sh/discovery: tseldemo
karpenter:
  version: '1.0.1'
  createServiceAccount: true
managedNodeGroups:
  - name: nodegroup1
    instanceType: m6i.large
    privateNetworking: true
    desiredCapacity: 1
    iam:
      withAddonPolicies:
        albIngress: true
        autoScaler: true
        cloudWatch: true
        ebs: true
        efs: true
        fsx: true
        imageBuilder: true
        xRay: true
        awsLoadBalancerController: true
iam:
  withOIDC: true
  vpcResourceControllerPolicy: true
addons:
  - name: vpc-cni
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: kube-proxy
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: coredns
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: aws-efs-csi-driver
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: eks-pod-identity-agent
    version: latest
Logs
Anything else we need to know?
I use eksctl on macOS 14.6.1, installed via Homebrew.
Versions
$ eksctl info
eksctl version: 0.189.0
kubectl version: v1.29.1
OS: darwin
@ttirtawi, we have identified the issue and will work on a fix soon. In the meantime, I'd recommend working around this by removing iam.withOIDC as you do not seem to be using IRSA.
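As a rough sketch of that workaround, the relevant parts of the reporter's config would look something like the following, with iam.withOIDC dropped and everything else unchanged. This is an illustration based on the YAML in the issue description, not a tested configuration:

```yaml
# Sketch of the suggested workaround: same config, without iam.withOIDC.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: tseldemo
  region: ap-southeast-2
  version: latest
iam:
  # withOIDC: true   # removed: IRSA is not used, pod identity is
  vpcResourceControllerPolicy: true
addons:
  - name: vpc-cni
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: eks-pod-identity-agent
    version: latest
```

The remaining addons and the managedNodeGroups section from the original file are omitted here for brevity and would stay as they were.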
I encountered the same crash and can confirm that this workaround works. :+1:
However, I think there might be a related documentation issue on this page. The following sentence led me to assume that IRSA/OIDC was still required, so maybe it could be clarified:
Pod Identity Association leverages IRSA, however, it makes it configurable directly through EKS API, eliminating the need for using IAM API altogether.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.