bug(nodeadm): Unable to create nodepool with RAID0 localStorage strategy in version >= v20250620
What happened:
We are using the local NVMe SSDs available on i4i.xlarge instances as ephemeral storage for our Kubernetes workloads, specifically for an ElasticSearch deployment that requires low-latency I/O. This is achieved using the localStorage: RAID0 strategy within the node.eks.aws/v1alpha1 NodeConfig manifest, which is passed via user data to the EKS-optimized Amazon Linux 2023 (AL2023) AMI.
A previously successful nodepool, configured with the AMI version 1.31.7-20250610, failed to upgrade when we attempted to change its release version to 1.31.7-20250620. The nodepool creation process became stuck and eventually failed. We also tried a fresh deployment with the new AMI, which exhibited the same failure.
What you expected to happen: The nodepool upgrade should have completed successfully, and the new nodes should have joined the cluster with their local NVMe SSDs configured as a RAID0 volume and exposed as ephemeral storage for the kubelet. Creating a fresh nodepool should likewise have succeeded, with the nodes recognizing the local SSD as ephemeral storage.
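For context, this is roughly how we confirm the setup on a healthy v20250610 node (a sketch only; <node-name> is a placeholder, and the ~937 GB figure matches the i4i.xlarge local NVMe disk reported in the mkfs output below):

# On the node: the RAID0 array should exist and back /var/lib/kubelet
cat /proc/mdstat
findmnt /var/lib/kubelet

# From the cluster: ephemeral-storage capacity should reflect the ~937 GB local NVMe disk,
# not the 50 GiB root EBS volume
kubectl get node <node-name> -o jsonpath='{.status.capacity.ephemeral-storage}'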
How to reproduce it (as minimally and precisely as possible):
- Use the provided Terraform configuration to create a launch template and a managed EKS nodepool with i4i.xlarge instances and ami_type = "AL2023_x86_64_STANDARD".
- Set the initial release_version to 1.31.7-20250610.
- Observe that the nodepool launches successfully and the nodes join the cluster. You can confirm that the nodes recognize and use the local SSD as pod ephemeral storage.
- Update the release_version to >= 1.31.7-20250620 and apply the changes.
- Observe that the new nodes fail to initialize and get stuck in the Creating state indefinitely before eventually failing.
nodepool.tf
resource "aws_eks_node_group" "al23_raid0_1_31_7-20250610" {
cluster_name = data.aws_eks_cluster.atom_eks.name
node_group_name = "al23-raid0-1-31-7-20250610"
node_role_arn = var.node_pool_role_arn
subnet_ids = var.atom_private_subnet_ids
instance_types = ["i4i.xlarge"]
ami_type = "AL2023_x86_64_STANDARD"
release_version = "1.31.7-20250610"
launch_template {
id = aws_launch_template.al23_raid0.id
version = aws_launch_template.al23_raid0.latest_version
}
scaling_config {
desired_size = 2
max_size = 8
min_size = 2
}
lifecycle {
ignore_changes = [scaling_config[0].desired_size]
create_before_destroy = true
}
labels = {
"sage_es" = "true"
"raid0" = "v20250610"
}
taint {
key = "dedicated"
value = "sage_es"
effect = "NO_SCHEDULE"
}
# Required for local disk RAID0
update_config {
max_unavailable = 1
}
}
resource "aws_launch_template" "al23_raid0" {
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = 50
}
}
user_data = base64encode(templatefile("${path.module}/i4i_nvme_nodeadm.userdata_raid0.tftpl", {
CLUSTER_NAME = data.aws_eks_cluster.atom_eks.name
API_SERVER_ENDPOINT = data.aws_eks_cluster.atom_eks.endpoint
CA_AUTHORITY_B64 = data.aws_eks_cluster.atom_eks.certificate_authority[0].data
CLUSTER_CIDR = data.aws_eks_cluster.atom_eks.kubernetes_network_config[0].service_ipv4_cidr
}))
metadata_options {
http_endpoint = "enabled"
http_tokens = "required"
http_put_response_hop_limit = 1
instance_metadata_tags = "enabled"
}
vpc_security_group_ids = [var.atom_security_group_id]
}
i4i_nvme_nodeadm.userdata_raid0.tftpl
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: application/node.eks.aws

---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: ${CLUSTER_NAME}
    apiServerEndpoint: ${API_SERVER_ENDPOINT}
    certificateAuthority: ${CA_AUTHORITY_B64}
    cidr: ${CLUSTER_CIDR}
  instance:
    localStorage:
      strategy: RAID0

--BOUNDARY--
There is a comment on a similar, older issue that is also worth referencing: https://github.com/awslabs/amazon-eks-ami/issues/2122#issuecomment-2904174482
Environment:
- AWS Region: us-west-2
- Instance Type(s): i4i.xlarge
- Cluster Kubernetes version: 1.31
- Node Kubernetes version: 1.31
- AMI Version: 1.31.7-20250610
Unfortunately I wasn't able to repro using these assets with a 1.31 cluster I created. Nodes landed on amazon-eks-node-al2023-x86_64-standard-1.31-v20250620 and were still able to join the cluster. I could be doing something wrong, but I'm not seeing a bug in the actual code path that might explain any regression.
> get stuck in the Creating state indefinitely
Are you able to get nodeadm's logs (journalctl -u nodeadm-config -u nodeadm-run) from any of the nodes that get stuck?

I'm able to see that the disk setup itself was successful.
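(For anyone reproducing this: a minimal way to pull those logs, assuming you can open a shell on the stuck node via SSM or SSH, is something like the following.)

# Run on the stuck node; dumps both nodeadm phases mentioned above
sudo journalctl -u nodeadm-config -u nodeadm-run --no-pager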
Disk Setup Logs
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2152]: {"level":"info","ts":1755710383.698419,"caller":"init/init.go:114","msg":"Setting up system aspects..."}
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2152]: {"level":"info","ts":1755710383.698438,"caller":"init/init.go:117","msg":"Setting up system aspect..","name":"local-disk"}
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2179]: mdadm: chunk size defaults to 512K
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2179]: mdadm: Defaulting to version 1.2 metadata
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2179]: mdadm: array /dev/md/kubernetes started.
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: meta-data=/dev/md/kubernetes isize=512 agcount=32, agsize=7147776 blks
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: = sectsz=512 attr=2, projid32bit=1
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: = crc=1 finobt=1, sparse=1, rmapbt=0
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: = reflink=1 bigtime=1 inobtcount=1
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: data = bsize=4096 blocks=228726656, imaxpct=25
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: = sunit=128 swidth=128 blks
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: log =internal log bsize=4096 blocks=111688, version=2
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: = sectsz=512 sunit=8 blks, lazy-count=1
Aug 20 17:19:43 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2196]: realtime =none extsz=4096 blocks=0, rtextents=0
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal systemctl[2205]: Created symlink /etc/systemd/system/multi-user.target.wants/mnt-k8s\x2ddisks-0.mount → /etc/systemd/system/mnt-k8s\x2ddisks-0.mount.
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2171]: Copying /var/lib/kubelet/ to /mnt/k8s-disks/0/kubelet/
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal systemctl[2259]: Created symlink /etc/systemd/system/multi-user.target.wants/var-lib-kubelet.mount → /etc/systemd/system/var-lib-kubelet.mount.
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2171]: Copying /var/lib/containerd/ to /mnt/k8s-disks/0/containerd/
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal systemctl[2291]: Created symlink /etc/systemd/system/multi-user.target.wants/var-lib-containerd.mount → /etc/systemd/system/var-lib-containerd.mount.
Aug 20 17:19:44 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2171]: Copying /var/log/pods/ to /mnt/k8s-disks/0/pods/
Aug 20 17:19:45 ip-192-168-158-88.us-west-2.compute.internal systemctl[2331]: Created symlink /etc/systemd/system/multi-user.target.wants/var-log-pods.mount → /etc/systemd/system/var-log-pods.mount.
Aug 20 17:19:45 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2171]: Successfully setup RAID-0 consisting of /dev/nvme1n1
Aug 20 17:19:45 ip-192-168-158-88.us-west-2.compute.internal nodeadm[2152]: {"level":"info","ts":1755710385.2556493,"caller":"init/init.go:121","msg":"Set up system aspect","name":"local-disk"}