Support Nitro Enclaves for storing ACM certificates used by Nginx/Apache
What I'd like: AWS EC2 instances built on Nitro support a feature called "Nitro Enclaves", which greatly enhances security for cryptographic operations by storing private keys so that they are not accessible to ordinary processes running on the host.
Most importantly, it has an integration with AWS Certificate Manager that allows you to deploy ACM-managed certificates onto the instances for on-host TLS termination (see https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave-refapp.html)
This is a game-changing feature for anyone who uses Nginx/Apache Httpd to terminate TLS on their hosts, both for security (since there are no private keys on the file system) and for operations (since admins no longer need to deploy the private key to the system). It would be great if Bottlerocket could support this feature, as it would greatly simplify TLS certificate management for people who use these servers.
Software Developers may want to take advantage of Nitro Enclaves as well, but I suspect that most Bottlerocket customers will primarily be interested in the ACM integration.
Any alternatives you've considered: None
Thanks for cutting this issue @thedevopsmachine! This does seem like a great thing to look into.
After looking at this a bit, it looks like the https://github.com/aws/aws-nitro-enclaves-cli (or something that does the same thing) would be required to manage the enclaves in the OS. There are some bits we might need to figure out like https://github.com/aws/aws-nitro-enclaves-cli/blob/main/bootstrap/nitro-enclaves-allocator which uses shell. Nonetheless, getting the enclave management bits working in Bottlerocket would enable users to use something like https://github.com/aws/aws-nitro-enclaves-with-k8s to leverage them as well. There is a bit of design and engineering to be done to get the nitro-cli working in Bottlerocket.
We don't have this on our roadmap right now but we'll keep this as a feature request. Thanks again for cutting this issue.
We use Bottlerocket on EKS and are looking for Nitro Enclave support specifically for the Nitro Enclaves with K8s functionality. Our use case is not TLS termination, but we'd love to see support for Nitro Enclaves in Bottlerocket.
I did a bit more investigation into this issue. Here is what I found: the first problem preventing aws-nitro-enclaves-cli from being easily installed on Bottlerocket is nitro-enclaves-allocator, which is typically run via a systemd service. It is a fairly complex bash script and will not work on Bottlerocket because there is no shell. We have two options to work around this problem:
- Rewrite this script in Rust (much of the rest of the code is in Rust, so this might be an aspirational goal for the project). Ideally this would be upstreamed into the nitro enclaves repo so that all other consumers could use it instead of the shell script. If it isn't accepted upstream, Bottlerocket could maintain it as part of its codebase.
- Find a way to run this service as a host container or DaemonSet. The primary problem is that this tooling needs access to the `/dev/nitro_enclaves` device as well as `rw` access to `/sys/module/nitro_enclaves/` and potentially other paths. This means making changes to the podspec. I got this working in a hacky way by sidestepping the normal `/sys` path, which is `ro` by default: I added a `/hostsys` path as `rw` and changed the code in the shell script to use that path. This only sort of worked and needs a bit more experimentation. Access to the device might involve a bit more work to make sure it's done correctly; I need to learn more about whether providing it to all containers would be suitable or not. In theory, you might be able to fully customize the podspec to work around all these problems with `privileged: true`, but that may not be desirable in a production environment.
Either way when working around this, the https://github.com/aws/aws-nitro-enclaves-k8s-device-plugin doesn't seem to pick this up so there is probably a bit more work to dive into this code and figure out how to get it working to enable an "out of the box" working experience in EKS.
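To make the device/sysfs requirements above concrete, a DaemonSet along the following lines could grant the allocator what it needs. This is a hypothetical sketch, not a vetted manifest: the image name is a placeholder, and `privileged: true` is the blunt workaround mentioned above, not a production recommendation.

```shell
# Hypothetical DaemonSet granting access to /dev/nitro_enclaves and a
# writable view of the host /sys (mounted at /hostsys, as in the hack above).
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nitro-enclaves-allocator
spec:
  selector:
    matchLabels: {app: nitro-enclaves-allocator}
  template:
    metadata:
      labels: {app: nitro-enclaves-allocator}
    spec:
      containers:
      - name: allocator
        image: <your-allocator-image>   # placeholder: image with aws-nitro-enclaves-cli
        securityContext:
          privileged: true              # sidesteps the read-only /sys problem
        volumeMounts:
        - {name: dev, mountPath: /dev}
        - {name: hostsys, mountPath: /hostsys}  # rw host /sys, as used in the hack
      volumes:
      - {name: dev, hostPath: {path: /dev}}
      - {name: hostsys, hostPath: {path: /sys}}
EOF
```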
I got it to work with a few hacks:
- From the admin container (yes, too much privilege), I ran `/usr/bin/nitro-enclaves-allocator`
- After the hugepages are configured, one must restart the kubelet, otherwise it won't pick up the applied configuration (see this)
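The two hacks above amount to roughly the following host-level commands (a sketch, assuming `sheltie` is used from the admin container to enter the host's root shell):

```shell
# From the admin container, enter the host namespaces.
sudo sheltie
# Reserve hugepages and the enclave CPU pool.
/usr/bin/nitro-enclaves-allocator
# The kubelet only reads hugepage capacity at startup, so restart it.
systemctl restart kubelet
```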
bash-4.2# /home/run.sh
Start allocating memory...
Started enclave with enclave-cid: 17, memory: 128 MiB, cpu-ids: [1, 5]
{
"EnclaveName": "hello",
"EnclaveID": "XXXXXXXXX",
"ProcessID": 17,
"EnclaveCID": 17,
"NumberOfCPUs": 2,
"CPUIDs": [
1,
5
],
"MemoryMiB": 128
}
-------------------------------
Enclave ID is XXXXXXX
-------------------------------
[ 1] Hello from the enclave side!
So it works in Bottlerocket :tada: ! However, the experience isn't great.
We could either allow bootstrap-containers to modify hugepages/CPU pools (they might be missing some capabilities, but they already have access to all the devices on the host), so that the host is configured on boot. (I don't think allowing the hugepages/CPU pools to change at runtime would be safe if there are already workloads using them.) Or, we could provide a sub-command like `apiclient setup-enclaves <blah>`, similar to the command that will be added here. With this, the experience would be similar to what EKS provides today (see this), where the enclaves are configured on boot before the kubelet runs, and the workloads just work after they are deployed.
One note: you have to install `file` as well as the Nitro Enclaves CLI to get it working in the admin container:
amazon-linux-extras install aws-nitro-enclaves-cli file -y
Otherwise you see the error:
[root@admin]# /usr/bin/nitro-enclaves-allocator
Auto-generating the enclave CPU pool by using the CPU count...
Will try to reserve 768 MB of memory on node 0.
Configuring the huge page memory...
/usr/bin/nitro-enclaves-allocator: line 188: file: command not found
Error: Failed to find NUMA node for a CPU. This indicates an invalid SysFS configuration.
Hi @arnaldo2792 and @yeazelm, hope you're doing well. I'm new to Bottlerocket and want to explore how to integrate the Nitro Enclave functionality into Bottlerocket OS. I have a few questions:
- Is it officially supported in Bottlerocket OS? I can't find any references on how to use it right now.
- In @arnaldo2792's reply, it looks like it's still possible to add Nitro Enclaves to Bottlerocket with some hacks; I'd like to understand the inner magic here.
- How do you turn on the "Enclaves Support" feature on an EKS node? I can't find any reference or related configuration in the eksctl nodeGroup configuration to specify this option.
- How do you set up this magical Bottlerocket node? Do we need to add the user data mentioned in Using Nitro Enclaves with Amazon EKS, or do we set up the Bottlerocket node with "Enclaves Support" enabled first, then go to the admin container and install the nitro-enclaves-allocator service manually?
- Following on from the above two questions, how can we verify that the node can execute a Nitro Enclave program and that other pods on the Bottlerocket node can communicate with it?
Hi,
At the moment, we don’t have official support, but we know that with some workarounds you can get enclaves working.
I was able to follow the instructions here and get it working.
This method has a few caveats:
- It uses superpowered host-containers which have too many privileges
- It does not persist across reboots; the host-container used to set up nitro-enclaves would need to be re-enabled manually
I have listed out the steps taken in detail to run a container in an enclave on an instance launched with Nitro Enclaves enabled (either via the CLI or in the launch template).
Step 1
Enabling hugepages on the instance
[ssm-user@control]$ apiclient get settings.kernel
{
"settings": {
"kernel": {
"lockdown": "integrity",
"sysctl": {
"vm/nr_hugepages": "3520"
}
}
}
}
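For reference, `vm/nr_hugepages` counts hugepages, which are 2 MiB each on x86_64 (assuming the default hugepage size), so the value above reserves:

```shell
# 3520 hugepages x 2 MiB per page = memory reserved for enclave use
pages=3520
page_size_mib=2
echo "$((pages * page_size_mib)) MiB"   # prints "7040 MiB"
```

The sysctl itself can be applied through `apiclient set`; note the key contains a slash, so it needs quoting (check Bottlerocket's settings documentation for the exact syntax).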
Step 2
Deploy a host container to enable nitro-enclaves during boot
[ssm-user@control]$ apiclient get settings.host-containers.nitro-enclave-setup
{
"settings": {
"host-containers": {
"nitro-enclave-setup": {
"enabled": true,
"source": "<ecr_url>/nitro-enclave-setup:v1",
"superpowered": true
}
}
}
}
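The settings above can be applied in one shot with `apiclient set`, using the same key=value form this run.sh already uses below; `<ecr_url>` remains a placeholder you must fill in:

```shell
# Enable a superpowered host container that runs the allocator on boot.
apiclient set \
  settings.host-containers.nitro-enclave-setup.enabled=true \
  settings.host-containers.nitro-enclave-setup.source="<ecr_url>/nitro-enclave-setup:v1" \
  settings.host-containers.nitro-enclave-setup.superpowered=true
```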
The host-container image can be built like below:
Dockerfile
FROM public.ecr.aws/amazonlinux/amazonlinux:2023 AS base
RUN yum clean metadata
RUN yum install aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel file -y
COPY ./run.sh /
RUN chmod +x ./run.sh
ENTRYPOINT ["./run.sh"]
run.sh
#! /usr/bin/bash
/usr/bin/nitro-enclaves-allocator
# This is needed to prevent host-ctr from constantly restarting the container
apiclient set settings.host-containers.nitro-enclave-setup.enabled=false
Step 3
Follow Step 3 from this guide to set up the Kubernetes nitro-device-plugin and label the nodes. With this, the node should be set up for Nitro Enclaves.
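For reference, that step amounts to something like the following; the manifest path and label key here are taken from my reading of AWS's EKS enclaves documentation, so double-check them against the current guide before running:

```shell
# Deploy the Nitro Enclaves device plugin DaemonSet (URL assumed; verify
# against the aws-nitro-enclaves-k8s-device-plugin repo).
kubectl apply -f https://raw.githubusercontent.com/aws/aws-nitro-enclaves-k8s-device-plugin/main/aws-nitro-enclaves-k8s-ds.yaml
# Label the node so the plugin schedules onto it (label key assumed).
kubectl label node <node-name> aws-nitro-enclaves-k8s-dp=enabled
```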
Tested with the hello container here.
Result:
$ kubectl logs hello-deployment-687559484c-qpv4k
Start allocating memory...
Started enclave with enclave-cid: 16, memory: 128 MiB, cpu-ids: [1, 5]
{
"EnclaveName": "hello",
"EnclaveID": "i-0424e2e297fd8a897-enc1939e5df03cc7dd",
"ProcessID": 9,
"EnclaveCID": 16,
"NumberOfCPUs": 2,
"CPUIDs": [
1,
5
],
"MemoryMiB": 128
}
-------------------------------
Enclave ID is i-0424e2e297fd8a897-enc1939e5df03cc7dd
-------------------------------
Connecting to the console for enclave 16...
.
.
.
[ 0.183470] NSM RNG: returning rand bytes = 64
[ 1] Hello from the enclave side!
[ 2] Hello from the enclave side!
[ 3] Hello from the enclave side!
I hope this helps. I am going to take it as a feature request to enable this in a simpler manner (like via an apiclient command) as suggested earlier.
I tried enabling Enclaves on EKS Auto Mode by doing the following:
- I created a custom node pool, disabled the built-in node pools, and set the limits for my custom node pool to zero so it wouldn’t replace the instance after enabling hugepages and rebooting the instance (you can’t restart the kubelet service on an auto mode node).
After doing that, the enclave pod got scheduled onto the node (the image was pulled) but wouldn’t start.
- I ran an instance of the admin container, installed the enclave CLI and ran the nitro-enclave-allocator but the container still fails to start.
I see the following in the kubelet logs:
May 20 01:31:29 ip-192-168-141-245.us-west-2.compute.internal kubelet[1320]: E0520 01:31:29.885692 1320 kuberuntime_manager.go:1258] container &Container{Name:hello-container,Image:820537372947.dkr.ecr.us-west-2.amazonaws.com/hello-66b4156c-641c-4620-bb01-097229ddc7ea,Command:[/home/run.sh],Args:[],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{aws.ec2.nitro/nitro_enclaves: {{1 0} {<nil>} 1 DecimalSI},cpu: {{250 -3} {<nil>} 250m DecimalSI},hugepages-2Mi: {{536870912 0} {<nil>} BinarySI},},Requests:ResourceList{aws.ec2.nitro/nitro_enclaves: {{1 0} {<nil>} 1 DecimalSI},cpu: {{250 -3} {<nil>} 250m DecimalSI},hugepages-2Mi: {{536870912 0} {<nil>} BinarySI},},Claims:[]ResourceClaim{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:hugepage,ReadOnly:false,MountPath:/dev/hugepages,SubPath:,MountPropagation:nil,SubPathExpr:,RecursiveReadOnly:nil,},VolumeMount{Name:kube-api-access-nhr4z,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,RecursiveReadOnly:nil,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:Always,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,ResizePolicy:[]ContainerResizePolicy{},RestartPolicy:nil,} start failed in pod hello-deployment-6488ffb989-cqwvc_default(484ccc73-b8fb-4c80-b0da-b95647133f76): CreateContainerError: failed to generate container "36b3044b5427272d17ba137994798bb9c8def2ee7be77621167ff020ecfaefc6" spec: failed to apply OCI options: lstat /dev/nitro_enclaves: no such file or directory
May 20 01:31:29 ip-192-168-141-245.us-west-2.compute.internal kubelet[1320]: E0520 01:31:29.885722 1320 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"hello-container\" with CreateContainerError: \"failed to generate container \\\"36b3044b5427272d17ba137994798bb9c8def2ee7be77621167ff020ecfaefc6\\\" spec: failed to apply OCI options: lstat /dev/nitro_enclaves: no such file or directory\"" pod="default/hello-deployment-6488ffb989-cqwvc" podUID="484ccc73-b8fb-4c80-b0da-b95647133f76"
Looking at the host file system, there is no /dev/nitro_enclaves device node. The k8s nitro daemonset is installed and is in a "running" state (no errors).
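A couple of quick host-side checks (run via `sheltie` from the admin container) can help narrow down whether the kernel driver itself is missing or just the device node:

```shell
# Is the nitro_enclaves kernel module loaded on the host?
lsmod | grep nitro_enclaves
# Does the character device exist? (It is a device node, not a directory.)
ls -l /dev/nitro_enclaves
```

If the module is absent, no amount of podspec work will surface the device to containers.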