security-profiles-operator

failed to set /proc/self/attr/keycreate

marcredhat opened this issue 4 years ago · 9 comments

What happened:

Running into https://github.com/docker/for-linux/issues/983 on RHEL 8.3 nodes

dnf list --installed | grep container-selinux
container-selinux.noarch   2:2.167.0-1.module_el8.4.0+942+d25aada8   @appstream
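For reference, a couple of node-side sanity checks that may help narrow this down (a minimal sketch; it assumes SELinux is expected to be enabled on the node):

getenforce
rpm -q container-selinux
# the attribute the runtime fails to write to; it is only present when the kernel runs with SELinux enabled
cat /proc/self/attr/keycreate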

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

cat profilerecording.yaml

apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileRecording
metadata:
  name: test-recording
spec:
  kind: SelinuxProfile
  recorder: logs
  podSelector:
    matchLabels:
      app: my-app

cat workload.yaml

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-app
spec:
  containers:
    - name: nginx
      image: quay.io/security-profiles-operator/test-nginx:1.19.1
    - name: redis
      image: quay.io/security-profiles-operator/redis:6.2.1

k describe pod

 Normal   Scheduled                11m                  default-scheduler  Successfully assigned default/my-pod to selinux6
 Normal   SeccompProfileRecording  11m                  profilerecorder    Recording profiles
 Normal   Pulling                  11m                  kubelet            Pulling image "quay.io/security-profiles-operator/test-nginx:1.19.1"
 Normal   Pulling                  11m                  kubelet            Pulling image "quay.io/security-profiles-operator/redis:6.2.1"
 Normal   Pulled                   11m                  kubelet            Successfully pulled image "quay.io/security-profiles-operator/test-nginx:1.19.1" in 4.239996431s
 Normal   Pulled                   11m                  kubelet            Successfully pulled image "quay.io/security-profiles-operator/redis:6.2.1" in 3.835676528s
 Warning  BackOff                  11m (x2 over 11m)    kubelet            Back-off restarting failed container
 Normal   Pulled                   11m (x2 over 11m)    kubelet            Container image "quay.io/security-profiles-operator/test-nginx:1.19.1" already present on machine
 Normal   Created                  11m (x3 over 11m)    kubelet            Created container nginx
 Warning  Failed                   11m (x3 over 11m)    kubelet            Error: failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: failed to set /proc/self/attr/keycreate on procfs: write /proc/self/attr/keycreate: invalid argument: unknown
 Normal   Created                  11m (x3 over 11m)    kubelet            Created container redis
 Warning  Failed                   11m (x3 over 11m)    kubelet            Error: failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: failed to set /proc/self/attr/keycreate on procfs: write /proc/self/attr/keycreate: invalid argument: unknown
 Normal   Pulled                   11m (x2 over 11m)    kubelet            Container image "quay.io/security-profiles-operator/redis:6.2.1" already present on machine
 Warning  BackOff                  112s (x52 over 11m)  kubelet            Back-off restarting failed container

Environment:

Lab set up as detailed at https://gist.github.com/marcredhat/e0a18623c9bf465243eab1871bcd7237

marcredhat avatar Nov 12 '21 18:11 marcredhat

Can you check what SELinux context the pod being recorded has been assigned? Does the context exist on the machines (semodule -l | grep ...)?
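A minimal sketch of how to check both, assuming the recorded policy module would be named after the ProfileRecording above (test-recording) — the real module name may differ:

kubectl get pod my-pod -o jsonpath='{.spec.containers[*].securityContext.seLinuxOptions}'
# on the node where the pod is scheduled
semodule -l | grep test-recording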

jhrozek avatar Nov 14 '21 20:11 jhrozek

Two more things: if the context exists, do the contexts it inherits from (blockinherit) also exist?

Do you have access to any 8.4 or newer based systems? Does it also happen there?

jhrozek avatar Nov 14 '21 20:11 jhrozek

Oh, and if the policy file exists on the machine but is not installed (and therefore not visible through semodule -l), then it would be interesting to check what semodule -i <path/to/module> says.
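A possible way to do that on the node — it assumes the generated policy is written as a CIL file under /etc/selinux.d (the directory selinuxd watches by default), which is itself worth verifying:

# list the generated CIL files and the blocks they inherit from
grep blockinherit /etc/selinux.d/*.cil
# check that each inherited block is actually installed
semodule -l | grep <inherited-block>
# try installing the module by hand to surface any error message
semodule -i /etc/selinux.d/<profile>.cil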

jhrozek avatar Nov 15 '21 21:11 jhrozek

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 13 '22 22:02 k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 15 '22 09:05 k8s-triage-robot

Hello @jhrozek, I just got this issue as well. To answer your questions, I used the errorlogger example from this repo. I do not see it showing up with:

semodule -l | grep errorlogger

Also, I noticed the operator YAML contains a profile for the selinuxd CIL; I don't see that one either.

For your other question: I applied the securityContext manually (the docs say SPO doesn't support profile bindings), so it looks like this for my test pod:

securityContext:
  seLinuxOptions:
    type: errorlogger.process
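One thing that may be worth double-checking: the process type SPO generates is usually namespaced rather than a bare <name>.process. If the SelinuxProfile object exposes the usable type in its status (the status.usage field name here is an assumption), it can be read directly and pasted into seLinuxOptions.type:

kubectl get selinuxprofile errorlogger -o jsonpath='{.status.usage}'
# expected output is something like errorlogger_<namespace>.process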

> Oh and if the policy file exists on the machine

Where does the operator put it? I can check.

I tried looking around the logs for the spod pods and the operator, but I don't see anything regarding SELinux or a failed attempt or anything like that. Let me know if there is any other info I can get to assist with this; we do a lot of seccomp/SELinux work and really want to use this.
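For what it's worth, one way to check where the policy ends up — assuming a default install in the security-profiles-operator namespace and that the spod daemonset pods carry a container named selinuxd watching /etc/selinux.d (both assumptions):

kubectl -n security-profiles-operator get pods
kubectl -n security-profiles-operator exec <spod-pod> -c selinuxd -- ls /etc/selinux.d
# or directly on the node
ls /etc/selinux.d/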

perezjasonr avatar May 27 '22 18:05 perezjasonr

By the way, I should add that I didn't even attempt profile recording; all I did was try to create a profile via the CRD.
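For reference, the profile was created along the lines of the errorlogger example from the repo; the sketch below is from memory, so the apiVersion and the exact allow rules are assumptions and the in-repo example is authoritative:

apiVersion: security-profiles-operator.x-k8s.io/v1alpha2
kind: SelinuxProfile
metadata:
  name: errorlogger
spec:
  allow:
    var_log_t:
      dir: [open, read, getattr, search]
      file: [open, read, write, append, getattr, create]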

perezjasonr avatar Jun 06 '22 13:06 perezjasonr

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jul 06 '22 14:07 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 05 '22 14:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Sep 04 '22 14:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 04 '22 14:09 k8s-ci-robot