operator-sdk 1.20.0 breaks k8s_status in FIPS-enabled OpenShift cluster
Bug Report
What did you do?
I have an Ansible operator image based on quay.io/operator-framework/ansible-operator:v1.19.1 which adds the kubernetes.core:2.3.0 and operator_sdk.util:0.4.0 collections in requirements.yaml. One of the playbook tasks sets the status of a CR like so:
- name: Set status to {{ status }} for {{ ansible_operator_meta.name }} in {{ ansible_operator_meta.namespace }}
  k8s_status:
    api_version: "acme.com/v1beta1"
    kind: AcmeThing
    name: "{{ ansible_operator_meta.name }}"
    namespace: "{{ ansible_operator_meta.namespace }}"
    status:
      acmeStatus: "{{ status }}"
      acmeVersion: "{{ version | default(omit) }}"
  register: set_cr_status
  retries: 3
  delay: 5
  until: set_cr_status is not failed
This works just fine on my FIPS-enabled OCP 4.8 cluster.
What did you expect to see?
I expected the task to continue to work after changing the base image to ansible-operator:v1.20.0.
What did you see instead? Under which circumstances?
When I change the base image to ansible-operator:v1.20.0, the k8s_status task fails:
fatal: [localhost]: FAILED! => {"attempts": 3, "changed": false, "error": "[digital envelope routines: EVP_DigestInit_ex] disabled for FIPS", "msg": "Failed to get client due to %s"}
Environment
Operator type:
/language ansible
Kubernetes cluster type:
OpenShift 4.8.39
$ operator-sdk version
operator-sdk version: "v1.20.0", commit: "deb3531ae20a5805b7ee30b71f13792b80bd49b1", kubernetes version: "1.23", go version: "go1.17.9", GOOS: "linux", GOARCH: "amd64"
$ go version (if language is Go)
$ kubectl version
$ oc version
Client Version: 4.8.36
Server Version: 4.8.39
Kubernetes Version: v1.21.8+ed4d8fd
Possible Solution
The problem seems to be related to the use of MD5 hashes, which are restricted in FIPS mode. Compare https://github.com/s3tools/s3cmd/issues/1005#issuecomment-578241131.
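The restriction can be checked from inside the operator image with a small probe. This is a minimal sketch, not part of the original report: the function name `md5_is_blocked` is hypothetical, and it assumes the interpreter reports the blocked digest as a `ValueError`, which matches the error text in the failure above but may differ on other Python builds.

```python
import hashlib


def md5_is_blocked() -> bool:
    """Return True if a plain hashlib.md5() call is refused.

    Assumption: a FIPS-enforcing OpenSSL surfaces the refusal as a
    ValueError (for example "... disabled for FIPS").
    """
    try:
        hashlib.md5(b"probe")
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    print("md5 blocked:", md5_is_blocked())
```

Running this in both the v1.19.1 and v1.20.0 based images would show whether the two images differ in how they enforce FIPS, independent of the kubernetes package.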
Additional context
I patched my operator to run with ANSIBLE_VERBOSITY=3 and was able to gather the stack trace:
The full traceback is:
  File "/tmp/ansible_k8s_status_payload_bi0wnjm8/ansible_k8s_status_payload.zip/ansible_collections/operator_sdk/util/plugins/module_utils/api_utils.py", line 86, in get_api_client
    client = DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/local/lib/python3.8/site-packages/openshift/dynamic/client.py", line 40, in __init__
    K8sDynamicClient.__init__(self, client, cache_file=cache_file, discoverer=discoverer)
  File "/usr/local/lib/python3.8/site-packages/kubernetes/dynamic/client.py", line 84, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/local/lib/python3.8/site-packages/kubernetes/dynamic/discovery.py", line 224, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/local/lib/python3.8/site-packages/kubernetes/dynamic/discovery.py", line 48, in __init__
    default_cachefile_name = 'osrcp-{0}.json'.format(hashlib.md5(default_cache_id).hexdigest())
fatal: [localhost]: FAILED! => {
    "attempts": 3,
    "changed": false,
    "error": "[digital envelope routines: EVP_DigestInit_ex] disabled for FIPS",
Comparing the pip freeze output for ansible-operator:v1.19.1 and ansible-operator:v1.20.0, the kubernetes package version changed from 12.0.1 to 23.3.0. However, both versions seem to contain the same code:
$ grep md5 /usr/local/lib/python3.8/site-packages/kubernetes/dynamic/discovery.py
default_cachefile_name = 'osrcp-{0}.json'.format(hashlib.md5(default_cache_id).hexdigest())
When I patch discovery.py in my operator's Dockerfile, it works:
&& ansible-galaxy collection install -r ${HOME}/requirements.yml \
&& site_packages=/usr/local/lib/python3.8/site-packages \
&& sed -i -e 's/hashlib.md5(default_cache_id)/hashlib.md5(default_cache_id, usedforsecurity=False)/' ${site_packages}/kubernetes/dynamic/discovery.py \
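The sed expression above rewrites a single call in discovery.py. The sketch below shows the same idea as plain Python, with a fallback for interpreters whose `hashlib.md5` does not accept the `usedforsecurity` keyword (upstream CPython added it in 3.9; whether a particular distribution's 3.8 build accepts it is an assumption to verify in the image). The function name `cache_file_name` is hypothetical, and this is not claimed to be the exact change made upstream.

```python
import hashlib


def cache_file_name(cache_id: bytes) -> str:
    """Build the discovery cache file name from a cache id.

    The MD5 digest only names a cache file, so it is flagged as
    not used for security where the interpreter allows that.
    """
    try:
        digest = hashlib.md5(cache_id, usedforsecurity=False)
    except TypeError:
        # Interpreter without the usedforsecurity keyword: fall back to
        # the plain call, which may still fail under FIPS.
        digest = hashlib.md5(cache_id)
    return "osrcp-{0}.json".format(digest.hexdigest())


if __name__ == "__main__":
    # Hypothetical cache id; the real code derives it from the API host.
    print(cache_file_name(b"https://api.example.invalid:6443"))
```

The file name stays identical to the one produced by the unpatched code, so an existing cache file would still be found after the patch.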
While it's still not clear to me which of the Python package updates from 1.19.1 to 1.20.0 caused this, I think the proper fix involves two steps:
- Update package kubernetes (tracked through https://github.com/kubernetes-client/python/issues/1851)
  - [x] https://github.com/kubernetes-client/python/pull/1854
  - [x] waiting for next release: https://github.com/kubernetes-client/python/releases/tag/v25.3.0
- Pull the updated package into operator-sdk (can be tracked through the subject issue)
  - [x] https://github.com/operator-framework/operator-sdk/releases/tag/v1.26.0
The source code appears to be here: https://github.com/kubernetes-client/python/blob/2677e9c810b62a82e75e65d07e502d49ec74a551/kubernetes/base/dynamic/discovery.py#L48
I had observed a FIPS issue with the Python openshift package, version 0.13.1:
https://github.com/openshift/openshift-restclient-python/issues/427#issuecomment-1103702707
It looks like the Ansible operator now uses openshift version 0.13.1:
https://github.com/operator-framework/operator-sdk/commit/9bb14cc42e1bf1e3d769961f7ecb8b4c67012523
https://github.com/operator-framework/operator-sdk/blob/master/images/ansible-operator/Pipfile.lock#L265
With https://github.com/kubernetes-client/python/releases/tag/v25.3.0 released, the above patch in the operator's Dockerfile can be changed to:
&& pip3 install --no-cache-dir kubernetes~=25.3.0 \
&& ansible-galaxy collection install -r ${HOME}/requirements.yml \
@efussi
Thank you. I did a quick test by installing openshift==0.13.1, and it pulled in kubernetes==25.3.0 as a dependency, which has the fix you committed.
https://github.com/operator-framework/operator-sdk/releases/tag/v1.26.0 contains kubernetes 25.3.0 which has the fix.
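To confirm that a given operator image actually carries the fixed package, the installed version can be compared against 25.3.0. This is a rough sketch under assumptions: the helper names are hypothetical, the version parsing ignores pre-release suffixes, and it takes from this thread that 25.3.0 is the first kubernetes release with the fix and that later releases keep it.

```python
import re
from importlib import metadata

# First kubernetes release reported in this thread to contain the fix.
FIXED_VERSION = (25, 3, 0)


def parse_version(text: str) -> tuple:
    """Turn '25.3.0' into (25, 3, 0), ignoring any non-numeric suffix."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def has_fips_fix(version: str) -> bool:
    """Return True if the version is at or above the fixed release."""
    return parse_version(version) >= FIXED_VERSION


def installed_kubernetes_has_fix() -> bool:
    """Check the kubernetes package installed in this environment."""
    try:
        return has_fips_fix(metadata.version("kubernetes"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("kubernetes has FIPS fix:", installed_kubernetes_has_fix())
```

Run inside the image, this gives a quick yes/no answer without having to grep discovery.py.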