
Pods not being recreated after ConfigMap/Secret checksum change

Open NidhalRouissi opened this issue 6 years ago • 10 comments

Describe the bug

The Helm checksum annotation doesn't seem to work when using Flux and the Helm Operator. In a HelmRelease, when values linked to a ConfigMap/Secret change, the pod is not recreated even though the checksum has changed.

To Reproduce

Steps to reproduce the behavior:

  1. Install Flux and the Helm Operator (configure Flux with a Git repo)
  2. In the Git repo, create a HelmRelease to install a Helm chart (the chart must have a checksum annotation)
  3. Change a ConfigMap/Secret value through the HelmRelease values, and synchronize Flux

Expected behavior

The pod should be recreated after the change of the checksum value in the deployment resource.

Logs

ts=2020-04-06T15:32:42.221995545Z caller=release.go:342 component=release release=mysql-server targetNamespace=mysql-server resource=mysql-server:helmrelease/mysql-server helmVersion=v2 info="performing dry-run upgrade to see if release has diverged"
ts=2020-04-06T15:32:42.595443923Z caller=release.go:377 component=release release=mysql-server targetNamespace=mysql-server resource=mysql-server:helmrelease/mysql-server helmVersion=v2 info="no changes" action=skip

Additional context

  • Helm Operator version: 1.0.0-rc8
  • Flux version: 1.17.1
  • Helm version: v2.14.3
  • Kubernetes version: v1.14.1

NidhalRouissi avatar Apr 06 '20 15:04 NidhalRouissi

I really need this in my project

gitsto avatar Apr 17 '20 13:04 gitsto

Does this only happen for charts sourced from a Git source, or for any .spec.values change to a HelmRelease making use of a Git or Helm repository source?

hiddeco avatar Apr 20 '20 13:04 hiddeco

Does this only happen for charts sourced from a Git source, or for any .spec.values change to a HelmRelease making use of a Git or Helm repository source?

Neither, actually! Our Git repository holds only HelmRelease files and Namespaces. Our charts are hosted in a Helm repository.

Here is an example:

mysql_hr.yaml (HelmRelease, hosted on git)

apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: mysql-server
  namespace: mysql-server
  annotations:
    fluxcd.io/automated: "false"
spec:
  releaseName: mysql-server
  forceUpgrade: true
  chart:
    repository: https://[HELM_REPO_URL]/
    name: mysql-server
    version: 1.2.20200330051700
  values:
    mysqlRootPassword: FooBarPwd
    mysqlPassword: FooBarPwd

values.yaml

...
mysqlRootPassword: ""
mysqlPassword: ""
...

secret.yaml

{{- if not .Values.existingSecret }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ template "mysql.fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    app: {{ template "mysql.fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
type: Opaque
data:
  {{ if .Values.mysqlRootPassword }}
  mysql-root-password:  {{ .Values.mysqlRootPassword | b64enc | quote }}
  {{ else }}
  mysql-root-password: {{ randAlphaNum 10 | b64enc | quote }}
  {{ end }}
  {{ if .Values.mysqlPassword }}
  mysql-password:  {{ .Values.mysqlPassword | b64enc | quote }}
  {{ else }}
  mysql-password: {{ randAlphaNum 10 | b64enc | quote }}
  {{ end }}
{{- if .Values.ssl.enabled }}
{{ if .Values.ssl.certificates }}
{{- range .Values.ssl.certificates }}

deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{.Release.Name }}-mysql-server
  namespace: {{ .Release.Namespace }}
  labels:
    app: {{ template "mysql.fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
  annotations:
    checksum/config1: {{ include (print $.Template.BasePath "/configurationFiles-configmap.yaml") . | sha256sum }}
    checksum/config2: {{ include (print $.Template.BasePath "/docker-config.yaml") . | sha256sum }}
    checksum/config3: {{ include (print $.Template.BasePath "/initializationFiles-configmap.yaml") . | sha256sum }}
    checksum/config4: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}
spec:
...

In this example, if I change the value of one of the passwords in the HelmRelease, the checksum of the secret should change and force the pod to be recreated. But it looks like the Helm Operator doesn't even perform an upgrade.

Any ideas?
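One way to confirm that the checksum really does change when a value changes is to render the secret template locally with two different passwords and hash each result, which mimics what the sha256sum in the deployment annotation computes. This is a sketch only: it assumes a local copy of the chart and Helm 3's -s/--show-only flag (on Helm 2, the rough equivalent is -x/--execute).

```shell
# Render templates/secret.yaml with two different password values and hash
# the output; differing hashes mean the checksum annotation would change
# on a real upgrade. Chart path "./mysql-server" is an assumption.
helm template mysql-server ./mysql-server --set mysqlPassword=FooBarPwd \
  -s templates/secret.yaml | sha256sum
helm template mysql-server ./mysql-server --set mysqlPassword=OtherPwd \
  -s templates/secret.yaml | sha256sum
```

If the two hashes differ, the checksum/config4 annotation would differ too, so a genuine upgrade should be detected as a change.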

NidhalRouissi avatar Apr 20 '20 16:04 NidhalRouissi

@NidhalRouissi are you by any chance aware of a lightweight publicly available Helm chart with a likewise structure that I could use to replicate this?

hiddeco avatar Apr 20 '20 16:04 hiddeco

@NidhalRouissi are you by any chance aware of a lightweight publicly available Helm chart with a likewise structure that I could use to replicate this?

@hiddeco sorry for the long delay.

I did find a simple, lightweight Helm chart: Unbound, a caching DNS resolver.

This is how to reproduce:

Let's suppose you install the chart via Helm: helm upgrade --install unbound --version 1.1.2 stable/unbound --namespace unbound

Then check that the pod is running: kubectl -n unbound get pods

If you run an upgrade with the same parameters, nothing changes and the pods keep running. But if you change a certain value: helm upgrade --install unbound --version 1.1.2 stable/unbound --namespace unbound --set unbound.numThreads=2

This will trigger a pod recreation, because that value is linked to a ConfigMap and there is a checksum annotation over that ConfigMap in the deployment manifest: https://github.com/helm/charts/blob/master/stable/unbound/templates/deployment.yaml#L28-L29

annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}

But if you change that same value directly via kubectl, for example, nothing will happen: kubectl -n unbound edit cm unbound-unbound

I am assuming that the Helm Operator is not running a helm upgrade whenever I change a value in my HelmRelease manifest, which is why my pods are not recreated. If that's the case, the Helm Operator is breaking a natively supported Helm feature, I guess.
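To reproduce the same scenario through the Helm Operator rather than plain Helm, a HelmRelease along these lines should work. This is a hypothetical sketch: the repository URL (the relocated stable charts repo) and chart version are assumptions to be adjusted.

```yaml
# Hypothetical HelmRelease mirroring the unbound repro above; changing
# unbound.numThreads here and letting Flux sync should have the same
# effect as the helm upgrade --set command.
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: unbound
  namespace: unbound
spec:
  releaseName: unbound
  chart:
    repository: https://charts.helm.sh/stable/
    name: unbound
    version: 1.1.2
  values:
    unbound:
      numThreads: 2   # change this value and sync to test pod recreation
```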

NidhalRouissi avatar May 18 '20 15:05 NidhalRouissi

@hiddeco were you able to reproduce the issue? Please let me know if you need more details.

NidhalRouissi avatar Jun 12 '20 13:06 NidhalRouissi

@hiddeco @stefanprodan Could you please give us an update on this issue? Were you able to reproduce it? Any idea about the root cause, or a workaround while waiting for a fix? Thanks a lot

kammous avatar Aug 05 '20 16:08 kammous

Not sure if this is the point here, but putting the annotation on the Deployment itself (instead of in the pod template of the Deployment) will not cause pods to be recreated. The proper place for the checksum annotations can be seen here: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments

Might be completely missing the point but it caught my eye when browsing past...
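A sketch of the placement difference described above, for a standard Deployment template (field names only; the rest of the manifest is elided):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    # WRONG place: this annotation is on the Deployment object itself.
    # Changing it does not alter the pod template, so no rollout occurs.
    checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
spec:
  template:
    metadata:
      annotations:
        # RIGHT place: a changed checksum here modifies the pod template,
        # which triggers a rolling restart of the pods.
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```

Note that the deployment.yaml pasted earlier in this thread puts the checksum annotations under the Deployment's top-level metadata, which matches the broken case here.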

erpel avatar Sep 28 '20 10:09 erpel

I agree with @erpel. I put the example provided by @NidhalRouissi into a HelmRelease file, configured Flux and the Helm Operator to synchronize a Git repo, but wasn't able to reproduce the problem. It works as expected: each time a ConfigMap changes, the pods are restarted. @hiddeco @stefanprodan, this issue can be closed. Thanks

kammous avatar Sep 30 '20 09:09 kammous

I am facing the same issue with statefulsets. Any help would be appreciated.

mraslam avatar Apr 11 '22 17:04 mraslam

Sorry if your issue remains unresolved. The Helm Operator is in maintenance mode, we recommend everybody upgrades to Flux v2 and Helm Controller.

A new release of Helm Operator is out this week, 1.4.4.

We will continue to support Helm Operator in maintenance mode for an indefinite period of time, and eventually archive this repository.

Please be aware that Flux v2 has a vibrant and active developer community who are actively working through minor releases and delivering new features on the way to General Availability for Flux v2.

In the meantime, this repo will still be monitored, but support is basically limited to migration issues only. I will have to close many issues today without reading them all in detail because of time constraints. If your issue is very important, you are welcome to reopen it, but given the staleness of all issues at this point, a new report is more likely to be in order. Please open another issue in the appropriate Flux v2 repo if you have unresolved problems that prevent your migration.

Helm Operator releases will continue as possible for a limited time, as a courtesy for those who cannot migrate yet, but they are strongly discouraged for ongoing production use: our strict adherence to semver backward-compatibility guarantees pins many dependencies, and we can only upgrade them so far without breaking compatibility. So there are likely known CVEs that cannot be resolved.

We recommend upgrading ASAP to Flux v2, which is actively maintained.

I am going to go ahead and close every issue at once today. Thanks for participating in Helm Operator and Flux! 💚 💙

kingdonb avatar Sep 02 '22 19:09 kingdonb