Ready status when a single pod can't start
I have found some odd behavior while testing how HelmRelease reports its status.
The following setup should deploy the podinfo and redis Helm charts, both of which should fail because the tag foo does not exist for any of the images.
```yaml
apiVersion: source.toolkit.fluxcd.io/v1alpha1
kind: HelmRepository
metadata:
  name: podinfo
  namespace: gitops-system
spec:
  url: https://stefanprodan.github.io/podinfo
  interval: 10m
---
apiVersion: helm.toolkit.fluxcd.io/v2alpha1
kind: HelmRelease
metadata:
  name: frontend
  namespace: gitops-system
spec:
  targetNamespace: webapp
  interval: 5m
  chart:
    spec:
      chart: podinfo
      version: '>=4.0.0 <5.0.0'
      sourceRef:
        kind: HelmRepository
        name: podinfo
      interval: 1m
  values:
    image:
      tag: foo
---
apiVersion: source.toolkit.fluxcd.io/v1alpha1
kind: HelmRepository
metadata:
  name: stable
  namespace: gitops-system
spec:
  url: https://kubernetes-charts.storage.googleapis.com/
  interval: 10m
---
apiVersion: helm.toolkit.fluxcd.io/v2alpha1
kind: HelmRelease
metadata:
  name: redis
  namespace: gitops-system
spec:
  targetNamespace: webapp
  interval: 5m
  chart:
    spec:
      chart: redis
      sourceRef:
        kind: HelmRepository
        name: stable
      interval: 1m
  values:
    image:
      tag: foo
```
Both result in pods in an ImagePullBackOff state.
```
NAME                                       READY   STATUS             RESTARTS   AGE
webapp-frontend-podinfo-6694fbcbc4-rvjcn   0/1     ImagePullBackOff   0          6m32s
webapp-redis-master-0                      0/1     ImagePullBackOff   0          4m59s
webapp-redis-slave-0                       0/1     ImagePullBackOff   0          4m59s
```
Yet the podinfo HelmRelease ends up in a ready state, while the redis one does not.
```
NAME       READY   STATUS                                                      AGE
frontend   True    release reconciliation succeeded                            7m15s
redis      False   Helm install failed: timed out waiting for the condition    5m45s
```
I would expect both HelmReleases to not be in a ready state.
This is likely due to Helm's own behaviour for the --wait flag when a Deployment has only a single replica: Helm treats a Deployment as ready once readyReplicas >= replicas - maxUnavailable, so with one replica and a maxUnavailable of 1 the check passes even though zero pods are ready. See: https://github.com/helm/helm/issues/5814#issuecomment-567130226
Proposed fix here:
https://github.com/helm/helm/pull/8671
Note: you may be able to work around it by setting maxUnavailable differently (or unsetting it).
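To make the mechanism concrete, here is a minimal, hypothetical Deployment fragment (not taken from either chart) showing where maxUnavailable comes into play; whether a real chart lets you override it through its values is chart-specific:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example   # hypothetical name, not part of the manifests above
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # With replicas: 1 and maxUnavailable: 1, Helm's --wait check expects
      # readyReplicas >= replicas - maxUnavailable = 0, so the release is
      # reported ready even though the single pod never starts.
      # Setting this to 0 (or dropping the explicit value so the Kubernetes
      # default applies) makes Helm wait for at least one ready replica.
      maxUnavailable: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: app
          image: example/app:foo   # any image with a non-existent tag, as in the report
```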
Is it worth addressing this in the controller before a new Helm release ships the fix? Health checks that now use kstatus depend on the status being set correctly.
https://github.com/fluxcd/kustomize-controller/pull/101
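For context, this is roughly how such a health check would reference the HelmRelease above; the apiVersion and field layout are assumptions based on the kustomize-controller API at the time and may differ, so treat it as an illustrative sketch rather than a verified manifest:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1alpha1   # assumed version, may differ
kind: Kustomization
metadata:
  name: webapp
  namespace: gitops-system
spec:
  interval: 5m
  path: ./webapp
  prune: true
  sourceRef:
    kind: GitRepository
    name: webapp   # hypothetical source, not part of the manifests above
  # kstatus derives health from the object's status conditions, so if the
  # HelmRelease wrongly reports Ready=True, this check passes even though
  # the pods never start.
  healthChecks:
    - apiVersion: helm.toolkit.fluxcd.io/v2alpha1
      kind: HelmRelease
      name: frontend
      namespace: gitops-system
```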
@phillebaba there is no fix for this that we can do in fluxcd; this needs to be fixed upstream. We should document the Helm bug in our docs.