
Health check timed out although helm release is deployed successfully

Open yildizbilal opened this issue 3 years ago • 6 comments

Situation

We have a Kustomization that deploys a HelmRelease named httpbin, but it shows the following error even though the deployment was successful:

Health check failed after 2m0.013489487s, timeout waiting for: [HelmRelease/wpb-services/httpbin status: 'InProgress']

I tried suspending and resuming the Kustomization, but I get the same error again. Neither the deployment nor the HelmRelease logs any error. I couldn't figure out where the InProgress status comes from.

helm status output:

$ helm status httpbin -n wpb-services
NAME: httpbin
LAST DEPLOYED: Wed Aug  3 10:09:50 2022
NAMESPACE: wpb-services
STATUS: deployed
REVISION: 1
TEST SUITE: None

Versions

$ flux version
flux: v0.31.5
helm-controller: v0.22.2
image-automation-controller: v0.23.5
image-reflector-controller: v0.19.4
kustomize-controller: v0.26.3
notification-controller: v0.24.1
source-controller: v0.25.11

Kubernetes: v1.22.6

Expected Behaviour

The Kustomization's READY flag should be set to True and the STATUS should show the applied revision.

yildizbilal avatar Aug 15 '22 14:08 yildizbilal

Please post the output of flux get helmrelease httpbin -n wpb-services here.

stefanprodan avatar Aug 15 '22 14:08 stefanprodan

@stefanprodan

$ flux get helmrelease httpbin -n wpb-services
NAME   	REVISION	SUSPENDED	READY	MESSAGE                          
httpbin	2.7.3   	True     	True 	Release reconciliation succeeded

I'm not sure why SUSPENDED is set to True. I resumed the HelmRelease with flux resume hr httpbin -n wpb-services:

$ flux get helmrelease httpbin -n wpb-services
NAME   	REVISION	SUSPENDED	READY	MESSAGE                          
httpbin	2.7.3   	False     	True 	Release reconciliation succeeded

Then I reconciled the Kustomization with flux reconcile kustomization dev-playground, but I get the same error. Events of the Kustomization after reconciliation:

Normal   Progressing        2m37s  kustomize-controller  HelmRelease/wpb-services/httpbin configured
Warning  HealthCheckFailed  37s    kustomize-controller  Health check failed after 2m0.023241841s, timeout waiting for: [HelmRelease/wpb-services/httpbin status: 'InProgress']

yildizbilal avatar Aug 16 '22 08:08 yildizbilal

Can you post the Flux Kustomization YAML here, please?

stefanprodan avatar Aug 16 '22 08:08 stefanprodan

@stefanprodan


apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: dev-playground
  namespace: flux-system
spec:
  interval: 10m0s
  timeout: 2m
  sourceRef:
    kind: GitRepository
    name: k8s-config
  path: ./dev-tools/playground
  dependsOn:
  - name: namespaces
  - name: istio
  - name: chart-sources
  prune: true
  suspend: false
  validation: client
  healthChecks:
  - apiVersion: helm.toolkit.fluxcd.io/v2beta1
    kind: HelmRelease
    name: grafana
    namespace: wpb-services
  - apiVersion: helm.toolkit.fluxcd.io/v2beta1
    kind: HelmRelease
    name: httpbin
    namespace: wpb-services



apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: httpbin
  namespace: wpb-services
spec:
  suspend: true
  releaseName: httpbin
  chart:
    spec:
      chart: ./charts/wpb-service
      sourceRef:
        name: k8s-config
        namespace: flux-system
        kind: GitRepository
  interval: 10m
  timeout: 1m
  install:
    remediation:
      retries: 0
  upgrade:
    remediation:
      retries: 0
  values:
    common:
      service:
        name: httpbin
      port: 8080
    shared:
      ingress:
        routes:
        - uri:
            prefix: /*
          cors:
            enabled: true
            allowedOrigins:
            - .*
    release:
      image: mccutchen/go-httpbin:latest

yildizbilal avatar Aug 16 '22 08:08 yildizbilal

Can you please replace the healthChecks with wait: true and see if the health check passes?

stefanprodan avatar Aug 16 '22 09:08 stefanprodan

@stefanprodan The same error appears with wait: true.

yildizbilal avatar Aug 16 '22 09:08 yildizbilal

@yildizbilal Can you please post the status of the HelmRelease? Run kubectl get hr httpbin -n wpb-services -oyaml and paste the status part.

somtochiama avatar Aug 16 '22 11:08 somtochiama

@somtochiama

status:
  conditions:
  - lastTransitionTime: "2022-08-16T08:46:45Z"
    message: Release reconciliation succeeded
    reason: ReconciliationSucceeded
    status: "True"
    type: Ready
  - lastTransitionTime: "2022-08-16T08:46:45Z"
    message: Helm upgrade succeeded
    reason: UpgradeSucceeded
    status: "True"
    type: Released
  helmChart: flux-system/wpb-services-httpbin
  lastAppliedRevision: 2.7.3
  lastAttemptedRevision: 2.7.3
  lastAttemptedValuesChecksum: 0768110ce556fb1611b70804b11285a7ad4bf963
  lastReleaseRevision: 2
  observedGeneration: 6

yildizbilal avatar Aug 16 '22 11:08 yildizbilal

@yildizbilal Please confirm that the helm release isn't still suspended and post the full YAML output from the kubectl command.

An object is marked as InProgress when metadata.generation differs from observedGeneration in the status, which can happen while the HelmRelease is suspended.

If the helm release is still suspended, run:

flux get hr httpbin -n wpb-services
flux reconcile ks dev-playground -n flux-system
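
The generation check described above can be sketched in a few lines of Python. This is only a simplified approximation of how a health check might classify an object, not the actual kustomize-controller or kstatus code: the function name compute_status and the fallback to the Ready condition are assumptions made for illustration. The input values are the ones reported for httpbin in this thread.

```python
def compute_status(obj: dict) -> str:
    """Simplified readiness check (hypothetical helper, not the real implementation)."""
    generation = obj.get("metadata", {}).get("generation")
    status = obj.get("status", {})
    observed = status.get("observedGeneration")

    # The controller has not yet acted on the latest spec: the status is stale.
    if observed is not None and observed != generation:
        return "InProgress"

    # Assumption: otherwise the Ready condition decides.
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready" and cond.get("status") == "True":
            return "Current"
    return "InProgress"


# Values taken from the httpbin HelmRelease in this thread.
httpbin = {
    "metadata": {"generation": 7},
    "status": {
        "observedGeneration": 6,
        "conditions": [{"type": "Ready", "status": "True"}],
    },
}
print(compute_status(httpbin))  # InProgress
```

Under this sketch, Ready being "True" does not help: the mismatch between generation 7 and observedGeneration 6 is checked first, so the object stays InProgress until the health check times out.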

somtochiama avatar Aug 16 '22 16:08 somtochiama

I don't know why, but the httpbin HelmRelease was suspended again. How and when can a HelmRelease be suspended automatically?

NAME   	REVISION	SUSPENDED	READY	MESSAGE                          
httpbin	2.7.3   	True     	True 	Release reconciliation succeeded

Unfortunately, reconciliation doesn't help.

$ flux reconcile ks dev-playground
► annotating Kustomization dev-playground in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✗ Kustomization reconciliation failed: Health check failed after 2m0.020266338s, timeout waiting for: [HelmRelease/wpb-services/httpbin status: 'InProgress']

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  creationTimestamp: "2022-05-30T08:38:42Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 7
  labels:
    kustomize.toolkit.fluxcd.io/name: dev-playground
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: httpbin
  namespace: wpb-services
  resourceVersion: "298998308"
  uid: efc6d5a0-27b7-45ff-aac4-bef9286f52a0
spec:
  chart:
    spec:
      chart: ./charts/wpb-service
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: GitRepository
        name: k8s-config
        namespace: flux-system
      version: '*'
  install:
    remediation:
      retries: 0
  interval: 10m
  releaseName: httpbin
  suspend: true
  timeout: 1m
  upgrade:
    remediation:
      retries: 0
  values:
    common:
      port: 8080
      service:
        name: httpbin
    release:
      image: mccutchen/go-httpbin:latest
    shared:
      ingress:
        routes:
        - cors:
            allowedOrigins:
            - .*
            enabled: true
          uri:
            prefix: /*
status:
  conditions:
  - lastTransitionTime: "2022-08-16T08:46:45Z"
    message: Release reconciliation succeeded
    reason: ReconciliationSucceeded
    status: "True"
    type: Ready
  - lastTransitionTime: "2022-08-16T08:46:45Z"
    message: Helm upgrade succeeded
    reason: UpgradeSucceeded
    status: "True"
    type: Released
  helmChart: flux-system/wpb-services-httpbin
  lastAppliedRevision: 2.7.3
  lastAttemptedRevision: 2.7.3
  lastAttemptedValuesChecksum: 0768110ce556fb1611b70804b11285a7ad4bf963
  lastReleaseRevision: 2
  observedGeneration: 6


I also tried

flux resume hr httpbin -n wpb-services
flux reconcile ks dev-playground 

yildizbilal avatar Aug 17 '22 06:08 yildizbilal

I guess you have set suspend in Git; delete the field from there.

stefanprodan avatar Aug 17 '22 07:08 stefanprodan

Setting suspend: false for the httpbin HelmRelease solved the problem. But shouldn't the health check also pass with suspend: true?

yildizbilal avatar Aug 17 '22 07:08 yildizbilal

But shouldn't the health check pass also with suspend: true?

It does, unless you change the spec of the HR while it is suspended; then it's marked as stale.
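
A toy simulation may make this sequence easier to follow. It is a sketch under stated assumptions, not Flux code: FakeHelmRelease and its methods are hypothetical, and it assumes that a spec change bumps metadata.generation while a suspended object is skipped by its controller and so keeps its old observedGeneration. The starting numbers mirror the generation values posted in this thread.

```python
from dataclasses import dataclass


@dataclass
class FakeHelmRelease:
    """Hypothetical stand-in for a HelmRelease object; not a Flux type."""
    generation: int = 1
    observed_generation: int = 1
    suspended: bool = False

    def apply_spec(self, suspend: bool) -> None:
        # Assumption: the generation only increases when the spec actually changes.
        if suspend != self.suspended:
            self.suspended = suspend
            self.generation += 1

    def reconcile(self) -> None:
        # Assumption: a suspended object is skipped, so its status is not updated.
        if not self.suspended:
            self.observed_generation = self.generation

    @property
    def stale(self) -> bool:
        return self.generation != self.observed_generation


# After `flux resume`, the controller has caught up with the spec.
hr = FakeHelmRelease(generation=6, observed_generation=6, suspended=False)

# The Kustomization re-applies the manifest from Git, which says suspend: true.
hr.apply_spec(suspend=True)
hr.reconcile()
print(hr.generation, hr.observed_generation, hr.stale)  # 7 6 True

# Git is changed to suspend: false (or the field is removed) and applied again.
hr.apply_spec(suspend=False)
hr.reconcile()
print(hr.generation, hr.observed_generation, hr.stale)  # 8 8 False
```

Read this way, the HelmRelease was not suspended "automatically": each Kustomization reconciliation restored suspend: true from Git, undoing the manual resume and leaving the status one generation behind.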

stefanprodan avatar Aug 17 '22 10:08 stefanprodan