
bug: fluent-operator breaks if a secret in a namespace is not available for the output

Open alternaivan opened this issue 3 months ago • 3 comments

Describe the issue

Hello,

We have outputs in multiple namespaces, and in one of those namespaces the secret referenced by the output was not available. This caused the fluent-bit daemon sets to go into CrashLoopBackOff and fail to start.

The operator log showed multiple errors reporting that the secret could not be found in any of the namespaces, even though it was present in all of them except one (test-01):

2025-10-14T15:09:27Z  ERROR  Reconciler error  {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "FluentBit": {"name":"fluent-bit-config","namespace":"test-01"}, "namespace": "test-01", "name": "fluent-bit-config", "reconcileID": "516abca7-9351-4f88-8b30-ff54fadf9c16", "error": "Secret \"elasticsearch-credentials\" not found"}
2025-10-14T15:09:27Z  ERROR  Reconciler error  {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "FluentBit": {"name":"fluent-bit-config","namespace":"test-02"}, "namespace": "test-02", "name": "fluent-bit-config", "reconcileID": "4feafd59-972e-444d-a861-b36185091492", "error": "Secret \"elasticsearch-credentials\" not found"}
2025-10-14T15:09:27Z  ERROR  Reconciler error  {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "FluentBit": {"name":"fluent-bit-config","namespace":"test-03"}, "namespace": "test-03", "name": "fluent-bit-config", "reconcileID": "850d08bf-a5d8-4109-afc6-1bd6dcd2bf07", "error": "Secret \"elasticsearch-credentials\" not found"}

Re-creating the missing secret in the namespace test-01 fixed the issue.

Thanks, Marjan

To Reproduce

  • Create one output per namespace, across several namespaces (e.g. more than 3)
  • Create the secret in each namespace that is used by the output
  • Watch logs being collected
  • Delete a secret from one of the namespaces
  • Check the logs in the fluent-operator
  • Restart one of the fluent-bit daemon sets

Expected behavior

When a secret for the output is not available, I expect the fluent-operator to report it as an error, but not to crash the fluent-bit instances.
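To make the expectation concrete, here is a minimal sketch of the behavior I have in mind. It is not the operator's actual code: the `Output` type, the `SecretStore` map, and `resolveOutputs` are all hypothetical names made up for illustration, and the real operator talks to the Kubernetes API rather than an in-memory map. The idea is only that a missing secret is recorded as an error for that one output, while the remaining outputs are still rendered.

```go
package main

import "fmt"

// Output is a hypothetical, simplified stand-in for an output resource
// that references a secret in its own namespace.
type Output struct {
	Namespace  string
	Name       string
	SecretName string
}

// SecretStore is a hypothetical in-memory lookup:
// namespace -> secret name -> secret value.
type SecretStore map[string]map[string]string

// resolveOutputs returns the outputs whose secrets were found, plus one
// error per output that had to be skipped. A missing secret never aborts
// the loop, so healthy namespaces are unaffected.
func resolveOutputs(outputs []Output, secrets SecretStore) ([]Output, []error) {
	var usable []Output
	var errs []error
	for _, o := range outputs {
		// Indexing a missing namespace yields a nil map, which is safe to read.
		if _, ok := secrets[o.Namespace][o.SecretName]; !ok {
			errs = append(errs, fmt.Errorf("output %s/%s skipped: Secret %q not found",
				o.Namespace, o.Name, o.SecretName))
			continue
		}
		usable = append(usable, o)
	}
	return usable, errs
}

func main() {
	// The secret is deliberately absent from test-01, as in this report.
	secrets := SecretStore{
		"test-02": {"elasticsearch-credentials": "redacted"},
		"test-03": {"elasticsearch-credentials": "redacted"},
	}
	outputs := []Output{
		{Namespace: "test-01", Name: "es", SecretName: "elasticsearch-credentials"},
		{Namespace: "test-02", Name: "es", SecretName: "elasticsearch-credentials"},
		{Namespace: "test-03", Name: "es", SecretName: "elasticsearch-credentials"},
	}

	usable, errs := resolveOutputs(outputs, secrets)
	for _, err := range errs {
		fmt.Println("ERROR:", err)
	}
	for _, o := range usable {
		fmt.Printf("rendering output %s/%s\n", o.Namespace, o.Name)
	}
}
```

With this approach, test-01 would produce an error entry, while test-02 and test-03 would keep working.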

Your Environment

- Fluent Operator version: 3.4.0
- Container Runtime: containerd 
- Operating system:
- Kernel version:

How did you install fluent operator?

Via Helm.

Additional context

No response

alternaivan avatar Oct 15 '25 07:10 alternaivan

@alternaivan Thanks for the report. Would you be willing to work on a pull request for this issue?

joshuabaird avatar Nov 11 '25 16:11 joshuabaird

This looks more like a feature request from my view: what you expect is for the operator to validate the newly generated config file before using it. Do I understand that correctly, @alternaivan?

cw-Guo avatar Nov 14 '25 06:11 cw-Guo

Hi @joshuabaird and @cw-Guo,

I'm not sure if I could work on a PR for this issue, to be honest. It would take me a while to get into the code. :/

I wouldn't say it's a feature request, since the missing secret is causing all the daemon sets to crash. Maybe I'm seeing it differently?

Thanks, Marjan

alternaivan avatar Nov 14 '25 07:11 alternaivan