Error when trying to deploy the cluster agent with no app key

Open arapulido opened this issue 5 years ago • 5 comments

Version of the operator: 0.4.0.

Describe what happened:

I tried to deploy the node agent and 1 replica of the cluster agent, without the metrics server, but got the following error:

Status:
  Conditions:
    Last Transition Time:  2021-01-22T11:54:34Z
    Last Update Time:      2021-01-22T11:59:36Z
    Message:               secrets "datadog-secret" already exists
    Status:                True
    Type:                  ReconcileError
    Last Transition Time:  2021-01-22T11:54:34Z
    Last Update Time:      2021-01-22T11:59:34Z
    Message:               Datadog metrics forwarding error
    Status:                False
    Type:                  ActiveDatadogMetrics
Events:                    <none>

And logs:

2021-01-22T11:54:34.704Z	INFO	DatadogMetricForwarders	New Datadog metrics forwarder registred	{"ID": "default/datadog"}
2021-01-22T11:54:34.710Z	INFO	DatadogMetricForwarders	Starting Datadog metrics forwarder	{"CustomResource.Namespace": "default", "CustomResource.Name": "datadog"}
2021-01-22T11:54:34.713Z	INFO	DatadogMetricForwarders	Getting Datadog credentials	{"CustomResource.Namespace": "default", "CustomResource.Name": "datadog"}
2021-01-22T11:54:34.713Z	INFO	DatadogMetricForwarders	Got Datadog Site	{"CustomResource.Namespace": "default", "CustomResource.Name": "datadog", "site": "https://api.datadoghq.com"}
2021-01-22T11:54:34.713Z	ERROR	DatadogMetricForwarders	cannot get Datadog credentials,  will retry later...	{"CustomResource.Namespace": "default", "CustomResource.Name": "datadog", "error": "Secret \"datadog\" not found"}
2021-01-22T11:54:34.714Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:34.714Z	INFO	DatadogAgent	Adding Finalizer for the DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:34.734Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:34.734Z	INFO	DatadogAgent	Defaulting values	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:34.764Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:34.804Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "datadogdeployment-controller", "request": "default/datadog", "error": "secrets \"datadog-secret\" already exists"}
2021-01-22T11:54:35.805Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:35.830Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "datadogdeployment-controller", "request": "default/datadog", "error": "secrets \"datadog-secret\" already exists"}
2021-01-22T11:54:36.830Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:36.854Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "datadogdeployment-controller", "request": "default/datadog", "error": "secrets \"datadog-secret\" already exists"}
2021-01-22T11:54:37.855Z	INFO	DatadogAgent	Reconciling DatadogAgent	{"Request.Namespace": "default", "Request.Name": "datadog"}
2021-01-22T11:54:37.883Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "datadogdeployment-controller", "request": "default/datadog", "error": "secrets \"datadog-secret\" already exists"}

Two issues seem to be happening: the auth token secret is not created automatically, and the operator also tries to create the datadog-secret secret, which I had already created.

Describe what you expected:

The agents are created correctly.

Steps to reproduce the issue:

Use datadog operator 0.4.0 with the following DatadogAgent definition:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  credentials:
    apiSecret:
      secretName: datadog-secret
      keyName: api-key
  agent:
    env:
      - name: DD_KUBELET_TLS_VERIFY
        value: "false"
    image:
      name: "datadog/agent:latest"
    apm:
      enabled: true
    process:
      enabled: true
    log:
      enabled: true
    config:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
  clusterAgent:
    image:
      name: "datadog/cluster-agent:latest"
    config:
      clusterChecksEnabled: true
    replicas: 1
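
For reference, the definition above expects an existing Secret named datadog-secret that holds the API key under the key api-key. A minimal sketch of such a Secret is shown here; the default namespace is assumed from the operator logs, and the value is a placeholder rather than anything taken from the original setup:

```yaml
# Hypothetical pre-created Secret referenced by spec.credentials.apiSecret.
apiVersion: v1
kind: Secret
metadata:
  name: datadog-secret   # must match secretName in the DatadogAgent spec
  namespace: default     # assumption: namespace seen in the operator logs
type: Opaque
stringData:
  api-key: "<your-datadog-api-key>"  # placeholder; must match keyName
```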

Additional environment details (Operating System, Cloud provider, etc):

arapulido avatar Jan 22 '21 12:01 arapulido

The code on master solves this issue too. You should be able to deploy the datadog-agent suite as you wanted with the next datadog-operator release.

clamoriniere avatar Jan 29 '21 22:01 clamoriniere

Something similar is happening in 0.5.0.rc2. I created the following agent description:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  credentials:
    apiSecret:
      secretName: datadog-secret
      keyName: api-key
  agent:
    image:
      name: "datadog/agent:latest"

And it fails again because of the token secret:

{"level":"INFO","ts":"2021-02-12T10:25:21Z","logger":"DatadogMetricForwarders","msg":"Getting Datadog credentials","CustomResource.Namespace":"default","CustomResource.Name":"datadog"}
{"level":"INFO","ts":"2021-02-12T10:25:21Z","logger":"DatadogMetricForwarders","msg":"Got Datadog Site","CustomResource.Namespace":"default","CustomResource.Name":"datadog","site":"https://api.datadoghq.com"}
{"level":"ERROR","ts":"2021-02-12T10:25:21Z","logger":"DatadogMetricForwarders","msg":"cannot get Datadog credentials,  will retry later...","CustomResource.Namespace":"default","CustomResource.Name":"datadog","error":"Secret \"datadog\" not found"}

arapulido avatar Feb 12 '21 10:02 arapulido

Same issue for me as well

bheemreddy181 avatar May 04 '22 20:05 bheemreddy181

Same issue for me as well

jinft-kr avatar May 15 '23 07:05 jinft-kr

@leejinlee-kr, thanks for reporting the issue! Could you please share the manifest, error log and version of the Operator you are using?

levan-m avatar May 18 '23 15:05 levan-m