cri check fails when running on docker

Open ppennanen opened this issue 5 years ago • 2 comments

Describe what happened:

The datadog-agent created by datadog-operator, running on Docker, logs errors like this:

2020-04-06 12:05:53 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check cri: temporary failure in criutil, will retry later: try delay not elapsed yet
2020-04-06 12:05:58 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:118 in LogMessage) | kubelet:d884b5186b651429 | (kubelet.py:429) | GET on kubelet s `/stats/summary` failed: 403 Client Error: Forbidden for url: https://<REDACTED>:10250/stats/summary?verbose=True
2020-04-06 12:06:08 UTC | CORE | WARN | (pkg/collector/corechecks/checkbase.go:165 in Warnf) | Error initialising check: temporary failure in criutil, will retry later: try delay not elapsed yet

This looks like the same issue as "criutil failure", which was resolved by "Do not enable the cri check when running on a docker setup".

The agent has the environment variable

      DD_CRI_SOCKET_PATH:                     /host/var/run/docker.sock

Describe what you expected:

That the agent would not try to use the docker socket as a CRI socket.
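
The expected behaviour could be sketched roughly as follows. This is only an illustration, not the agent's actual logic: the function name `should_enable_cri_check`, the `DOCKER_SOCKET_NAMES` set, and the containerd path in the usage lines are all assumptions made up for this example.

```python
import os
from typing import Optional

# Assumption: the Docker daemon socket is recognised by its file name.
DOCKER_SOCKET_NAMES = {"docker.sock"}


def should_enable_cri_check(cri_socket_path: Optional[str]) -> bool:
    """Hypothetical helper: enable the cri check only when a socket path
    is configured and it does not point at the Docker socket."""
    if not cri_socket_path:
        # Unset or empty path: nothing to connect to, so skip the check.
        return False
    return os.path.basename(cri_socket_path) not in DOCKER_SOCKET_NAMES


# The value from this report, then an illustrative non-Docker socket path.
print(should_enable_cri_check("/host/var/run/docker.sock"))
print(should_enable_cri_check("/host/var/run/containerd/containerd.sock"))
```

With the value reported above, the helper returns False, so the cri check would be skipped instead of retrying against the Docker socket.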

Steps to reproduce the issue:

  • Launch a GKE cluster with Ubuntu + Docker.
  • Deploy the Datadog operator.
  • Deploy an agent CR:
apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog-agent
  namespace: datadog
spec:
  credentials:
    apiKeyExistingSecret: dd-api-key
  agent:
    config:
      tolerations:
      - operator: Exists
    apm:
      enabled: true
    logs:
      enabled: true
    image:
      name: "datadog/agent:7.18.1"
    process:
      enabled: true
    systemProbe:
      enabled: true

Additional environment details (Operating System, Cloud provider, etc):

GKE. Ubuntu with Docker.

ppennanen, Apr 06 '20 12:04

To silence the error, add the environment variable DD_CRI_SOCKET_PATH set to null:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog-agent
  namespace: datadog
spec:
  credentials:
    apiKeyExistingSecret: dd-api-key
  agent:
    config:
      env:
      - name: DD_CRI_SOCKET_PATH
        value: null 
      tolerations:
      - operator: Exists
    apm:
      enabled: true
    log:
      enabled: true
    image:
      name: "datadog/agent:7.18.1"
    process:
      enabled: true
    systemProbe:
      enabled: true

ppennanen, Apr 07 '20 09:04

Hi @ppennanen

Thank you for reporting this and for the workaround you suggested (setting DD_CRI_SOCKET_PATH to null). We're aware of this issue and we're working on a fix. We also saw the kubelet check warning in the logs you shared, and we fixed it. The two fixes will be included in the next datadog-operator release.

Thanks!

ahmed-mez, Apr 07 '20 10:04