cri check fails when running on docker
Describe what happened:
datadog-agent created by datadog-operator running on docker logs errors like this:
2020-04-06 12:05:53 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check cri: temporary failure in criutil, will retry later: try delay not elapsed yet
2020-04-06 12:05:58 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:118 in LogMessage) | kubelet:d884b5186b651429 | (kubelet.py:429) | GET on kubelet s `/stats/summary` failed: 403 Client Error: Forbidden for url: https://<REDACTED>:10250/stats/summary?verbose=True
2020-04-06 12:06:08 UTC | CORE | WARN | (pkg/collector/corechecks/checkbase.go:165 in Warnf) | Error initialising check: temporary failure in criutil, will retry later: try delay not elapsed yet
This looks like the same issue as criutil failure that was resolved by Do not enable the cri check when running on a docker setup.
The agent has the environment variable
DD_CRI_SOCKET_PATH: /host/var/run/docker.sock
Describe what you expected:
That the agent would not try to use the docker socket as a CRI socket.
Steps to reproduce the issue:
- Launch a GKE cluster with Ubuntu + Docker.
- Deploy DataDog operator
- Deploy an agent CR:
apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
name: datadog-agent
namespace: datadog
spec:
credentials:
apiKeyExistingSecret: dd-api-key
agent:
config:
tolerations:
- operator: Exists
apm:
enabled: true
logs:
enabled: true
image:
name: "datadog/agent:7.18.1"
process:
enabled: true
systemProbe:
enabled: true
Additional environment details (Operating System, Cloud provider, etc):
GKE. Ubuntu with Docker.
To silence the error add the environment value DD_CRI_SOCKET_PATH set to null:
apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
name: datadog-agent
namespace: datadog
spec:
credentials:
apiKeyExistingSecret: dd-api-key
agent:
config:
env:
- name: DD_CRI_SOCKET_PATH
value: null
tolerations:
- operator: Exists
apm:
enabled: true
log:
enabled: true
image:
name: "datadog/agent:7.18.1"
process:
enabled: true
systemProbe:
enabled: true
Hi @ppennanen
Thank you for reporting this and for the workaround that you suggested (by setting DD_CRI_SOCKET_PATH to null)
We're aware of this issue and we're working on a fix
We also saw the kubelet check warning in the logs you've shared and we fixed it.
The two fixes will be included in the next datadog-operator release.
Thanks!