EKS Fargate agent:7-jmx constantly logs warnings
**Output of the info page (if this is a bug)**
```
❯ agent status
2022-02-17 10:07:38 UTC | CORE | WARN | (pkg/util/log/log.go:640 in func1) | Deactivating Autoconfig will disable most components. It's recommended to use autoconfig_exclude_features and autoconfig_include_features to activate/deactivate features selectively
Getting the status from the agent.
===============
Agent (v7.33.1)
===============
Status date: 2022-02-17 10:07:38.692 UTC (1645092458692)
Agent start: 2022-02-17 06:32:02.888 UTC (1645079522888)
Pid: 380
Go Version: go1.16.7
Python Version: 3.8.11
Build arch: amd64
Agent flavor: agent
Check Runners: 4
Log Level: warn
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: -1.931ms
System time: 2022-02-17 10:07:38.692 UTC (1645092458692)
Host Info
=========
bootTime: 2022-02-17 06:27:55 UTC (1645079275000)
kernelArch: x86_64
kernelVersion: 4.14.262-200.489.amzn2.x86_64
os: linux
platform: ubuntu
platformFamily: debian
platformVersion: 21.10
procs: 11
uptime: 4m32s
Hostnames
=========
host_aliases: [fargate-ip-XXX-XXX-XXX-XXX.eu-west-1.compute.internal]
socket-fqdn: xxx-xxxxxx-xxx-server-queue-worker-5d79f6b67d-5cplq
socket-hostname: xxx-xxxxxx-xxx-server-queue-worker-5d79f6b67d-5cplq
host tags:
apikey:00xxx
aws_account_name:xxxxxxxxxxxxxxx_des
datacenter:aws
env:des
platform:eks-fargate
product:xxx
terraform:true
hostname provider:
unused hostname providers:
configuration/environment: hostname is empty
Metadata
========
agent_version: 7.33.1
config_apm_dd_url:
config_dd_url:
config_logs_dd_url:
config_logs_socks5_proxy_address:
config_no_proxy: []
config_process_dd_url:
config_proxy_http:
config_proxy_https:
config_site:
feature_apm_enabled: true
feature_cspm_enabled: false
feature_cws_enabled: false
feature_logs_enabled: false
feature_networks_enabled: false
feature_process_enabled: false
flavor: agent
install_method_installer_version: docker
install_method_tool: docker
install_method_tool_version: docker
=========
Collector
=========
Running Checks
==============
container
---------
Instance ID: container [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/container.d/conf.yaml.default
Total Runs: 862
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:28 UTC (1645092448000)
Last Successful Execution Date : 2022-02-17 10:07:28 UTC (1645092448000)
cpu
---
Instance ID: cpu [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
Total Runs: 862
Metric Samples: Last Run: 9, Total: 7,751
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:35 UTC (1645092455000)
Last Successful Execution Date : 2022-02-17 10:07:35 UTC (1645092455000)
disk (4.5.1)
------------
Instance ID: disk:a1cfeb1bef22319f [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
Total Runs: 861
Metric Samples: Last Run: 200, Total: 172,200
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 14ms
Last Execution Date : 2022-02-17 10:07:27 UTC (1645092447000)
Last Successful Execution Date : 2022-02-17 10:07:27 UTC (1645092447000)
eks_fargate (2.1.0)
-------------------
Instance ID: eks_fargate:d734b1956a31b015 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/eks_fargate.d/conf.yaml.default
Total Runs: 862
Metric Samples: Last Run: 3, Total: 2,586
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 15ms
Last Execution Date : 2022-02-17 10:07:34 UTC (1645092454000)
Last Successful Execution Date : 2022-02-17 10:07:34 UTC (1645092454000)
file_handle
-----------
Instance ID: file_handle [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
Total Runs: 861
Metric Samples: Last Run: 5, Total: 4,305
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:26 UTC (1645092446000)
Last Successful Execution Date : 2022-02-17 10:07:26 UTC (1645092446000)
io
--
Instance ID: io [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
Total Runs: 862
Metric Samples: Last Run: 52, Total: 44,788
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:33 UTC (1645092453000)
Last Successful Execution Date : 2022-02-17 10:07:33 UTC (1645092453000)
kubelet (7.1.0)
---------------
Instance ID: kubelet:5avc64g118c18a4 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Total Runs: 647
Metric Samples: Last Run: 691, Total: 437,690
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 4, Total: 2,588
Average Execution Time : 174ms
Last Execution Date : 2022-02-17 10:07:33 UTC (1645092453000)
Last Successful Execution Date : 2022-02-17 10:07:33 UTC (1645092453000)
load
----
Instance ID: load [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
Total Runs: 861
Metric Samples: Last Run: 6, Total: 5,166
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:25 UTC (1645092445000)
Last Successful Execution Date : 2022-02-17 10:07:25 UTC (1645092445000)
memory
------
Instance ID: memory [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
Total Runs: 862
Metric Samples: Last Run: 18, Total: 15,516
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:32 UTC (1645092452000)
Last Successful Execution Date : 2022-02-17 10:07:32 UTC (1645092452000)
ntp
---
Instance ID: ntp:d752b4386b561429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
Total Runs: 15
Metric Samples: Last Run: 1, Total: 15
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 15
Average Execution Time : 400ms
Last Execution Date : 2022-02-17 10:02:13 UTC (1645092133000)
Last Successful Execution Date : 2022-02-17 10:02:13 UTC (1645092133000)
uptime
------
Instance ID: uptime [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
Total Runs: 861
Metric Samples: Last Run: 1, Total: 861
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-02-17 10:07:24 UTC (1645092444000)
Last Successful Execution Date : 2022-02-17 10:07:24 UTC (1645092444000)
========
JMXFetch
========
Information
==================
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
Transactions
============
Cluster: 0
ClusterRole: 0
ClusterRoleBinding: 0
CronJob: 0
DaemonSet: 0
Deployment: 0
Dropped: 0
HighPriorityQueueFull: 0
Job: 0
Node: 0
PersistentVolume: 0
PersistentVolumeClaim: 0
Pod: 0
ReplicaSet: 0
Requeued: 0
Retried: 0
RetryQueueSize: 0
Role: 0
RoleBinding: 0
Service: 0
ServiceAccount: 0
StatefulSet: 0
Transaction Successes
=====================
Total number: 1817
Successes By Endpoint:
check_run_v1: 862
intake: 71
metadata_v1: 22
series_v1: 862
On-disk storage
===============
On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.
API Keys status
===============
API key ending with 00xxx: API Key valid
==========
Endpoints
==========
https://app.datadoghq.eu - API Key ending with:
- 00xxx
==========
Logs Agent
==========
Logs Agent is not running
=========
APM Agent
=========
Status: Running
Pid: 378
Uptime: 12935 seconds
Mem alloc: 8,637,848 bytes
Hostname:
Receiver: 0.0.0.0:8126
Endpoints:
https://trace.agent.datadoghq.eu
Receiver (previous minute)
==========================
No traces received in the previous minute.
Default priority sampling rate: 100.0%
Writer (previous minute)
========================
Traces: 0 payloads, 0 traces, 0 events, 0 bytes
Stats: 0 payloads, 0 stats buckets, 0 bytes
=========
Aggregator
=========
Checks Metric Sample: 707,708
Dogstatsd Metric Sample: 104,296
Number Of Flushes: 862
Series Flushed: 643,921
Service Check: 11,019
Service Checks Flushed: 11,879
=========
DogStatsD
=========
Event Packets: 0
Event Parse Errors: 0
Metric Packets: 104,295
Metric Parse Errors: 0
Service Check Packets: 0
Service Check Parse Errors: 0
Udp Bytes: 7,928,147
Udp Packet Reading Errors: 0
Udp Packets: 74,582
Uds Bytes: 0
Uds Origin Detection Errors: 0
Uds Packet Reading Errors: 0
Uds Packets: 0
Unterminated Metric Errors: 0
=============
Autodiscovery
=============
Enabled Features
================
eksfargate
kubernetes
Configuration Errors
====================
des-xxx/xxx-xxxxxxxx-xxxx-server-queue-worker-5d79f6b67d-5cplq
-----------------------------------------------------------------------
annotation ad.datadoghq.com/queue_worker.logs is invalid: queue_worker doesn't match a container identifier [datadog-agent queue-worker]
```
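Side note on the Configuration Errors entry above: the annotation key suffix must match a container identifier, and per the error message the valid names in this pod are `datadog-agent` and `queue-worker`, so `queue_worker` (with an underscore) does not match. A minimal sketch of the corrected key, with placeholder log-config values:

```yaml
annotations:
  # "queue-worker" (hyphen) matches the container name from the error message;
  # the source/service values are hypothetical placeholders.
  ad.datadoghq.com/queue-worker.logs: '[{"source": "php", "service": "queue-worker"}]'
```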
**Describe what happened:**

We have the following warning constantly:
```
2022-02-17 09:34:25 UTC | CORE | WARN | (pkg/security/log/logger.go:112 in Warnf) | Collector not found for container: &{{container xxxxxxxxxxxx} {xxxxx-xxxxxx-rabbit map[] map[]} map[AWS_DEFAULT_REGION:eu-west-1 AWS_REGION:eu-west-1 AWS_ROLE_ARN:arn:aws:iam::xxxxxx:role/k8s-xxx-datadog-role AWS_WEB_IDENTITY_TOKEN_FILE:/var/run/secrets/eks.amazonaws.com/serviceaccount/token DD_ENV: DD_LOGS_INJECTION:true DD_SERVICE:xxxxxx-xxxxx-xxx DD_TRACE_CLI_ENABLED:true DD_VERSION: SERVICE_NAME:xxxxx-xxx-queue-worker STAKATER_DATADOG_SSM_SECRET:***********************************000000 STAKATER_XXXX_COMMON_SECRET:***********************************000000 STAKATER_XXXX_XXXXXXXX_ENGINE_SERVER_DATADOG_CONFIGMAP:***********************************000000 STAKATER_XXXX_XXXXXXXX_ENGINE_SERVER_ENGINE_SCM_SECRET:***********************************000000 WORKER_QUEUE_NAME:xxxxx-xxxx-reader WORKER_QUEUE_PARAMETERS:--tries=4 --timeout=240 --sleep=2] {docker.io/xxxxx/xxxx-xxxxx-server@sha256:xxxxxxxxxxxxx docker.io/xxxxx/xxxxx-xxxx-server:5.11.5-aws docker.io/xxxxx/xxxxx-xxx-server xxxxxx-xxxx-server 5.11.5-aws} map[] 0 [] containerd {true 2022-02-17 06:33:52 +0000 UTC 0001-01-01 00:00:00 +0000 UTC}}, metrics will ne missing
2022-02-17 09:34:25 UTC | CORE | WARN | (pkg/security/log/logger.go:112 in Warnf) | Collector not found for container: &{{container xxxxxxxxxxxxx} {datadog-agent map[] map[]} map[AWS_DEFAULT_REGION:eu-west-1 AWS_REGION:eu-west-1 AWS_ROLE_ARN:arn:aws:iam::xxxxxxxxx:role/k8s-xxx-datadog-role AWS_WEB_IDENTITY_TOKEN_FILE:/var/run/secrets/eks.amazonaws.com/serviceaccount/token DD_APM_ENABLED:true DD_EKS_FARGATE:true DD_KUBERNETES_KUBELET_NODENAME: DD_LOG_LEVEL:warn DD_SITE:datadoghq.eu DD_TAGS:env:des product:xxx platform:eks-fargate datacenter:aws aws_account_name:xxxxxx_xxx terraform:true apikey:xxxxx] {docker.io/datadog/agent@sha256:xxxxxxxxxxxxx docker.io/datadog/agent:7-jmx docker.io/datadog/agent agent 7-jmx} map[] 0 [] containerd {true 2022-02-17 06:32:57 +0000 UTC 0001-01-01 00:00:00 +0000 UTC}}, metrics will ne missing
```
This repeats every 15 seconds.
**Describe what you expected:**
In the notes of this doc (https://docs.datadoghq.com/integrations/eks_fargate/#metrics-collection) we can see:

> Container metrics are not available in Fargate because the cgroups volume from the host can’t be mounted into the Agent. The [Live Containers](https://app.datadoghq.com/containers) view reports 0 for CPU and Memory.

If `DD_EKS_FARGATE:true` is set, why does the Agent still check container metrics every 15 seconds? Wouldn't it be more appropriate to deactivate this component and not generate warning records?
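As a workaround sketch (an assumption on our side, not something the docs prescribe for this warning): the documented container-filtering variables tell the Agent to skip containers for metrics collection, which may or may not also suppress this collector lookup:

```yaml
# Hypothetical fragment of the Agent container spec in the Fargate pod.
# DD_CONTAINER_EXCLUDE is the documented filtering variable; whether it
# silences the "Collector not found" warning here is untested.
env:
  - name: DD_EKS_FARGATE
    value: "true"
  - name: DD_CONTAINER_EXCLUDE
    value: "name:.*"   # exclude all containers from container metrics
```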
**Additional environment details (Operating System, Cloud provider, etc):**
- EKS version: v1.21.5-eks-bc4871b
- Kubelet version: v1.21.5-eks-9017834
- Fargate Kubelet version: v1.21.2-eks-06eac09
---

Same here.

- EKS version: v1.21.5-eks-bc4871b
- Datadog Agent Version: v7.33.1
This issue happens from v7.33.0 onward, but not in v7.32.4. From v7.33.0 onward the following occurs:

- Metrics don't contain tag information
- A lot of WARN logs containing `Collector not found for container`
---

Having exactly the same issue. No idea what the reason is :crying_cat_face:
---

I have had this issue as well. I am not sure what the root cause is, but I used the "Annotations v1 (for Datadog Agent < v7.36)" format instead of "Annotations v2 (for Datadog Agent v7.36+)", even though my version was 7.38.2. I hope this feedback will be a good clue for the Datadog staff and for newbies as well. These are the annotations I used:
```yaml
annotations:
  ad.datadoghq.com/nginx.check_names: '["nginx"]'
  ad.datadoghq.com/nginx.init_configs: '[{}]'
  ad.datadoghq.com/nginx.instances: |
    [
      {
        "nginx_status_url":"http://%%host%%:80/nginx_status/"
      }
    ]
```
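For reference, a sketch of the same check in the v2 single-annotation format (the `ad.datadoghq.com/<container>.checks` key documented for Agent v7.36+), carrying over the names and URL from the v1 block above:

```yaml
annotations:
  ad.datadoghq.com/nginx.checks: |
    {
      "nginx": {
        "init_config": {},
        "instances": [
          {
            "nginx_status_url": "http://%%host%%:80/nginx_status/"
          }
        ]
      }
    }
```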