If only one EMF metric is sent before shutdown, the metric is lost
Describe the bug
When a single EMF metric is sent and the agent is shut down immediately afterwards, the metric is lost.
The logs look like:
2023-06-20T04:26:50Z I! [inputs.socket_listener] Listening on tcp://[::]:25888
--
2023-06-20T04:26:54Z I! Profiler is stopped during shutdown
2023-06-20T04:26:54Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-06-20T04:26:54Z I! [agent] Stopping running outputs
2023-06-20T04:26:54Z W! [outputs.cloudwatchlogs] Retried 0 time, going to sleep 180.286823ms before retrying.
2023-06-20T04:26:54Z E! [outputs.cloudwatchlogs] Stop requested after 0 retries to /ecs/NC-FeatureCalculator/EmbeddedMetrics/aca07567d7334356924db2cf7366ecac failed for PutLogEvents, request dropped.
The log stream /ecs/NC-FeatureCalculator/EmbeddedMetrics/aca07567d7334356924db2cf7366ecac is created.
I believe the following is happening:
- The agent attempts to put the log
- It fails because the log stream does not yet exist (this is the first/only event for my ECS task)
- The agent creates the log stream
- The request is dropped without retrying because the agent is shutting down.
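The sequence above can be illustrated with a small simulation. This is only a sketch of the suspected behaviour, not the agent's actual code: `FakeLogs`, `publish`, and the `final_attempt` flag are hypothetical names, and the 0.18 s backoff just mirrors the sleep shown in the log above. The `final_attempt` branch is one possible mitigation, not something the maintainers have proposed.

```python
import threading


class StreamNotFound(Exception):
    """Stand-in for the 'log stream does not exist' error."""


class FakeLogs:
    """Hypothetical in-memory stand-in for the CloudWatch Logs API."""

    def __init__(self):
        self.streams = {}

    def create_stream(self, name):
        self.streams.setdefault(name, [])

    def put_events(self, name, events):
        if name not in self.streams:
            raise StreamNotFound(name)
        self.streams[name].extend(events)


def publish(logs, stream, events, stop, backoff=0.18, final_attempt=False):
    """Return True if the events were delivered, False if they were dropped."""
    while True:
        try:
            logs.put_events(stream, events)
            return True
        except StreamNotFound:
            # First event for this stream: create it, then retry after a backoff.
            logs.create_stream(stream)
        # Sleep before retrying, waking early if shutdown is requested.
        if stop.wait(backoff):
            if final_attempt:
                # Possible mitigation: try once more before giving up.
                try:
                    logs.put_events(stream, events)
                    return True
                except StreamNotFound:
                    pass
            return False  # request dropped, as in the E! log line


stop = threading.Event()
stop.set()  # shutdown has already been requested

logs = FakeLogs()
print(publish(logs, "emf-stream", ["metric"], stop), logs.streams)
# False {'emf-stream': []}

logs = FakeLogs()
print(publish(logs, "emf-stream", ["metric"], stop, final_attempt=True), logs.streams)
# True {'emf-stream': ['metric']}
```

The first call reproduces what I see: the stream is created but stays empty.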
If I add a sleep statement after the metric scope is closed, I can see the retry occur and the metric is sent to the newly created log stream successfully, because the retry completes before the shutdown occurs.
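For reference, the workaround looks roughly like the sketch below. It is not my actual application code: it writes a raw EMF document to the agent's TCP listener on port 25888 (the port from the log above) instead of using an EMF client library, and `build_emf_line`, `send_then_linger`, the namespace, the metric name, and the 5 second delay are all illustrative assumptions.

```python
import json
import socket
import time


def build_emf_line(namespace, name, value, unit="Count"):
    """Serialize one metric as a newline-terminated EMF JSON document."""
    doc = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [[]],  # a single, empty dimension set
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        name: value,
    }
    return (json.dumps(doc) + "\n").encode("utf-8")


def send_then_linger(line, host="127.0.0.1", port=25888, linger_seconds=5.0):
    """Send one EMF line to the agent, then wait before returning."""
    with socket.create_connection((host, port), timeout=2.0) as conn:
        conn.sendall(line)
    # Workaround: give the agent time to create the log stream and retry
    # PutLogEvents before the task (and the sidecar) is stopped.
    time.sleep(linger_seconds)
```

Calling `send_then_linger(build_emf_line("Demo", "Processed", 1))` as the last step before the task exits is enough for the metric to arrive, but the delay is a guess and should not be necessary.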
Steps to reproduce
Let me know if you need this, and I'll work it out.
What did you expect to see?
The log stream for my EMF metrics should contain the EMF metric, and that metric should be reflected in CloudWatch Logs.
What did you see instead?
The log stream for my EMF metrics exists but is empty, and the errors shown above.
What version did you use?
The current version from https://hub.docker.com/r/amazon/cloudwatch-agent, which appears to be 1.247359.1b252618.
What config did you use?
Config: { "logs": { "metrics_collected": { "emf": {} } } }
Environment
ECS sidecar
Additional context
NA
This issue was marked stale due to lack of activity.
Still relevant.
Thank you for bringing this issue to our attention.
I will create a ticket in my pm backlog to review if we should add this functionality to the agent.
This issue was marked stale due to lack of activity.
@sethAmazon this is not functionality. This is a bug.
This issue was marked stale due to lack of activity.
Looks like there's still no one left working at AWS to pay attention to this issue 😑