"skipping update of position for a file which does not currently exist" grafana agent consuming lot of memory

Open divyn10 opened this issue 2 years ago • 6 comments

What's wrong?

I have deployed the Grafana Agent Kubernetes operator (static mode) and use it to push pod logs to Loki.

Recently I found that the grafana-agent-logs pod is using a lot of memory and is also unable to push logs.

After checking the logs of the config-reloader container I found some error messages.

level=error ts=2024-03-26T06:33:34.799302016Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded"
level=error ts=2024-03-26T06:33:34.799311483Z caller=reloader.go:384 msg="Failed to trigger reload. Retrying." err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded" 
level=error ts=2024-03-26T06:34:04.803619827Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"  
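
The first and last timestamps above are roughly 30 seconds apart, which would be consistent with the reloader retrying periodically while the agent's reload endpoint does not answer. A minimal sketch for measuring those gaps from a longer log, assuming the lines are logfmt-style as pasted here. The helper names (`parse_logfmt`, `parse_ts`, `reload_failure_gaps`) are hypothetical and not part of Grafana Agent or the config-reloader:

```python
import re
from datetime import datetime, timezone

# Matches key=value or key="quoted value with \" escapes" pairs (logfmt-style).
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

def parse_logfmt(line):
    """Return a dict of the key/value pairs found in one logfmt-style line."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Strip the surrounding quotes and unescape embedded quotes.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields

def parse_ts(ts):
    """Parse an RFC 3339 UTC timestamp, truncating nanoseconds to microseconds."""
    m = re.fullmatch(r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z', ts)
    if m is None:
        raise ValueError(f"unexpected timestamp format: {ts!r}")
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    micros = int((m.group(2) or "")[:6].ljust(6, "0"))
    return base.replace(microsecond=micros, tzinfo=timezone.utc)

def reload_failure_gaps(lines):
    """Seconds between consecutive failed reload attempts, in time order."""
    stamps = []
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("level") == "error" and "reload request failed" in fields.get("err", ""):
            stamps.append(parse_ts(fields["ts"]))
    stamps.sort()
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Feeding it the config-reloader container's log lines gives the spacing between failures. Pairs logged within the same millisecond (as the first two lines above are) are the same failed attempt reported by two callers.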

After force-deleting the pod, the error messages in the config-reloader container stop, the grafana-agent-logs pod's memory usage drops, and it starts pushing logs to Loki again.

Can you help me find the cause of this issue and a better fix? Force-deleting the pod is not a sustainable workaround.

Steps to reproduce

I have not been able to identify what triggers this bug.

System information

No response

Software version

grafana agent v0.39.0

Configuration

No response

Logs

level=error ts=2024-03-26T06:33:34.799302016Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded"
level=error ts=2024-03-26T06:33:34.799311483Z caller=reloader.go:384 msg="Failed to trigger reload. Retrying." err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded" 
level=error ts=2024-03-26T06:34:04.803619827Z caller=runutil.go:100 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://127.0.0.1:8080/-/reload\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"

divyn10, Mar 27 '24 06:03

I observed the same issue again. This time the Grafana Agent logs show:

ts=2024-04-02T06:49:12.155205805Z caller=tailer.go:207 level=info component=logs logs_config=namespace/kubernetes-pod-logs component=tailer msg="skipping update of position for a file which does not currently exist" path=/var/log/pods/0.log                                          
ts=2024-04-02T06:49:18.380674964Z caller=entrypoint.go:339 level=info msg="reload of config file requested"                         
ts=2024-04-02T06:49:18.988442727Z caller=tailer.go:207 level=info component=logs logs_config=namespace/kubernetes-pod-logs component=tailer msg="skipping update of position for a file which does not currently exist" path=/var/log/pods/6.log 
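
To see how often this message appears and for which files, the agent log can be tallied by path. A rough sketch, assuming the lines are logfmt-style as shown above and that the `msg` and `path` keys look exactly as in these samples. `count_skipped_positions` is a hypothetical helper, not part of Grafana Agent:

```python
import shlex
from collections import Counter

SKIP_MSG = "skipping update of position for a file which does not currently exist"

def count_skipped_positions(lines):
    """Count tailer 'skipping update of position' messages per file path."""
    counts = Counter()
    for line in lines:
        try:
            # shlex treats key="quoted value" as a single token: key=quoted value
            tokens = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes; skip lines we cannot tokenize
        # If a key repeats (e.g. component=...), the last occurrence wins.
        fields = dict(tok.split("=", 1) for tok in tokens if "=" in tok)
        if fields.get("msg") == SKIP_MSG:
            counts[fields.get("path", "<unknown>")] += 1
    return counts
```

Running it over a saved copy of the agent container's log and printing `counts.most_common(10)` would show whether a few paths dominate or whether the message is spread across many rotated or deleted pod log files.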

This appears to be the same problem as issue 3884.

I am running Grafana Agent v0.39.0.

divyn10, Apr 02 '24 07:04

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it. If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue. The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity. Thank you for your contributions!

github-actions[bot], May 10 '24 00:05

Any update on this? This was supposed to be fixed by #3885, but it still occurs in grafana/agent:v0.39.1.

Upanshu11, Jul 02 '24 07:07

Having the same issue after upgrading from helm chart V5 to V6 (after migrating from boltdb-shipper to a new tsdb store). No idea what the problem is.

sourcehawk, Sep 10 '24 13:09