CloudWatch agent keeps files open
My company started using the CloudWatch agent to "shadow" an application log and send it to CloudWatch Logs. This works fine until my application restarts (and thus has to FileClose/FileOpen the log file again). After the restart, any further attempt to write/append to the log file fails with Win Error IO 32 "File is opened by another application".
OS is Windows Server 2019. Application languages are AutoHotkey and Python.
We double-checked that cwagent is the application holding the log file open. Simply stopping/restarting the service makes logging work again.
Timeline:
1- Application creates/appends to log file
2- cwagent is started
3- Application keeps writing to log file ok
4- Application keeps writing to log file ok
5- Logs go to Cloudwatch logs just ok -- until here it works fine
6- Our application restarts
7- Any further attempt to write fails because file is currently opened/in use by cwagent -- fail
8- We stop/restart the cwagent
9- Application is once again able to do logging -- works fine until the next restart
Hi @bpaeduardo, thanks for reporting the issue. Based on my understanding of the CloudWatch agent's tailing process, with the default settings it keeps following and monitoring the log files for changes. It therefore keeps monitoring a file even after the application stops. I will discuss this issue with my teammates and provide a solution ASAP. In the meantime, could you answer these questions?
- Which CWAgent version are you using, and can you share a sample of its config?
- How can I reproduce the issue? A simple Python script that reproduces it would help, if that works for you.
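A minimal Python sketch of the application side of the timeline above might look like the following. It is not the original reporter's code: `run_session` is a hypothetical helper, and whether the second run actually fails depends on how the agent opens the file, which this thread does not confirm.

```python
def run_session(path, lines):
    """Simulate one application run: open the log once, append, close.

    Returns None on success, or the OSError raised while opening or
    writing. On Windows a sharing violation is expected to surface as
    an OSError whose `winerror` attribute is 32 (assumption based on
    the error number reported in this thread).
    """
    try:
        with open(path, "a", encoding="utf-8") as log:
            for line in lines:
                log.write(line + "\n")
                log.flush()  # make each line visible to a tailing process
        return None
    except OSError as exc:
        return exc
```

To try it, call `run_session` once on a file the agent is configured to tail, wait for the lines to show up in CloudWatch Logs, then call it again to simulate the restart and check whether an error is returned.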
Hey, I am also facing a similar issue.
I am using the CW agent to stream a single log file on Windows. The moment the agent opens the file for streaming, the app cannot write new logs to the file because it is 'locked'.
error: The following error occurred:Cannot open file: "filepath"
We have also experienced a file lock issue where the agent locks a log file that our application manages, which prevents the file from being deleted. Our application is able to write to the log file, but we cannot delete it. Working around this requires additional operational automation to manage the service.
Agent Version: 1.247355.0b252062
OS Version: Windows Server 2019
This is reproducing in my environment as well. We have it monitoring log files for a long-running backup script (running in python) that opens and writes the files for a very long period of time (up to 30-40 minutes).
What seems to be true is that the process holds the files open and does not close them, I've watched this across 24 hours+ of operations on the script, and it's following them but never unlocking them. Additionally, for each of these log files it seems the agent has a significant number of handles open (16 for a single log).
This behavior is expected with CWA: because the agent still has a file handle open on those log files, Windows won't allow another process to delete them. However, let me dig deeper into the issue before offering a workaround.
This is reproducing in my environment as well. We have it monitoring log files for a long-running backup script (running in python) that opens and writes the files for a very long period of time (up to 30-40 minutes).
What seems to be true is that the process holds the files open and does not close them, I've watched this across 24 hours+ of operations on the script, and it's following them but never unlocking them. Additionally, for each of these log files it seems the agent has a significant number of handles open (16 for a single log).
Correction - this is not reproducing because my application is not utilizing the same file and therefore is not encountering a lock on the file. Please disregard.
Hi, in my on-premises Windows environment I was able to reproduce this issue with a PowerShell script's log file. My PS script is run by Task Scheduler, does its job, and adds records to the log file (Add-Content). While the CWA process is running, my script cannot write to its log file, so my temporary workaround is to stop CWA before running my PS script and start it again afterwards.
CWA Version: 1.247357.0-24-g32d78d1-nightly-build
PSVersion: 5.1.22621.963
WindowsProductName WindowsVersion OsHardwareAbstractionLayerVersion
Windows Server 2019 Standard 1809
Change of plan for the workaround: I will try a different cmdlet in the PS script, using Out-File $LogFilename -Append instead of Add-Content, because Add-Content needs an exclusive lock on the file.
I think the issue here is that we, the end users, are having to work around a design decision in how the agent monitors for file changes. I think the agent should either NOT lock a file to read it, or use an event-based design (granted, this is more work) to monitor changes.
Another effect of this design is that you need to stop the agent to roll the log file, so any log management tooling outside of the CloudWatch agent will have issues, at least on Windows. Please correct me if I am wrong, but this is our experience.
Linux offers a variety of ways to monitor for a file change. https://www.infoq.com/articles/inotify-linux-file-system-event-monitoring/
Windows offers a file system watcher.
https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher?view=net-7.0
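To illustrate the first alternative (not holding the file open between reads), a tailer could reopen the file on every poll and remember only a byte offset. The sketch below is in Python and is not how the CloudWatch agent is implemented; `read_new_lines` and its offset bookkeeping are hypothetical names chosen for illustration.

```python
import os


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    The file is opened only for the duration of this call, so no handle
    is kept between polls.
    """
    try:
        with open(path, "rb") as f:
            f.seek(0, os.SEEK_END)
            if f.tell() < offset:
                offset = 0  # file was truncated or replaced: start over
            f.seek(offset)
            data = f.read()
    except FileNotFoundError:
        return [], 0  # file was deleted or rolled away: start over later

    # Consume only up to the last newline; a partially written line is
    # left for the next poll.
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = data[: end + 1]
    lines = complete.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(complete)
```

A caller would invoke this on a timer, or from a file system change notification, and carry the returned offset into the next call. The trade-off is an extra open/close per poll in exchange for leaving the file free for writers, rotation, and deletes the rest of the time.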