snapstate: Check the status of refresh-pending snaps
Every time snapd tries to refresh a snap and fails because the snap is running, it sends the user a message asking them to close it. With this change, snapd then monitors that snap to detect when it is closed and issues a new refresh at that moment.
The current code uses polling to detect when the process has ended, but it polls only while a refresh is pending. When no snap refresh is blocked by a running process, it does not poll; it blocks waiting for data on a Go channel.
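To make the "poll only while something is pending" idea concrete, here is a minimal sketch. It is not the code in this PR: `monitor`, `isRunning`, and the channel names are hypothetical, and the real check for running processes is replaced by a callback. It relies on the fact that a nil channel never fires in a `select`, so the loop simply blocks on `requests` while idle.

```go
package main

import (
	"fmt"
	"time"
)

// monitor is a hypothetical sketch, not snapd's implementation.
// It blocks on requests while nothing is pending and polls isRunning
// every interval only while at least one snap is being watched.
func monitor(requests <-chan string, closed chan<- string,
	isRunning func(string) bool, interval time.Duration) {
	pending := map[string]bool{}
	var ticker *time.Ticker
	var tick <-chan time.Time // nil while idle: a nil channel never fires
	for {
		select {
		case name, ok := <-requests:
			if !ok { // requests closed: shut down
				if ticker != nil {
					ticker.Stop()
				}
				close(closed)
				return
			}
			pending[name] = true
			if ticker == nil { // first pending snap: start polling
				ticker = time.NewTicker(interval)
				tick = ticker.C
			}
		case <-tick:
			for name := range pending {
				if !isRunning(name) {
					delete(pending, name)
					closed <- name // the caller would trigger the refresh here
				}
			}
			if len(pending) == 0 { // nothing left to watch: stop polling
				ticker.Stop()
				ticker, tick = nil, nil
			}
		}
	}
}

func main() {
	requests := make(chan string)
	closed := make(chan string)
	deadline := time.Now().Add(50 * time.Millisecond)
	// Stand-in for "does this snap still have processes?".
	stillRunning := func(string) bool { return time.Now().Before(deadline) }
	go monitor(requests, closed, stillRunning, 10*time.Millisecond)
	requests <- "example-snap"
	fmt.Println("closed:", <-closed)
}
```

The ticker is created on the first request and stopped once the pending set is empty, so an idle monitor costs nothing beyond a blocked goroutine.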
I tried to use "wait4" ("waitpid" does not seem to be available) to avoid polling, but since the snap's processes aren't children of snapd, the call returns immediately.
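The behaviour described above can be reproduced with a small sketch. It assumes a Unix-like system; `waitForeign`, `isNotChild`, and `parentPID` are hypothetical helper names. The parent process stands in for "a process that is not our child", and the kernel is expected to reject the wait with ECHILD instead of blocking.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// waitForeign calls wait4 on an arbitrary PID. For a PID that is not a
// child of the calling process, the call fails right away instead of
// blocking until that process exits.
func waitForeign(pid int) error {
	var status syscall.WaitStatus
	_, err := syscall.Wait4(pid, &status, 0, nil)
	return err
}

// isNotChild reports whether err is the "no child processes" error.
func isNotChild(err error) bool {
	return errors.Is(err, syscall.ECHILD)
}

// parentPID returns a PID that is certainly not one of our children.
func parentPID() int {
	return os.Getppid()
}

func main() {
	err := waitForeign(parentPID())
	fmt.Println("wait4 error:", err, "| not a child:", isNotChild(err))
}
```

This is why wait4 cannot replace polling here: it only works for processes that snapd itself started.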
I could use inotify to detect when the cgroup files disappear without polling, but that requires an external library because it isn't supported in "native Go"... is that OK?
Finally, to re-launch the refresh I'm currently shelling out to "snap refresh XXXXX". I preferred to do it that way because I'm still not very familiar with the code. Also, since it is called from a separate thread, I'm afraid of race conditions. Anyway, I'm looking into the "doSnapAction" call.
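For reference, shelling out to the CLI from Go could look roughly like the sketch below. `refreshCommand` and `refreshSnap` are hypothetical names and "example-snap" is a placeholder; this is only an illustration of the approach, not the code in this PR.

```go
package main

import (
	"fmt"
	"os/exec"
)

// refreshCommand builds, but does not run, `snap refresh <name>`.
func refreshCommand(name string) *exec.Cmd {
	return exec.Command("snap", "refresh", name)
}

// refreshSnap runs the command and returns its combined output so the
// caller can log it. It fails if the snap binary is missing or the
// refresh itself fails.
func refreshSnap(name string) (string, error) {
	out, err := refreshCommand(name).CombinedOutput()
	if err != nil {
		return string(out), fmt.Errorf("snap refresh %s: %w", name, err)
	}
	return string(out), nil
}

func main() {
	// Only show the command line here; calling refreshSnap would really
	// invoke the snap CLI.
	fmt.Println(refreshCommand("example-snap").Args)
}
```

Running a separate process sidesteps snapd's internal state locking, which is the race-condition concern, at the cost of going through the CLI instead of an internal call.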
doSnapAction isn't exported...
@pedronis Can you check this again?
@mvo5 I did the migration to a no-polling code using fsnotify (https://pkg.go.dev/gopkg.in/fsnotify/fsnotify.v1); but if you dislike that module (it's quite big), it is possible to just copy the two files of "inotify" (https://pkg.go.dev/k8s.io/utils/inotify) and have everything integrated.
Ok, I created https://github.com/snapcore/snapd/pull/12231 , which contains only the snap monitoring code, as requested. That should simplify the review process. When that MR is approved and merged, I will include the second part.
@jhenstridge @mvo5 I'm keeping a "full code" version here with the changes done in the "monitoring MR" (#12280) and the piece of code that makes use of it to trigger a refresh. The only important changes here are the ones in autorefresh.go (and, of course, autorefresh_test.go).
Once #12280 is merged, only these changes will remain to be added, so maybe it's a good idea to review them in parallel...
I'm keeping this branch so that I can build a snap and have it running on my system.
Closing this because all the work is being done in other branches.