Aptly DB deadlock when running in container as PID 1
Aptly stores a "worker PID" in its database to track running background processes and to decide whether a repo is still locked by one of them or is safe for another process to access. When running the api serve command in a container, this PID is 1. If the container restarts, the new Aptly instance sees PID 1 as the lock owner, checks whether a process with PID 1 is running (one is — both the old and the new Aptly are PID 1), and concludes the repo is still locked. But the background task running the update is long gone, since the container was restarted. Thus, the lock is held forever.
Context
Without this change, Aptly is unreliable as an API-only tool running in a container: any restart during a background operation can leave a repo locked indefinitely.
Possible Implementation
I think aptly should do better cleanup of this worker PID on SIGTERM. I traced this to https://github.com/aptly-dev/aptly/blob/master/api/mirror.go#L396, which looks like it does what we want; it just does not run on context cancellation, and context cancellation does not catch SIGTERM.
Your Environment
To reproduce:
- In a container:
  - aptly api serve
  - curl -X POST -H "Content-Type: application/json" --data '{"Name": "ubu", "Distribution": "jammy", "ArchiveURL": "http://us.archive.ubuntu.com/ubuntu"}' "localhost:8080/api/mirrors?_async=1"
  - curl -X PUT "localhost:8080/api/mirrors/ubu?_async=1"
- Restart the container while the PUT (mirror update) is running
- curl -X DELETE "localhost:8080/api/mirrors/ubu?_async=1"
- Observe the deadlock as the new instance waits for the old PID 1 to complete