systemd kills dma after forking to background and unit terminates
I am using systemd timers to schedule backup runs. Mails sent after backup never arrive, presumably because:
- backup software invokes `mail`
- `mail` invokes `dma` (as `sendmail`)
- `dma` `daemon()`izes after writing the queue file, effectively exiting `dma` from the view of `mail`
- `mail` and subsequently backup software exit
- systemd sees the backup process exit and then kills the cgroup
- `dma` is killed, the queue file stays in place and never gets delivered
I am not sure how to generically fix this. One option would be to have a systemd unit that gets triggered on a socket, running a dma -q. If this socket is active, then dma would terminate instead of trying to deliver itself. Of course this requires installing and activating a systemd unit, but I don't see how we could do it without.
Seeking comments.
1. Maybe set `KillMode` to `process` or `none` in the activated unit: https://manpages.debian.org/testing/systemd/systemd.kill.5.en.html
2. When `dma` daemonizes the delivery, have it support a cgroup change.
3. Add a configuration option for `dma`'s delivery daemon to ignore or hold off on `SIGTERM`.
4. Use `ExecStartPost=`, which doesn't say anything about killing anything.
5. Set `Type=` to `oneshot` or `exec` and/or use `ExitType=cgroup` to tell systemd it should wait for the cgroup to finish. There is an argument that `ExitType=cgroup` should probably have been made the default when systemd started to manage everything by cgroup.
6. It looks like the systemd way is that any script that calls `mail` should request non-forking behavior: https://wiki.archlinux.org/title/systemd/Timers#MAILTO. Does `dma` know that `mail` got `-Ssendwait`? If it does, it could just not daemonize on that option.
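For reference, the `KillMode` and `ExitType=cgroup` suggestions above could be sketched as a drop-in for the affected service. This is only an illustration: the unit name `backup.service` and the drop-in file name are assumptions, and only one of the two variants should be enabled at a time.

```ini
# /etc/systemd/system/backup.service.d/mail.conf
# (hypothetical drop-in; "backup.service" is an assumed unit name)

[Service]
# Variant A: when the unit stops, only the main process is signalled,
# so a forked dma delivery process is left alone.
KillMode=process

# Variant B (use instead of A; needs systemd >= 250): keep the unit
# active until every process in its cgroup, including dma, has exited.
#Type=exec
#ExitType=cgroup
```

A `systemctl daemon-reload` is needed after adding the drop-in. As far as I can tell, systemd refuses `ExitType=cgroup` together with `Type=oneshot`, which is why variant B uses `Type=exec`. Either way this has to be repeated for every unit that may send mail.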
For this specific backup program, setting KillMode would be enough. However, this doesn't automatically generalize, and you'd have to set KillMode for every service that might want to send mails and quit. Not a good user experience (POLA).
- Will this create a sub-cgroup or a separate one? I had assumed it would be a sub-cgroup and subject to the group kill.
- We can ignore `SIGTERM`, but I think eventually we'll get a `SIGKILL` from systemd.
- Seems `mail` needs `-Ssendwait` not to run into the same problem. However, any `sendmail` usually exits the main process when the mail is "safe", i.e. flushed in a queue file (or delivered off-site). That's what dma does. Keeping the process open until the mail is delivered negates having an MTA with a queue.
Unfortunately, none of the service-specific systemd modifications generalize (1. & 5.). I would like dma to work the way postfix would on a system, without having to edit all kinds of service files (and likely forgetting one).
I believe a process can be moved into any cgroup you have permission to access. If need be, a dma cgroup with group mail access could be created at the system level under /sys/fs/cgroup/dma/, similar to /var/spool/dma. I have not actually played with cgroups much, but this looks promising: for example, as of v2 a process can no longer be in more than one cgroup.
https://docs.kernel.org/admin-guide/cgroup-v2.html
On the -Ssendwait workaround: what is "safe" depends on the MTA design. I think the user expectation is that mail will be delivered. This has the cost of negating the benefits of an MTA queue, but that is a cost the systemd.timer team chose with their tightly coupled process/cgroup lifecycle model, or with the fact that they have conflated system services and timer events as things that both simply run. This is kind of acknowledged in that Arch Linux link about sending mail. Even if they had gone with the more sensible default of ExitType=cgroup, they still couldn't shut the unit down until the mail was actually delivered and the daemon had exited, so they are waiting either way.
This is all just talking through the design options. Socket activation might be the right move. I guess: create a FIFO in $SPOOLDIR, write to it, and see if dma.service responds before daemonizing? Or maybe just read from the socket and make sure you get a good version response.
I was thinking of using a dma.socket and activating the queue runner via systemd.
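A rough sketch of what such a pair of units might look like, to make the idea concrete. Both unit files are hypothetical, and the FIFO path, the ownership settings and the binary path are assumptions; only `dma -q` (process the queue) comes from the discussion above.

```ini
# --- dma.socket (hypothetical unit; the FIFO path is an assumption) ---
[Unit]
Description=Trigger for the dma queue runner

[Socket]
ListenFIFO=/var/spool/dma/flush.fifo
SocketGroup=mail
SocketMode=0660

[Install]
WantedBy=sockets.target

# --- dma.service (hypothetical unit, activated by dma.socket) ---
[Unit]
Description=dma queue runner

[Service]
Type=oneshot
# Binary path is an assumption; -q asks dma to process the queue.
ExecStart=/usr/sbin/dma -q
```

The queue runner then lives in the cgroup of `dma.service` rather than in the cgroup of whatever unit called `mail`, so it survives the caller being cleaned up. One caveat: as far as I know nothing in dma currently reads from such a FIFO, so the queue runner would have to drain it. Otherwise the trigger byte stays pending and systemd would keep re-activating the service until it hits the socket's trigger limit.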