Server is not restarted after crash
If the minecraft server process crashes, a manual restart of the container is required to get it back in working order. I would expect that either the minecraft server process is restarted automatically inside the container, or that the entire container goes down (so that it can be restarted by an orchestrator).
Hi, see https://github.com/binhex/arch-minecraftserver/pull/14 for a poc
Another possibility could be adding something to the screen that contains the minecraft process like:
for (( ; ; ))
do
java -Xmx7G -jar minecraft_server.jar nogui
echo "Server closed unexpectedly, restarting in 10 seconds..."
sleep 10
done
I've put in a very basic infinite while loop so the server process will now restart on crash. Please pull down the latest image and let me know how it goes.
Thanks for adding that @binhex
I've tested that by pulling the latest image and:
- launching a shell in the container and running kill ${pid of java process} (the server gracefully shut down in this case)
- running stop in the web ui
- running kill -9
each time the server automatically restarted so it seems like this patch works! Thanks.
I have just realised that this patch unintentionally makes #12 a little worse to deal with. Whenever I want to gracefully shut down, I log in to the web UI and run stop (or run it in-game), wait for the server to finish shutting down gracefully, and then stop the container.
Now however because it automatically restarts I'm worried there's no way to gracefully shut down as the server will auto-restart before you can stop the container 🤔
I'm not sure if there is a good solution to both problems at the same time. One approach is to run the save command before uncleanly exiting but it's not perfect.
Maybe something like:
- #14 to allow containers to be restarted if the process inside is not responding
- remove the loop to auto-restart the java process, after an alternative to auto-relaunching crashed server is supported
- when a container is stopped it should send a SIGTERM to the Java process. This happens by default when a container is stopped with docker stop: docker sends a SIGTERM to PID 1 in the container.
- In this repo the dockerfile specifies bash as the CMD so the bash process is PID 1, and it isn't forwarding the signal into the screen session
- I found some more info here
Whenever I want to gracefully shut down, I login to the web-ui and run stop, or run that in-game, wait for it to finish gracefully shutting down and then stop the container.
I shall put in additional code to trap SIGTERM and CTRL+C. This should then handle the case where you want to force a shutdown whilst still permitting the server to restart.
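For anyone following along, here is a rough sketch of how a restart loop can coexist with a trap for SIGTERM and CTRL+C. It is not the code in the image; the names run_with_restart, on_shutdown, shutdown_requested and child_pid are made up for illustration, and it assumes the server is started directly from the script rather than inside a screen session.

```shell
#!/usr/bin/env bash
# Sketch only: restart a command whenever it exits, unless SIGTERM or
# SIGINT (CTRL+C) was received. All names here are hypothetical.

shutdown_requested=0
child_pid=""

# Remember that a shutdown was requested and pass SIGTERM to the child, if any.
on_shutdown() {
    shutdown_requested=1
    if [ -n "$child_pid" ]; then
        kill -TERM "$child_pid" 2>/dev/null || true
    fi
}
trap on_shutdown TERM INT

# Usage: run_with_restart <delay-seconds> <command> [args...]
run_with_restart() {
    local delay="$1"
    shift
    while [ "$shutdown_requested" -eq 0 ]; do
        "$@" &                        # background, so the trap can run promptly
        child_pid=$!
        wait "$child_pid" || true     # returns early if a trapped signal arrives
        if [ "$shutdown_requested" -ne 0 ]; then
            # Give the child time to finish its own graceful shutdown.
            wait "$child_pid" 2>/dev/null || true
            break
        fi
        child_pid=""
        echo "Server closed unexpectedly, restarting in ${delay} seconds..."
        sleep "$delay"
    done
    echo "Shutdown requested, not restarting."
}
```

It could be called as run_with_restart 10 java -Xmx7G -jar minecraft_server.jar nogui. The command runs in the background with wait because the shell only runs a trap once the current foreground command has finished. One limitation of this sketch: a signal that arrives during the sleep is only acted on after the sleep ends.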
- In this repo the dockerfile specifies bash as the CMD so the bash process is PID 1, and it isn't forwarding the signal into the screen session
This is a different problem and has already been addressed: I am making use of dumb-init to pass signals along. I have also written a script to wait for the process to end, which I will include to try to ensure the process is not sent a SIGKILL; that should fix https://github.com/binhex/arch-minecraftserver/issues/12. It is a little tricky as there are a lot of wheels within wheels going on here.
Thanks, sounds good. I'm happy to test the changes when you need.
The changes are in, please pull down the latest image.
@binhex I've pulled the latest image and tried to trigger a graceful shutdown, sadly I wasn't able to. Testing on Unraid.
When I stop the container, the webUI immediately loses connection and the server stops immediately. I've checked the latest.log and the screen.log files and neither have any indication of the server gracefully stopping. Additionally, when logged into the server as a player, I get the error message "Disconnected - Connection lost" instead of "Disconnected - Server closed".
If I shell into the container and run kill on the PID of the java process, I do get the expected "Server closed" message and the latest.log file shows the expected server shutdown log lines.
Perhaps instead of running the java process inside a screen session, the infinite loop and the java command can run in a blocking fashion at the end of the start.sh script? I think because the current signal catching logic is inside the screen session it may not be receiving any signal, or the pid1 process is exiting immediately and not waiting for the screen to exit. It might be simpler if the subprocess isn't inside a screen, though that might break the way the webui works.
Maybe also the root of the start.sh script could catch the signals and explicitly pkill java and wait for it.
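A minimal sketch of the "signal it and wait for it" part, in case it helps. The function name stop_and_wait and the 60 second default are my own assumptions, not anything from start.sh. Because the java process lives inside the screen session it is not a child of start.sh, so the sketch polls with kill -0 instead of using wait.

```shell
#!/usr/bin/env bash
# Sketch only: send SIGTERM to a PID, then poll until it has exited or a
# timeout elapses. stop_and_wait and its default timeout are hypothetical.

# Usage: stop_and_wait <pid> [timeout-seconds]
# Returns 0 once the process is gone, 1 if it is still running at the timeout.
stop_and_wait() {
    local pid="$1"
    local timeout="${2:-60}"
    local elapsed=0

    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A trap at the root of start.sh could look up the java PID (for example with pgrep, assuming it is installed in the image) and call stop_and_wait on it before the script exits. The timeout would need to stay below the grace period docker stop allows before it sends SIGKILL.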
I recently saw this article which has some interesting information: https://sirikon.me/posts/0009-pid-1-bash-script-docker-container.html
Potentially more powerful version of dumb-init which supports multiple children: https://github.com/linkdd/procfusion