Deal with issues when the disk is full
Check: https://github.com/bluerobotics/BlueOS/issues/2327, https://github.com/bluerobotics/BlueOS/issues/2323, https://github.com/bluerobotics/BlueOS/issues/2326, https://github.com/bluerobotics/BlueOS/issues/1015
Docker is able to start, but everything after that results in unstable behavior. Some of the points you suggested are already tracked as issues; others are relevant for recovering the system.
- [x] We should delete old logs during rotation when we notice that the disk is almost full.
- [ ] We should stop logging if the disk space is almost full.
  - This conflicts with the rotation configuration in loguru
- [ ] We should clean up old dockers that are not being used.
- [ ] We should clean up old docker artifacts that are not being used.
- [ ] We should allow user to delete all unused docker images.
- [ ] We should warn the user through Cockpit that the companion computer's disk is almost full.
- [ ] We should erase older tlog or bin files if the disk is almost full.
  - #1257
- [ ] We should warn the user through the BlueOS header that the disk is almost full and in a critical state.
- [ ] We may not allow the user to arm the vehicle if the disk is almost full.
- [ ] We may do some of these steps automatically to try to recover the system once it starts.
- [ ] We may need a page like filelight on BlueOS to help identify the root of such problems.
- [x] We should limit journald max size
Originally posted by @patrickelectric in https://github.com/bluerobotics/BlueOS/issues/2325#issuecomment-1902117968
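The first two checklist items (delete old logs, stop logging when almost full) both depend on the same check. A minimal sketch, assuming logs live as plain files in one directory; `disk_used_fraction`, `delete_oldest_logs`, the `*.log` pattern and the 90% threshold in the usage note are illustrative names and values, not existing BlueOS code:

```python
import shutil
from pathlib import Path
from typing import Callable, List


def disk_used_fraction(path: str = "/") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def delete_oldest_logs(
    log_dir: Path,
    is_almost_full: Callable[[], bool],
    pattern: str = "*.log",  # assumption: log files share one suffix
    keep_at_least: int = 1,  # never delete the newest file(s)
) -> List[Path]:
    """Delete the oldest matching files until `is_almost_full()` is False."""
    files = sorted(
        (p for p in log_dir.glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,  # oldest first
    )
    removed: List[Path] = []
    while len(files) > keep_at_least and is_almost_full():
        oldest = files.pop(0)
        oldest.unlink()
        removed.append(oldest)
    return removed
```

It could be called from the rotation hook as `delete_oldest_logs(log_dir, lambda: disk_used_fraction("/") > 0.9)`. The same predicate could gate whether logging continues at all, although how that interacts with loguru's own rotation settings is the open question noted in the list.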
I know it might be trickier to manage the installation, but another valid strategy is to put /var in another partition.
#2359
I installed one extension (Nortek Nucleus) that grew the docker log to 18GB in about a week. Maybe kraken could add a limit on the size of the docker logs:
https://docs.docker.com/config/containers/logging/configure/#configure-the-default-logging-driver
--log-opt max-size=100m
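If kraken creates containers through the Docker Engine API, the equivalent of `--log-opt max-size=100m` is the `HostConfig.LogConfig` field of the container-create payload. A rough sketch, assuming kraken holds that payload as a plain dict; `with_log_limit` and the `max-file` default of 3 are made up for illustration:

```python
import copy
from typing import Any, Dict


def with_log_limit(
    container_config: Dict[str, Any],
    max_size: str = "100m",
    max_file: int = 3,  # assumption: keep a few rotated files
) -> Dict[str, Any]:
    """Return a copy of a container-create payload with a json-file log cap.

    An existing LogConfig (e.g. one set by the extension itself) is left alone.
    """
    config = copy.deepcopy(container_config)
    host_config = config.setdefault("HostConfig", {})
    host_config.setdefault(
        "LogConfig",
        {
            "Type": "json-file",
            # Docker expects log options as strings.
            "Config": {"max-size": max_size, "max-file": str(max_file)},
        },
    )
    return config
```

Using `setdefault` lets an extension that declares its own logging settings keep them; overwriting instead would enforce the cap on every extension.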
I've just freed ~12 gb here by doing:
sudo docker system prune -a # ~9 gb
sudo journalctl --vacuum-time=2d # ~1 gb
sudo apt-get clean # ~1 gb
The docker prune is especially important, as there are A LOT of leftover overlays hanging there forever. It would be good to do it automatically, or at least put a button on BlueOS to do that.
@joaoantoniocardoso how did we end up with 1GB of unnecessary stuff in our apt?
Maybe I've installed many things on mine, but it'd be good to check how it is in a fresh install.
We are discussing this subject a bit in our project, as we run robots 24/7, and they could be running for several days, maybe even weeks without restarting/power cycling.
I've not done very thorough digging, but I think it would be very nice to have some kind of parameter (maybe even user facing) that would let you set a target age for tlog files. If I set 7 days, then any tlog files older than 7 days would be auto purged (not sure how often that should run). Maybe a bit out of scope for this issue, but it should help nonetheless.
I can make a separate issue if that is better.
EDIT:
Also, before this can even happen, we would need mavlink-router to somehow auto split/rotate files, maybe every 200 MB or 12 hours.
Hi @goasChris, thanks for your input. Indeed, the tlogs are also important for us to track, and they are already in the list. Let us know if you have anything that is not being tracked at the moment.
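The age-based purge could look roughly like this. It is only a sketch: `purge_old_tlogs` and its parameters are hypothetical, and it uses each file's modification time as its age, which is why the split/rotation above matters (a single file that is still being written never becomes old):

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_tlogs(
    log_dir: Path,
    max_age_days: float,
    now: Optional[float] = None,  # injectable clock, mainly for testing
) -> List[Path]:
    """Delete *.tlog files under `log_dir` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed: List[Path] = []
    for path in sorted(log_dir.rglob("*.tlog")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

With a target age of 7 days this would be `purge_old_tlogs(tlog_dir, 7)`, run from whatever periodic task fits (hourly, or once at boot).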
About tlogs: https://github.com/mavlink-router/mavlink-router/issues/426
Some things that take a lot of space and can be removed:
- Use `docker system prune -a` to delete unnecessary overlays in `/var/lib/docker/overlay2`. However, we need to somehow ensure that the factory image is tagged, because right now it is not, and it got deleted along with the other overlays.
- Limit or remove .tlog files, as they can accumulate and take up significant space.
- Remove all unused BlueOS version images, except for the factory image and the currently running image.
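For the last point, a more selective alternative to `docker system prune -a` is to list the images and skip the protected ones. A minimal sketch, assuming image entries shaped like the Docker Engine API image list (`Id`, `RepoTags`); `images_to_remove` and the tag names in the usage note are illustrative only:

```python
from typing import Any, Dict, Iterable, List, Set


def images_to_remove(
    images: Iterable[Dict[str, Any]],
    protected_tags: Set[str],
) -> List[str]:
    """Return the ids of images that carry none of the protected tags.

    Untagged images (RepoTags empty or None) are always returned, so a
    factory image is only safe here if it is actually tagged.
    """
    removable: List[str] = []
    for image in images:
        tags = set(image.get("RepoTags") or [])
        if tags & protected_tags:
            continue  # factory or currently running image: keep it
        removable.append(image["Id"])
    return removable
```

For example, with `protected_tags={"bluerobotics/blueos-core:factory", current_tag}`, everything else, including untagged leftovers, would be returned for deletion.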