[RFE] better diagnostic suggestions on error from flatcar-update
Current situation
At present, when flatcar-update fails, it just says:
Error: update failed
Impact
Users have to work out for themselves what caused the failure and how to deal with it.
When that isn't clear, they may open a bug ticket like #758, which adds support load for the Flatcar team.
Ideal future situation
Ideally, flatcar-update would be more robust and deal effectively with the bumps in the road it might encounter.
Failing that (and as a shorter-term fix), it would be helpful to direct users towards the logs that might explain their situation, e.g.:
Error: update failed
Try:
journalctl -u update-engine
to identify the source of the error
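A minimal sketch of how the script could emit that hint, assuming it is a shell script with a single failure path. The function name `print_update_failure` is hypothetical and not taken from the existing code:

```shell
# Hypothetical helper (name is an assumption, not from flatcar-update itself).
# Prints the existing failure message followed by a pointer to the logs.
# Everything goes to stderr so it does not mix with regular output.
print_update_failure() {
  cat >&2 <<'EOF'
Error: update failed
Try:
journalctl -u update-engine
to identify the source of the error
EOF
}
```

It would be called wherever the script currently prints "Error: update failed", just before exiting with a non-zero status.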
Implementation options
- Better error handling and recovery
- Guidance on where to look for causes of problems
Thanks, these are welcome suggestions.
The Try: output could also recommend setting the key argument. A small check can tell which of the dev or production key arguments is likely needed: if /usr/share/update_engine/update-payload-key.key.pem exists, the image is a dev image where the dev key is used, so the hint could suggest the prod key argument; if it does not exist, the hint could suggest the dev key argument.
I think it's also possible to grep the journal output automatically, if we can make assumptions about the output format of the update-engine logs.
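The key check could look roughly like this. It is only a sketch: the function name `suggest_key` and the `DEV_KEY_MARKER` variable are hypothetical, and the function prints just "prod" or "dev" rather than a concrete flag name, since the exact argument to recommend is left to the implementation:

```shell
# Path from the discussion above; it exists on dev images, where the dev key is used.
# DEV_KEY_MARKER is a hypothetical variable, overridable to make testing easier.
DEV_KEY_MARKER="${DEV_KEY_MARKER:-/usr/share/update_engine/update-payload-key.key.pem}"

# Hypothetical helper: echo which key argument the hint should suggest.
suggest_key() {
  if [ -e "$DEV_KEY_MARKER" ]; then
    # Dev image: the dev key is already in use, so suggest the prod key.
    echo "prod"
  else
    # Production image: suggest the dev key instead.
    echo "dev"
  fi
}
```

The Try: text could then include a line built from the result, for example "consider setting the $(suggest_key) key argument".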
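As a rough sketch of the automatic grep: the pattern below is an assumption, because the update-engine log format is not specified here, and both function names are hypothetical. The filter reads from stdin so it can be tried on any text:

```shell
# Hypothetical filter: keep only lines that look like errors.
# The pattern is a guess at what update-engine might log, not a known format.
filter_update_errors() {
  grep -iE 'error|fail' || true   # "|| true": finding no match is not a failure
}

# Hypothetical wrapper: show suspicious lines from the last 200 journal entries
# of the update-engine unit.
show_update_errors() {
  journalctl -u update-engine --no-pager -n 200 | filter_update_errors
}
```

If the filter prints nothing, the hint could fall back to telling the user to read the full `journalctl -u update-engine` output.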
@pothos Where does flatcar-update live? Also, is there a doc about how it compares with update_engine_client?
It's here: https://github.com/flatcar-linux/init/tree/flatcar-master/bin
The tool is mentioned in https://www.flatcar.org/docs/latest/setup/releases/update-strategies/ and https://www.flatcar.org/docs/latest/setup/debug/manual-rollbacks/