Sensible default ignores
Have you checked borgbackup docs, FAQ, and open Github issues?
Yes
Is this a BUG / ISSUE report or a QUESTION?
REQUEST
Describe the problem you're observing.
Can you reproduce the problem?
When I run borg on a new machine, I first have to copy over a set of ignores so the backup does not become a whole mess (it once pulled some infinite data stream from some /dev-descriptor into the repo...)
Suggestion
A set of default ignores that can easily be overridden and that cover common system directories that should not be backed up, such as /dev and /tmp.
I don't think this should be implemented due to these reasons:
- borg's behaviour should not be magic (it should not ignore stuff or do stuff that is not obvious)
- what should be ignored and what not depends on the OS, potentially even on the specific system and on the use case (e.g. i like to bind mount / to /rootfs and then I backup this, including /rootfs/dev/)
- as seen in the previous item, the exclude paths might depend on the mountpoint and that is not necessarily /
- if we would build in a default exclude list, there would be also a need to make borg not use it. if you have magic, you also need anti-magic. in the end, it is easier to not have magic.
Well, maybe this could be opt-in with a flag such as --ignore-system-files or whatever, which simply uses sensible heuristics.
Hmm, a slightly different idea: how about if you just use --one-file-system (or short -x) for backing up /?
You'll need a borg create call for each filesystem then, but if that is just e.g. / and /home, that separation makes sense anyway.
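For illustration, a rough sketch of what "one borg create call per filesystem" could look like in a script. The repository path and the archive name prefixes are made up, and the function only prints the command lines instead of running them, so nothing gets backed up by accident:

```shell
# Hypothetical repository location; adjust to your setup.
REPO="${REPO:-/path/to/repo}"

# Print one "borg create" command line for a single filesystem root.
# --one-file-system keeps borg from crossing into other mounted filesystems.
borg_create_cmd() {
    # $1 = archive name prefix (made-up naming scheme), $2 = path to back up
    printf 'borg create --one-file-system %s::%s-{now} %s\n' "$REPO" "$1" "$2"
}

# One call per filesystem, as described above.
borg_create_cmd root /
borg_create_cmd home /home
```

Whether /tmp or /run end up excluded this way depends on them being separate mounts on the machine in question.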
I guess that makes sense, since it will also exclude /tmp, /run and the like then? I think I was not aware of the flag. Maybe warrants a FAQ entry.
There are still standards like XDG_CACHE_HOME, though. Maybe that could be automatically excluded with --exclude-caches, too?
Well, --one-file-system is on the borg create help page / man page / builtin help.
--exclude-caches is documented to only exclude dirs containing a CACHEDIR.TAG file, so that would change the behaviour. while we could do that in a breaking release (like the upcoming borg2), I guess it might be better to just add another option like --exclude-xdg-cache-home?
OTOH, if that path is in that env var, you could just add $XDG_CACHE_HOME to your exclude list in your borg script and we could save adding that option and keep the docs shorter.
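A minimal sketch of that script-side approach, assuming the usual XDG convention that an unset or empty $XDG_CACHE_HOME means $HOME/.cache. The helper name cache_dir and the REPO::ARCHIVE placeholder are made up, and the borg command is only printed, not executed:

```shell
# Resolve the cache directory: use $XDG_CACHE_HOME when it is set and
# non-empty, otherwise fall back to the XDG default of $HOME/.cache.
cache_dir() {
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}"
}

# Print (not run) a borg create call that excludes the resolved directory.
# REPO::ARCHIVE is a placeholder, not a real repository.
printf 'borg create --exclude %s %s %s\n' "$(cache_dir)" 'REPO::ARCHIVE' "$HOME"
```

The fallback matters on systems where the variable is not exported at all; without it the exclude argument would be an empty string.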
Just to add a data point from using borg to back up my home directory on Linux Mint 21 (based on Ubuntu 22.04):
- ~/.cache contains roughly 500 folders/subfolders with a total of 1.6 GB of data.
- Inside ~/.cache: borg and fontconfig are the only folders that use a CACHEDIR.TAG (covering a total of 20 MB)
- On my system, most of the CACHEDIR.TAG files (several hundred) are outside the home directory (e.g. in /timeshift/*)
On this system, $XDG_CACHE_HOME is not set. :-( (Debian based distribution)
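In case someone wants to collect similar numbers, here is a rough sketch of how tagged directories can be listed. It relies on the Cache Directory Tagging convention that a valid CACHEDIR.TAG starts with a fixed signature line; the function name is made up and this is not how borg itself implements the check:

```shell
# Signature that a valid CACHEDIR.TAG file must start with
# (from the Cache Directory Tagging convention).
CACHEDIR_SIG='Signature: 8a477f597d28d172789f06886806bc55'

# Print every directory below $1 that contains a valid CACHEDIR.TAG.
tagged_cache_dirs() {
    find "$1" -type f -name CACHEDIR.TAG 2>/dev/null | while IFS= read -r tag; do
        # Compare only the first 43 bytes, which is the signature length.
        if [ "$(head -c 43 "$tag")" = "$CACHEDIR_SIG" ]; then
            dirname "$tag"
        fi
    done
}
```

Calling `tagged_cache_dirs "$HOME/.cache"` should then list only the directories that --exclude-caches would skip, which makes the gap to "everything under ~/.cache" easy to see.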
I believe the option --exclude-caches is clearly documented, but the name is unfortunate, since it can easily be misunderstood.
Perhaps --exclude-tagged-caches would be a more appropriate name for this option?
By the way, there is also the experimental --no-cache-sync, which adds to the confusion. Is this only about the borg cache, i.e. ~/.cache/borg?
Most users who back up their home directory and use --exclude-caches are likely to want a variant of --exclude /home/*/.cache, too.
However, this is easy to miss. Perhaps there could be an explicit hint in the documentation?
Just thinking out loud: could --exclude-caches be extended by keywords?
Something like --exclude-caches TYPE, where TYPE could be "borgcaches", "taggedcaches" or "allcaches", each superseding the previous one. "allcaches" could include */.cache and $XDG_CACHE_HOME. If no TYPE is provided, borg could fall back to the current behaviour.
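To make the idea a bit more concrete, here is a purely hypothetical sketch of how the keywords could nest. borg has no such TYPE argument today; the function name and the category labels it prints are invented for illustration only:

```shell
# HYPOTHETICAL: maps a proposed --exclude-caches TYPE keyword to the set of
# cache categories it would cover. None of this exists in borg.
cache_excludes_for() {
    # No TYPE given: fall back to the current behaviour (tagged dirs only).
    case "${1:-taggedcaches}" in
        borgcaches)   echo "borg-cache" ;;
        taggedcaches) echo "borg-cache tagged-dirs" ;;
        allcaches)    echo "borg-cache tagged-dirs dot-cache-dirs xdg-cache-home" ;;
        *)            echo "unknown TYPE: $1" >&2; return 1 ;;
    esac
}
```

Each keyword covers everything the previous one does, so "allcaches" would be the only one that touches untagged directories such as */.cache.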