fix(commands): return error if check fails
closes #222
Codecov Report
Attention: Patch coverage is 54.49735% with 86 lines in your changes missing coverage. Please review.
Project coverage is 43.4%. Comparing base (
1e7b4a8) to head (5b98b3b). Report is 82 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/core/src/commands/check.rs | 54.9% | 82 Missing :warning: |
| crates/core/src/repository.rs | 42.8% | 4 Missing :warning: |
Additional details and impacted files
| Files with missing lines | Coverage Ξ | |
|---|---|---|
| crates/core/src/error.rs | 14.2% <ΓΈ> (ΓΈ) |
|
| crates/core/src/repository.rs | 34.7% <42.8%> (+0.2%) |
:arrow_up: |
| crates/core/src/commands/check.rs | 55.8% <54.9%> (-7.8%) |
:arrow_down: |
I'll look through it tomorrow. ππ½
After looking through it, I understand better now, what I found confusing about it, back in May. It's really because it introduces a special way to handle errors only for check (via self), although we want to have a unified way of handling errors. And apply that throughout rustic_core, which makes it easier to maintain, than having a special way to handle check errors.
The CheckError in errors.rs is really nice though, and this is probably the approach, we should take all over the code base. Then we might construct them in-place with all the information we need and create a well-formed error message towards the user from it.
For collecting the errors in a method/function we could use lazy_errors or change the signature to something like either:
-
Result<T, Vec<RusticError>> -
struct ResultWithErrors<T> { value: Option<T>, soft_errors: Vec<RusticError>, hard_errors: Vec<RusticError>, } - or return a tuple of
(T, Vec<RusticError>)from a function, in case we want to handle multiple errors - we could also go for making
RusticErrora trait and implement common behaviour on structs for each possible error struct
But this topic is definitely something we should talk about to find a good approach.
In principle I like the ides of lazy_errors - I however think that we want to collect not only errors, but also warnings - or, more general, a kind of ErrorLevel together with the error.
Also, we have two cases:
- situations where we want to collect errors, but continue, i.e. we want to return a result and a error list
- situation where we want to abort on errors, i.e. we want to return only the error list and indicate that things went wrong...
I sketched out an idea, to fix the pain points you mentioned in a more generic way, than introducing special error handling/collection for check only. Could you take a look at https://github.com/rustic-rs/rustic_core/pull/318 ?
I think the main question here is: Should we merge this PR with maybe just small modifications (quick&dirty) to fix the check issues and then overwork the error handling once we have a general error handling framework? Or should we wait for the framework? I have a tendency towards a quick&dirty merge here, actually...
I think the main question here is: Should we merge this PR with maybe just small modifications (quick&dirty) to fix the check issues and then overwork the error handling once we have a general error handling framework? Or should we wait for the framework? I have a tendency towards a quick&dirty merge here, actually...
I would wait, I don't think (no offence really, we are doing a lot here) this should be merged as it is. We are introducing new errors here (CheckError), etc. We should just focus on this error handling topic, after the next rustic release and have it fixed long-term. This would just move it further away, while we feel we 'fixed' something, but we didn't really. We just moved a well-formed fix further away.
Also, the design implies, we implement checking on some ResultCollector, which I find counterintuitive due to the refactor in #317. That should probably still be freestanding functions, so they are decoupled and easier to test.
Let's rather do it right one time with a bit more effort now, than doing it more than once, just to be quick and dirty. π
- situations where we want to collect errors, but continue, i.e. we want to return a result and a error list
For this situation, can you give an example, because this is worth to explore why it is like that. There could be many reasons for it, e.g. architectural problems, prolonged error handling, instead of doing it locally, etc.
When it comes to collecting errors in a thread, we can go for a channel and handle it in the main thread.
- situations where we want to collect errors, but continue, i.e. we want to return a result and a error list
For this situation, can you give an example, because this is worth to explore why it is like that. There could be many reasons for it, e.g. architectural problems, prolonged error handling, instead of doing it locally, etc.
An example is Repository::get_snapshots. This loads all Snapshotfiles from the repository and decrypts them into a `Vec<Snapshotfile>'.
There are errors like not being able to list snapshots which will cause this command to fail. But each individual snapshot may fail either to load from backend or to decrypt (e.g. data corruption). Currently, this functions fails if any snapshot fails to load+decrypt. This means if we have a corrupt snapshot file, the repository can be basically no longer used until this is fixed.
We however may want to introduce resiliency: It is e.g. very nasty if the whole repository is not accessible just because that old and now unimportant snapshotfile has an issue. If we introduce this, then the command would return the successful loaded snapshots and a list of failed snapshots with the error(s) - and we would at least need to indicate that something went wrong by not sending a 0 return code in some cases...
I would imagine it like this somehow (draft):
So we have a Vec<RusticError> that we return from SnapshotFile::fill_missing:
, and we reserve the Err(E) of RusticResult for hard errors. But I think this is not nice. π¬
But in the end, it's the question of why we want to return a list of the non-working snapshots and not just log an error there. What value has the information for the caller, that there are broken snapshots it cannot be operated on? We just don't want it to fail, right? If we return that Vec<RusticError> to the caller, what else will it do, than just logging it/printing to the user?
If we would want to handle the broken snapshots as well, we might rather return something like:
RusticResult<Vec<RusticResult<SnapshotFile>>> with outer Result being the operation itself that is fallible and indicates something went horribly wrong. And the inner Results being the Snapshots itself, that the caller wants to handle the Ok(T) values of and do something with the errors.