JSON parsing with accumulative errors
When parsing an incorrect JSON with Aeson, you only get the first error instead of all the errors like so:
Data.Aeson> eitherDecode "[1,19,\"a\", 20]" :: Either String [Text]
Left "Error in $[0]: expected Text, encountered Number"
It would be better to have another function to return all the errors instead of just the first one, for example:
Data.Aeson> eitherDecode "[1,19,\"a\", 20]" :: Either [(String, String)] [Text]
Left [("$[0]", "Expected Text, encountered Number"), ("$[1]", "Expected Text, encountered Number"), ("$[3]", "Expected Text, encountered Number")]
This kind of feature would be very useful in web development where errors in request payload are all checked at once to save network round trip.
I'm thinking maybe this library can be used instead of Either https://hackage.haskell.org/package/Validation-0.2.0/docs/Data-Validation.html
Validation isn't a Monad. How would you write e.g. tagged sum encoding parser?
In general, I think both "fail fast" and "fail with lot of info" modes are useful, but complicated to combine. For untrusted input, one should fail reasonably fast, so the service isn't easily DDOSed.
I agree both modes have valid use cases. What I meant to say is to add another function to "fail with a lot of info." :)
-
To keep the current monadic parser, we might instead introduce a new combinator.
-- | Run both parsers independently. -- Combine the results if both succeed. -- Otherwise fail by accumulating errors from both branches. accLift2 :: (a -> b -> c) -> Parser a -> Parser b -> Parser c -
If that works, and the trade-off is worth it, use it throughout aeson (
[],Map, etc., and generics/TH).
It is still possible to fail quite fast: if the first parser fails, we can return the (potential) errors of the second lazily.
This would require a change in the internal IResult type to report multiple errors. I'm not sure how much of a breaking change that would be considered, but for additional backwards compatibility, we can export the original IError :: JSONPath -> String -> Result as a pattern synonym that views only the first error.
This is something I'd like to have in aeson as well.
What @Lysxia describes sounds promising!
In general, I think both "fail fast" and "fail with lot of info" modes are useful
Agreed. When it's important to fail fast in production it may still be useful to accumulate errors in development environments.
circe for scala does error accumulation but it has a few kinks that make it tricky to know if errors will accumulate properly. I think it stems from the fact that they have a separate flow for accumulation, without the fail-fast behavior @Lysxia describes.
This would require a change in the internal IResult type to report multiple errors. I'm not sure how much of a breaking change that would be considered
This sounds fine to me.
Overall, if we can keep the monadic parser it means clients only have to update custom instances, and only if they want accumulation.
PS.
Validation isn't a Monad. How would you write e.g. tagged sum encoding parser?
This is only because <*> /= ap. You can still implement everything needed for "monadic" parsers using Validation, it's just that a Monad instance should be avoided.
Work on this is progressing on the acc-errors branch and #597
Databrary will try to contribute in small ways to the PRs for this issue if we get time. We are hoping to make use of this. I will update this comment when we are actively trying to help.
#683 is also related, to some extent?