Typed JSON API
Context
Currently, most loader functions use the JSONOutput return type. This type is nonspecific and hard to reason about downstream: I find myself having to cast or validate the result almost every time I call a loader.
The current loaders return JSONOutput because any valid JSON value can appear at the top level of a file. For example, this file is valid JSON and would be parsed without error by ujson:
test_file.json
"hello"
However, if we receive this file as input in most of our code paths, we would want to raise an error, since a bare string is almost certainly invalid for whatever we do next.
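As a rough illustration of the problem, here is a minimal sketch that uses the standard-library json module as a stand-in for ujson. The JSONOutput alias and the load helper are assumptions made for this example, not the project's actual definitions.

```python
import json
from typing import Any, Dict, List, Union

# Hypothetical stand-in for the project's JSONOutput type.
JSONOutput = Union[Dict[str, Any], List[Any], str, int, float, bool, None]


def load(text: str) -> JSONOutput:
    """Parse JSON text; any valid top-level JSON value is accepted."""
    return json.loads(text)


result = load('"hello"')
# Parsing succeeds, but the result is a str, not the dict most callers expect.
is_dict = isinstance(result, dict)
```

Every caller that needs a dict has to repeat an isinstance check like the one above, which is the boilerplate this PR aims to remove.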
This PR adds validation on top of the existing JSON API to ensure you get the expected type from a loader function.
If we like this approach, I can extend it to the rest of the API. For now, this PR only covers the most commonly used functions from the JSON API.
Summary of Changes
Add read_json_dict - read JSON file and validate the resulting object is a dict
Add read_json_list - read JSON file and validate the resulting object is a list of dicts
Add read_jsonl_dicts - read JSONL file and return a generator that validates each line parses to a dict
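The three functions above could look roughly like the following. This is a sketch under stated assumptions, not the PR's actual code: it uses the standard-library json module instead of ujson, raises ValueError on a type mismatch, and skips blank lines in JSONL files. The real signatures and error types may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List, Union

PathLike = Union[str, Path]


def read_json_dict(path: PathLike) -> Dict[str, Any]:
    """Read a JSON file and check that the top-level value is a dict."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        # Assumption: ValueError is used here; the PR may raise something else.
        raise ValueError(f"{path}: expected a dict, got {type(data).__name__}")
    return data


def read_json_list(path: PathLike) -> List[Dict[str, Any]]:
    """Read a JSON file and check that the top-level value is a list of dicts."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{path}: expected a list, got {type(data).__name__}")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(
                f"{path}: item {index} is {type(item).__name__}, expected a dict"
            )
    return data


def read_jsonl_dicts(path: PathLike) -> Iterator[Dict[str, Any]]:
    """Lazily yield one dict per line of a JSONL file, validating each line."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # Assumption: blank lines are skipped, not rejected.
            item = json.loads(line)
            if not isinstance(item, dict):
                raise ValueError(
                    f"{path}:{line_no}: expected a dict, got {type(item).__name__}"
                )
            yield item
```

Because read_jsonl_dicts is a generator, a bad line only raises once iteration reaches it, so callers that need up-front validation would have to consume the generator first (for example with list(...)).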