Add `std.json.ParseOptions.parse_numbers` to preserve float precision
Currently, parsing JSON with the standard library's functions loses precision when it encounters floating-point values. This pull request addresses that by letting the user of the library specify whether they want these values parsed for them or not.
This is especially useful to those parsing JSON that includes financial numbers; these values often must be stored directly as fixed point values.
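To illustrate the loss being described, here is a minimal sketch that uses only the existing `std.json.parseFromSlice` and `std.json.Value` API. The field name `amount` and the input value are made up for the example; the value is chosen because 2^53 + 1 has no exact `f64` representation.

```zig
const std = @import("std");

test "default parsing stores JSON floats as f64" {
    // 9007199254740993 is 2^53 + 1, which f64 cannot represent exactly.
    const input = "{\"amount\": 9007199254740993.0}";
    const parsed = try std.json.parseFromSlice(
        std.json.Value,
        std.testing.allocator,
        input,
        .{},
    );
    defer parsed.deinit();

    const amount = parsed.value.object.get("amount").?;
    // The decimal point makes std.json classify the token as a float ...
    try std.testing.expect(amount == .float);
    // ... and the nearest f64 is 2^53, so the final digit 3 has become a 2.
    try std.testing.expectEqual(@as(f64, 9007199254740992.0), amount.float);
}
```

Once the value is an `f64`, the original decimal text can no longer be recovered, which is the problem for fixed-point storage.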
Thanks for taking the time to contribute the enhancement that you need! I just have some questions to make sure this direction is the best option for your use case. I always want to follow real use cases, so I appreciate you coming to the discussion with financial applications in mind.
> This is especially useful to those parsing JSON that includes financial numbers; these values often must be stored directly as fixed point values.
If you know the schema of the JSON, you can avoid using the dynamic std.json.Value type and parse directly into the desired representation using a custom jsonParse() method. Is that an option for your use case?
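As a rough sketch of what that suggestion could look like when the schema is known: the `Cents` type, the `parseHundredths` helper, and the `Order` struct below are hypothetical names invented for this example, and the helper deliberately accepts only plain decimals with at most two fractional digits. It relies on the scanner having already validated the token as a JSON number.

```zig
const std = @import("std");

/// Hypothetical fixed-point type: an amount stored as whole hundredths.
const Cents = struct {
    hundredths: i64,

    /// Hook that std.json calls instead of its default parsing for this type.
    pub fn jsonParse(
        allocator: std.mem.Allocator,
        source: anytype,
        options: std.json.ParseOptions,
    ) !Cents {
        _ = options;
        const token = try source.nextAlloc(allocator, .alloc_if_needed);
        // Take the raw number text; it never passes through f64.
        const text: []const u8 = switch (token) {
            inline .number, .allocated_number => |slice| slice,
            else => return error.UnexpectedToken,
        };
        return Cents{ .hundredths = try parseHundredths(text) };
    }
};

/// Hypothetical helper: "-12.3" -> -1230. Exponents and a third
/// fractional digit are rejected rather than rounded.
fn parseHundredths(text: []const u8) error{ InvalidNumber, Overflow }!i64 {
    var rest = text;
    const negative = rest.len > 0 and rest[0] == '-';
    if (negative) rest = rest[1..];

    var total: i64 = 0; // magnitude accumulated digit by digit
    var frac_digits: ?u8 = null; // null until the decimal point is seen
    for (rest) |c| {
        if (c == '.' and frac_digits == null) {
            frac_digits = 0;
        } else if (std.ascii.isDigit(c)) {
            if (frac_digits) |n| {
                if (n == 2) return error.InvalidNumber;
                frac_digits = n + 1;
            }
            total = try std.math.add(i64, try std.math.mul(i64, total, 10), c - '0');
        } else return error.InvalidNumber;
    }
    // Scale up when fewer than two fractional digits were given ("1.5" -> 150).
    var seen: u8 = frac_digits orelse 0;
    while (seen < 2) : (seen += 1) total = try std.math.mul(i64, total, 10);
    return if (negative) -total else total;
}

test "parse a price straight into hundredths" {
    const Order = struct { price: Cents };
    const parsed = try std.json.parseFromSlice(Order, std.testing.allocator, "{\"price\": 19.99}", .{});
    defer parsed.deinit();
    try std.testing.expectEqual(@as(i64, 1999), parsed.value.price.hundredths);
}
```

This only works when the type of each field is known ahead of time, which is the limitation raised in the reply that follows.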
The exact schema of the JSON in my use case could not be assumed; writing a custom json.Value-like enum is the solution that I used to handle those variations in the schema.
I felt that a change like this would benefit the standard library; while initially reading through the standard library's documentation, it seemed that the json.Value.number_string value was underutilized. It is a very well-written enum; it just lacks an interface to choose which one of its number representations to prioritize.
I'm not opposed to retaining the status quo; the standard library does provide the tools that you need to parse these types of values yourself, and the json.Value enum is ultimately a luxury built on these underlying tools.
Well put! Sounds like you've already explored the intended way to use the API, and you've encountered a use case for an enhancement! Thanks for being thorough with reading through the API docs and code!
One last question is about parsing integers as .number_string. I can't think of a case where parsing a JSON integer into a Zig integer would lose any precision. -0 is considered a float by std.json.isNumberFormattedLikeAnInteger() (and subsequently by std.json.Value), and leading 0s on integers are not allowed by the JSON grammar. Is there a case for wanting a consistently unparsed integer representation in std.json.Value?
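For reference, a small sketch of the classification being described, using `std.json.isNumberFormattedLikeAnInteger` and `std.json.Value.parseFromNumberSlice` with their signatures from before this pull request. The sample inputs are made up for the example.

```zig
const std = @import("std");

test "how std.json classifies number text" {
    // No '.', 'e' or 'E', and not "-0": formatted like an integer.
    try std.testing.expect(std.json.isNumberFormattedLikeAnInteger("42"));
    // "-0" is special-cased so that the sign survives as a float.
    try std.testing.expect(!std.json.isNumberFormattedLikeAnInteger("-0"));
    // A whole number written with a fraction or exponent is still a float.
    try std.testing.expect(!std.json.isNumberFormattedLikeAnInteger("3.0"));
    try std.testing.expect(!std.json.isNumberFormattedLikeAnInteger("1e3"));

    // std.json.Value follows the same classification.
    try std.testing.expect(std.json.Value.parseFromNumberSlice("42") == .integer);
    try std.testing.expect(std.json.Value.parseFromNumberSlice("-0") == .float);
    try std.testing.expect(std.json.Value.parseFromNumberSlice("3.0") == .float);
}
```

Note that the split depends on how the number is written, not on its value: `3` and `3.0` land in different variants.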
My reasoning for adding the option to integers was that some floating-point values would find themselves parsed as integers if they fell on whole numbers; the parse_integers option is solely a convenience so that all values can be handled the same way. However, it can be removed from the underlying pull request without defeating its purpose of retaining precision.
That makes sense. In that case, why not one unified option to never parse any numbers?
I can't answer that. In fact, if it were one unified option, I can think of three advantages.
- The parsing speed would be faster while not parsing numbers, due to not needing to call `std.json.isNumberFormattedLikeAnInteger` for each number_string.
- The `json.Value.parseFromNumberSlice` function would be able to keep its previous signature, since it would simply be skipped if the new option is false.
- The `json.static.ParseOptions` struct would remain more generalized rather than having two options specifically targeting the `.float` and `.integer` values in `json.dynamic.Value`.
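A sketch of how the unified option might be used, assuming it lands as the `parse_numbers` field named in this pull request's title and behaves as discussed here (every JSON number is left as `.number_string` when it is false). It will not compile against a standard library that predates the change; the field names `price` and `qty` are made up for the example.

```zig
const std = @import("std");

test "numbers stay as text when parse_numbers is false" {
    const input = "{\"price\": 9007199254740993.0, \"qty\": 3}";
    const parsed = try std.json.parseFromSlice(
        std.json.Value,
        std.testing.allocator,
        input,
        // Assumption: the option proposed in this pull request.
        .{ .parse_numbers = false },
    );
    defer parsed.deinit();

    const obj = parsed.value.object;
    // Floats and integers are handled the same way: both keep their source text.
    try std.testing.expectEqualStrings("9007199254740993.0", obj.get("price").?.number_string);
    try std.testing.expectEqualStrings("3", obj.get("qty").?.number_string);
}
```

With a single flag, callers that need exact decimals handle one variant for every number and can convert the text to their own fixed-point type afterwards.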
Ran `zig fmt` on dynamic.zig and dynamic_tests.zig. The files' formatting was invalid because of an unnecessary newline character and an instance of invalid indentation, hence the failed checks.
Thanks for your contribution @eugene-dash ! :heart: