zig Add `std.json.ParseOptions.parse_numbers` to preserve float precision

Currently, parsing JSON with the standard library's functions loses precision when the parser encounters floating point values. This pull request addresses that by letting the user of the library specify whether they want these values parsed for them or not.

This is especially useful to those parsing JSON that includes financial numbers; these values often must be stored directly as fixed point values.
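As a rough illustration of the precision loss described above, here is a minimal sketch of the default behavior. It assumes the `std.json.parseFromSlice` API of recent Zig releases; the `price` field and its value are made up for the example.

```zig
const std = @import("std");

// Hypothetical input: a price with more digits than an f64 can represent.
const input =
    \\{"price": 0.1000000000000000000001}
;

test "default parsing stores the price as an f64" {
    const parsed = try std.json.parseFromSlice(std.json.Value, std.testing.allocator, input, .{});
    defer parsed.deinit();

    const price = parsed.value.object.get("price").?;
    // The number arrives as a .float, so the trailing digits are gone:
    // the stored value is indistinguishable from plain 0.1.
    try std.testing.expect(price == .float);
    try std.testing.expectEqual(@as(f64, 0.1), price.float);
}
```

Run with `zig test`. A fixed-point consumer has no way to recover the original digits from the `.float` payload.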

Jul 22 '24 20:07 eugene-dash

Thanks for taking the time to contribute the enhancement that you need! I just have some questions to make sure this direction is the best option for your use case. I always want to follow real use cases, so I appreciate your coming to the discussion with financial applications in mind.

> This is especially useful to those parsing JSON that includes financial numbers; these values often must be stored directly as fixed point values.

If you know the schema of the JSON, you can avoid using the dynamic std.json.Value type and parse directly into the desired representation using a custom jsonParse() method. Is that an option for your use case?
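For readers unfamiliar with that approach, a minimal sketch of a custom jsonParse() follows. The `Decimal` and `Order` types are hypothetical, and the sketch assumes the token-source interface (`nextAllocMax`, `.alloc_always`) found in recent Zig releases; it keeps the number's source text instead of converting it.

```zig
const std = @import("std");

// Hypothetical wrapper that keeps a JSON number's source text untouched.
const Decimal = struct {
    text: []const u8,

    pub fn jsonParse(
        allocator: std.mem.Allocator,
        source: anytype,
        options: std.json.ParseOptions,
    ) !Decimal {
        // .alloc_always makes the scanner hand back a copy owned by `allocator`.
        const token = try source.nextAllocMax(allocator, .alloc_always, options.max_value_len.?);
        switch (token) {
            .allocated_number => |slice| return .{ .text = slice },
            else => return error.UnexpectedToken,
        }
    }
};

// Hypothetical schema, known ahead of time.
const Order = struct {
    id: u32,
    price: Decimal,
};

const order_json =
    \\{"id": 7, "price": 19.990000000000000000001}
;

test "price keeps every digit" {
    const parsed = try std.json.parseFromSlice(Order, std.testing.allocator, order_json, .{});
    defer parsed.deinit();

    try std.testing.expectEqual(@as(u32, 7), parsed.value.id);
    try std.testing.expectEqualStrings("19.990000000000000000001", parsed.value.price.text);
}
```

This only works when the schema is fixed in advance, which is the limitation the question is probing.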

Jul 23 '24 14:07 thejoshwolfe

The exact schema of the JSON in my use case could not be assumed; writing a custom json.Value-like enum is the solution that I used to handle those variations in the schema.

I felt that a change like this would benefit the standard library; while initially reading through the standard library's documentation, it seemed that the json.Value.number_string value was underutilized. It is a very well written enum; it just lacks an interface to choose which one of its number representations to prioritize.

I'm not opposed to retaining the status quo; the standard library does provide the tools that you need to parse these types of values yourself, and the json.Value enum is ultimately a luxury utilizing these underlying tools.

Jul 23 '24 16:07 eugene-dash

Well put! Sounds like you've already explored the intended way to use the API, and you've encountered a use case for an enhancement! Thanks for being thorough with reading through the API docs and code!

One last question is about parsing integers as .number_string. I can't think of a case where parsing a JSON integer into a Zig integer would lose any precision. -0 is considered a float by std.json.isNumberFormattedLikeAnInteger() (and subsequently by std.json.Value), and leading 0s on integers are not allowed by the JSON grammar. Is there a case for wanting a consistently unparsed integer representation in std.json.Value?
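The classification mentioned here can be checked directly. This is a small sketch of how std.json.isNumberFormattedLikeAnInteger treats a few number spellings; the sample inputs are chosen for illustration.

```zig
const std = @import("std");

test "which number spellings count as integers" {
    const isInt = std.json.isNumberFormattedLikeAnInteger;

    try std.testing.expect(isInt("42"));
    try std.testing.expect(isInt("-17"));
    // -0 is deliberately treated as a float so the sign is not lost.
    try std.testing.expect(!isInt("-0"));
    // A fraction or exponent makes the text float-formatted,
    // even when the numeric value is whole.
    try std.testing.expect(!isInt("42.0"));
    try std.testing.expect(!isInt("1e3"));
}
```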

Jul 23 '24 19:07 thejoshwolfe

My reasoning for adding the option for integers was that some floating point values would find themselves parsed as integers if they fell on whole numbers; the parse_integers option is solely a convenience so that all values can be handled the same way. However, it can be removed from the underlying pull request without defeating its purpose of retaining precision.
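The inconsistency described here can be seen with a short sketch, assuming the default options of `std.json.parseFromSlice` and two made-up amounts from the same logical field:

```zig
const std = @import("std");

test "whole and fractional amounts get different tags" {
    const parsed = try std.json.parseFromSlice(
        std.json.Value,
        std.testing.allocator,
        "[100, 100.5]",
        .{},
    );
    defer parsed.deinit();

    const items = parsed.value.array.items;
    // An amount that happens to be written without a fraction becomes .integer...
    try std.testing.expect(items[0] == .integer);
    // ...while its neighbor with a fraction becomes .float.
    try std.testing.expect(items[1] == .float);
}
```

Code consuming such values has to handle both tags even though they describe the same kind of quantity.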

Jul 23 '24 22:07 eugene-dash

That makes sense. In that case, why not one unified option to never parse any numbers?

Jul 24 '24 00:07 thejoshwolfe

I can't answer that. In fact, if it were one unified option, I can think of three advantages.

1. Parsing would be faster when numbers are left unparsed, because std.json.isNumberFormattedLikeAnInteger would not need to be called for each number_string.
2. The json.Value.parseFromNumberSlice function would be able to keep its previous signature, since it would simply be skipped if the new option is false.
3. The json.static.ParseOptions struct would remain more generalized rather than having two options specifically targeting the .float and .integer values in json.dynamic.Value.
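A sketch of how the unified option might be used, assuming it lands as the single `parse_numbers` boolean named in this pull request's title and that the Zig build in use includes it; the sample amounts are made up:

```zig
const std = @import("std");

test "parse_numbers = false keeps every number as source text" {
    const parsed = try std.json.parseFromSlice(
        std.json.Value,
        std.testing.allocator,
        "[100, 100.50]",
        // Assumption: the option exists with this name and defaults to true.
        .{ .parse_numbers = false },
    );
    defer parsed.deinit();

    const items = parsed.value.array.items;
    // Integers and floats alike stay as .number_string, so all values
    // can be handled the same way by the caller.
    for (items) |item| try std.testing.expect(item == .number_string);
    try std.testing.expectEqualStrings("100", items[0].number_string);
    try std.testing.expectEqualStrings("100.50", items[1].number_string);
}
```

The caller can then feed each slice to its own fixed-point conversion without an intermediate f64.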

Jul 24 '24 01:07 eugene-dash

Used the zig fmt command to format dynamic.zig and dynamic_tests.zig. The files' formatting was invalid due to an unnecessary newline character and an instance of invalid indentation; hence the failed checks.

Jul 25 '24 02:07 eugene-dash

Thanks for your contribution @eugene-dash! :heart:

Jul 26 '24 00:07 thejoshwolfe