Creates Type.Object where list.Count and keys.Count are out of sync
Trying to parse this incorrect JSON string:
{"Parent":{"MyFloat":12,8}}
does not result in an error. Instead, it returns a Type.Object where the sizes of list and keys differ (due to SafeAddChild() being called at some point). Is this something the user should expect and intercept/check in their code? My understanding is that a Type.Object is always a key/value pair, not just a key, so these List<>'s should always be the same size.
The source of that JSON test string is an incorrect export from some other tool (a localization misinterpretation of the value "12.8"). For "12,8", I might expect a partial JSON object to be returned (with MyFloat==12), but also an error due to the dangling "8". I tried https://jsonformatter.curiousconcept.com/ , and it detects the error under all RFC specs.
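For comparison, here's what a strict parser does with that input. A quick check with Python's standard json module (used purely for illustration, since JSONObject itself is C#) rejects the string at the dangling "8":

```python
import json

# Standard parsers reject the malformed input outright: after the value 12,
# the comma must be followed by another "key": value pair, not a bare 8.
malformed = '{"Parent":{"MyFloat":12,8}}'
try:
    json.loads(malformed)
except json.JSONDecodeError as error:
    print(error)  # "Expecting property name enclosed in double quotes: ..."
```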
A previous version of JSONObject we used (I think it was somewhere around 1.4) also did not throw an error, but did at least return a (syntactically) correct object:
{
"Parent":{
"MyFloat":12,
"MyFloat":8
}
}
The current version (641837469bc5d72a176e050dcdb8ea45e7f350d7 from 2022/02/02) outputs this with JSONChecker:
{
"Parent":{
"MyFloat":12
}
}
, but again it does not indicate an error, silently ignoring the incorrect "8" and returning the misaligned/unexpected JSONObject.
JSONChecker claims everything is fine, because Stringify() explicitly checks whether keys[index] is still in range, and simply bails out if it isn't.
EDIT: Further experimenting: this equally malformed JSON (the left-hand side of the colon must not be a number):
{"Parent":{"MyFloat":12,8:""}}
is interpreted by JSONChecker as:
{
"Parent":{
"MyFloat":12,
"MyFloat":""
}
}
, again without recognizing the error; this is also an incorrect interpretation of the input string.
Hi there! Thanks for reaching out. Unfortunately, my answer here may not be very satisfying, but as I explained in my response to #48, this behavior is "by design." I treat the response to malformed JSON like this as "undefined behavior." While I admit this is the only JSON library I'm aware of that doesn't validate the input, I chose to go a different direction here (for better or worse, >10 years ago 🙃). The thinking, as I recall, was simply to "do the best we could" with invalid or partial JSON objects. Previous versions had a strict argument which did attempt to do some validation, but it didn't catch everything (including the case you brought up here), and didn't get a lot of testing since any projects of mine that used JSONObject (all of which are now defunct) set it to false.
The reason the behavior changed recently is that I completely rewrote the parsing logic for version 2.1 (hence the minor version bump). The first version split the input into a bunch of substrings, which was easier to reason about but terrible for performance. This algorithm now advances a character index, deciding what to do as it encounters each new character of the input string. No allocating extra strings, no re-treading earlier sections of the string. AFAICT it's a much better approach, but unfortunately it means a totally new set of "undefined behavior" for malformed input.
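For readers curious about the pattern, here's a minimal sketch of the index-advancing style in Python. This is an illustration of the general technique, not JSONObject's actual code, and like the real parser it assumes well-formed input (feed it something like `{"MyFloat":12,8}` and you get undefined behavior):

```python
# Minimal sketch of a single-pass parser: one cursor moves forward through
# the input, deciding what to do at each character, never re-scanning
# earlier sections of the string. Handles only flat {"key": int} objects.
def parse_flat_object(text):
    index = 0
    result = {}

    def skip_whitespace():
        nonlocal index
        while index < len(text) and text[index].isspace():
            index += 1

    def parse_string():
        nonlocal index
        index += 1                      # consume opening quote
        start = index
        while text[index] != '"':
            index += 1
        value = text[start:index]
        index += 1                      # consume closing quote
        return value

    def parse_int():
        nonlocal index
        start = index
        while index < len(text) and text[index] in "-0123456789":
            index += 1
        return int(text[start:index])

    index += 1                          # consume '{'
    while True:
        skip_whitespace()
        if text[index] == '}':
            break
        key = parse_string()
        skip_whitespace()
        index += 1                      # consume ':'
        skip_whitespace()
        result[key] = parse_int()
        skip_whitespace()
        if text[index] == ',':
            index += 1                  # consume ',' and continue
    return result

print(parse_flat_object('{"MyFloat": 12, "Other": -3}'))
# → {'MyFloat': 12, 'Other': -3}
```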
My understanding is that a Type.Object is always a key/value pair, not just a key, so these List<>'s should always be the same size.
Yes, this is true for valid JSON. But, as you've found, there's no logic to explicitly ensure that the keys list is expanded for object-type objects. So if you expect that you'll be giving the parser invalid JSON, you should expect the objects that come out to potentially be invalid.
A previous version of JSONObject we used (I think it was somewhere around 1.4) also did not throw an error, but did at least return a (syntactically) correct object
Is that really correct? You shouldn't be able to have two fields of the same name, and in this case you wouldn't be able to get that second field out of the object with the foo["key"] accessor or anything like that.
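For what it's worth, even parsers that accept duplicate keys generally can't expose both values. Python's standard json module, for example, silently keeps the last duplicate, so the first MyFloat would be unreachable there too:

```python
import json

# RFC 8259 only says object names "SHOULD be unique"; parsers that accept
# duplicates typically keep a single value. Python's json keeps the last one,
# so the first "MyFloat" is lost, mirroring the accessor problem above.
print(json.loads('{"MyFloat":12,"MyFloat":8}'))  # → {'MyFloat': 8}
```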
I guess my question to you would be: how would you like the parser to behave in this situation? If you want the old behavior, I'd encourage you to just stick with the 1.4 version you were using previously (sorry I wasn't good about tagging prior to 2.0), version 2.0, or the last version to use the old algorithm. It sits between 2.0 and 2.1, and the next commit is helpfully described as "Rewrite and optimize parser." Granted, the commit I linked is called "WIP performance optimizations in parsing" so I can't guarantee it's bug free! 😆
Otherwise, I'd be curious to know your ideal response to this situation. I think this comment at the beginning of your message may be the most important:
The source of that JSON test string is an incorrect export from some other tool (=localization misinterpretation of the value "12.8")
We had some prior issues with folks in Europe getting the same output from JSONObject once Unity updated the runtime to properly support localization, which is why JSONObject now uses CultureInfo.InvariantCulture when converting numbers. I wonder if we could add an argument to say "expect commas for decimals" in order to parse this correctly? I'm loath to change the parsing logic in any way to cover the case of invalid JSON, especially if it has an impact on performance or the ability to parse valid JSON. But I wonder if there's an "unobtrusive" or "opt-in" way to handle this? Since I'm already intentionally deviating from standard behavior (pretty much everything else would just error out on this) I'm open to more non-standard solutions. :)
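To sketch what such an opt-in might look like (hypothetical, and in Python rather than C# purely for illustration): a preprocessing pass that rewrites digit-comma-digit sequences before the string ever reaches the parser. The comment in the code points out why this couldn't safely be on by default:

```python
import json
import re

# Hypothetical opt-in repair step: rewrite comma decimals ("12,8" -> "12.8")
# before parsing. CAVEAT: this heuristic also rewrites legitimate separators
# between unspaced numbers, e.g. "[1,2]" becomes "[1.2]", which is exactly
# why a fix like this would have to be explicitly opt-in.
def repair_comma_decimals(text):
    return re.sub(r'(?<=\d),(?=\d)', '.', text)

fixed = repair_comma_decimals('{"Parent":{"MyFloat":12,8}}')
print(fixed)              # {"Parent":{"MyFloat":12.8}}
print(json.loads(fixed))  # → {'Parent': {'MyFloat': 12.8}}
```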
Speaking of standard behavior, if you do want proper validation, I'd recommend Newtonsoft Json or, at this point, just using the built-in JsonUtility. In fact, the relatively new Unity Serialization package includes a JsonObject class which works very similarly to this one: you don't need a C# type that maps to the data you're parsing. You can just deserialize to a JsonObject and access fields via a string indexer that returns an object, and eventually you'll reach an object you can cast to string, float, or whatever to read the data. As far as I know, all of these options validate the input as they parse it and error out on invalid JSON.