Support for default and new function applyDefaultsAndValidate
Looks like json-schema-validator does not support the "default" keyword. Is it possible to have a new function applyDefaultsAndValidate that applies the default and validates the schema?
As for tests we may need
- Test where "foo" of type "int" is required and default is 5, and object to validate has no value for "foo", but validation passes. Also, the object to validate is changed in place, and the test asserts "foo" exists and is equal to 5.
- Test where "foo" of type "int" is required and default is "five", and object to validate has no value for "foo", and validation fails with error message that foo has the wrong type (the usual error message), or even better default value for foo has the wrong type. In theory this should handle all validations, like minValue, maxValue, so no more tests are needed.
There is one thing though. Suppose we are applying defaults and validating a big json and somewhere in the middle it fails fast. Then the json is partially changed. Maybe then we can just Javadoc this behavior and provide support for a listener.
I'm willing to work on this.
Thoughts?
There are two main functions for this library: Validating and Walking. From a validation perspective, it is following the specification and the test cases were in default.json for each spec version. For walker listeners, I don't know how the default keyword is handled. I am guessing you are talking about the walker enhancement to apply the default values for your JSON input based on the specification. If this is the case, I would recommend exploring the walker code and add the future there. Let me know if I am off track.
Yeah, that's correct. The use case is something like this. Someone changes the schema to add a new attribute with default value. Then if you validate an existing json object that does not have this new attribute, it ought to pass. Furthermore, it would be nice to get an updated json object that has the default values filled in.
The other use case is that you already have json objects that passed the schema and saved to the database, and now you add a new attribute with default to the schema. Unless you re-validate the json object with the new schema, it will not get the defaults. So my approach of filling in default values will not help in that use case.
There's talk about this at https://python-jsonschema.readthedocs.io/en/stable/faq/ "Why doesn’t my schema’s default property set the default on my instance?". Though I want to default validator to be the first one, so that the default value itself gets validated.
So my plan is something like this:
- Create a new validator class called something like ApplyDefault. Even though it extends JsonValidator, it is not a validator.
- Only when someone calls applyDefaultsAndValidate then install this validator into the JsonMetaSchema. Others "default" will continue to be a NonValidationKeyword.
- Arrange things so that JsonSchema.getValidators returns ApplyDefault first. That function currently returns a HashMap, so we probably want a SortedMap.
It may be nice to do this in a way of hooks, so that customers can add this behavior.
@SiemelNaran Your use cases make sense and I think your ideas should work. However, we shouldn't enable this applyDefault feature by default as it is against the specification and most users will want to see the error messages in the case data is not validate after the schema is changed. In the library, there is a configuration class that you can set a flag to enable this feature on demand. Please make sure the flag is false by default and only be enabled when you need this feature. Thanks.
There are two ways to implements this -- one is to bake it into the validate function, and another is the tree walker approach. I tried the latter as you suggested it, writing a test that installs a keyword listener on "default" and run into the following problems:
-
The function onWalkStart in JsonSchemaWalkListener takes a Node as an argument. So setting Node to
schemaNode.get("default")makes the local variable in this function correct, but it doesn't change the variablenodein the function that called this one. To solve this I tried added a function replaceNode to the WalkEvent class, and the calling function JsonSchema.walk will call keywordWalkEvent.isNodeReplaced, get the new node, etc, which is then passed on to other listeners. -
In any case, changing the value of
nodedoesn't change the input JSON. The listener could do this as it has the variableatwhich is the JSON path, and it can set the value. But it seems too much manual steps for the user. So instead when we call keywordWalkEvent.isNodeReplaced, and if is true, not only set the local variable to the new node, but change the input JSON. It got tricky as that function doesn't know the key, so we use the Collector context to pass the replaced node to the function that called this, namely PropertiesValidator.validate which then calls objectNode.set(key, replacedNode). -
In any case, the walker is only called if a value is present. So if the attribute is
"SomeAttribute": nullthen the default listener is called, and it all works. But ifSomeAttributeis completely missing then the listener is not called. The logic for this is in PropertiesValidator.validate. I think a fix for this is that if propertyNode in this function is null, then walk but only call the default listener. But it looks like a massive change to implement as I'd like the walk function to take a new function argument applyDefaults, and the changes are extensive. Is this approach on the right track? -
We need a way to ensure that the "default" listener is the first one to fire. Right now it looks like the validators as returned by getValidators() is a HashMap, so there is no order.
The other way is to somehow have a DefaultValidator. Which is the better way in your opinion?
@SiemelNaran I am not familiar with the walker feature and I hope @prashanthjos can help to validate your ideas and give you some advice.
It seems easier to do without creating a listener on the "default". Instead I added a variable shouldApplyDefaults (but then turned it into a class so that you can apply property defaults or array defaults). See https://github.com/networknt/json-schema-validator/pull/468/files.
This was resolved in pull-request #477