data-validator
Streamline configuration for the same test applied to multiple columns
Currently, if I want to check for null values in each of the columns (age, occupation) of a table, the checks: section of the configuration file has to contain something like this:
- type: nullCheck
  column: age
- type: nullCheck
  column: occupation
Ideally, we should support a more streamlined config. Something like:
- type: nullCheck
  columns: age, occupation
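To make the proposal concrete, here is a minimal sketch of how a streamlined entry could be expanded back into the per-column form that exists today. It is written in Python purely for illustration and is not data-validator's actual code; the function name `expand_columns` is hypothetical, and it assumes the `columns` value arrives as a single comma-separated string, which is how YAML would read `age, occupation`.

```python
def expand_columns(check):
    """Expand one streamlined check into one check per column.

    `check` is a dict as it might come out of a YAML parser. Entries
    that already use the singular `column` key are returned unchanged.
    """
    if "columns" not in check:
        return [dict(check)]
    # Assumption: `columns` is a comma-separated string such as "age, occupation".
    names = [name.strip() for name in check["columns"].split(",") if name.strip()]
    # Every other key (for example `type`) is copied to each expanded check.
    rest = {key: value for key, value in check.items() if key != "columns"}
    return [{**rest, "column": name} for name in names]


streamlined = {"type": "nullCheck", "columns": "age, occupation"}
expanded = expand_columns(streamlined)
for entry in expanded:
    print(entry)
```

Expanding at load time like this would keep the rest of the validator unaware of the shorthand, since each check would still see a single `column`.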
We would need to decide how to handle optional parameters in the streamlined case. One option is to not support streamlining when any optional parameters are specified, so each column keeps its own entry:
- type: nullCheck
  column: age
  threshold: 1%
- type: nullCheck
  column: occupation
  threshold: 5%
Another option would be to allow additional parameters to be streamlined and applied in the same order as the specified columns:
- type: nullCheck
  columns: age, occupation
  thresholds: 1%, 5%
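The positional pairing in this second option could look roughly like the sketch below. Again this is illustrative Python rather than data-validator's implementation: `expand_parallel`, `split_list`, and the plural-to-singular mapping are all hypothetical names, and rejecting a length mismatch with an error is one possible policy, not a decided one.

```python
def split_list(value):
    """Split a comma-separated string such as "1%, 5%" into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def expand_parallel(check, plural_to_singular):
    """Pair each column with the optional parameter at the same position.

    `plural_to_singular` maps a streamlined key to its per-check key,
    for example {"thresholds": "threshold"} (hypothetical naming).
    """
    columns = split_list(check["columns"])
    expanded = [{"type": check["type"], "column": name} for name in columns]
    for plural, singular in plural_to_singular.items():
        if plural not in check:
            continue  # the optional parameter was not given at all
        values = split_list(check[plural])
        # Assumed policy: a list of the wrong length is a configuration error.
        if len(values) != len(columns):
            raise ValueError(
                f"{plural} has {len(values)} values but there are {len(columns)} columns"
            )
        for entry, value in zip(expanded, values):
            entry[singular] = value
    return expanded


streamlined = {
    "type": "nullCheck",
    "columns": "age, occupation",
    "thresholds": "1%, 5%",
}
paired = expand_parallel(streamlined, {"thresholds": "threshold"})
for entry in paired:
    print(entry)
```

The length check is the main design question this option raises: without it, a config with two columns and one threshold would silently leave the second column on the default.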