data-validator icon indicating copy to clipboard operation
data-validator copied to clipboard

Thresholds parsed as JSON floats are ignored

Open colindean opened this issue 4 years ago • 1 comments

Describe the bug

When specifying a check with a threshold that will parse to a JSON float, e.g.

threshold: 0.10 # will be ignored
threshold: 10% # works
threshold: "0.10" # works

the threshold will be ignored.

To Reproduce

Configure a check with:

type: nullCheck
column: foo
threshold: 0.10

or put that into a test in NullCheckSpec.

Expected behavior

Thresholds specified as floats should work.

colindean avatar Jan 20 '22 01:01 colindean

Workaround

The workaround for now is to use one of the two working forms: use the % sign version or ensure that it's passed as a string using quotes.

Analysis

I'm able to reproduce this in a modified test and am working on the solution. It's going to change a bunch of checks so it's not a quick fix.

The crux of the problem is here:

https://github.com/target/data-validator/blob/b0929c9a6007342538fe823d1c9d4b6f78bb527e/src/main/scala/com/target/data_validator/validator/NullCheck.scala#L34

When DV's YAML parser converts to JSON internally, the threshold is represented as a float, so it needs .as[Float] there. The core problem is the discard of the Left[DecodingFailure] when the .as() fails.

DV's ConfigParserSpec doesn't fail when the thresholds are added because the checks inside of ValidatorTable are not checked in the equality assertion because ValidatorTable is an abstract class and thus its members, including the checks, aren't checked.

Impact

Since all checks parse their own config, we've gotta fix this across all checks with thresholds individually. This might be a time to centralize that configuration.

colindean avatar Jan 20 '22 01:01 colindean