Applying a filter to a row_count 'same as' comparison aggregation
**Describe the bug**
Applying a filter to a row_count check only filters the dataset being checked, not the other dataset we're comparing against.
I can see why this is expected behaviour, but I would like a way to optionally apply the filter to the comparison aggregation.
**To Reproduce**
Steps to reproduce the behavior:
- Create a new test in scan.yml:
filter foo [hourly]:
  where: from_iso8601_timestamp(time) between DATE_ADD('hour', -1, NOW()) AND NOW()
checks for foo [hourly]:
  - row_count same as bar
- Run `soda scan` with the `-V` verbose flag
- Observe that the "otherRowCount" SQL doesn't include the filter and so counts all rows
Soda Core 3.0.4
Reading configuration file "configuration.yml"
Reading SodaCL file "checks_foo.yml"
Scan execution starts
Query datalake_dev.foo[hourly].aggregation[0]:
SELECT
COUNT(*)
FROM datalake.foo
WHERE from_iso8601_timestamp(time) between DATE_ADD('hour', -1, NOW()) AND NOW()
Query datalake_dev.bar.aggregation[0]:
SELECT
COUNT(*)
FROM datalake.bar
Scan summary:
2/2 queries OK
datalake_dev.foo[hourly].aggregation[0] [OK] 0:00:01.819557
datalake_dev.bar.aggregation[0] [OK] 0:00:01.262278
1/1 check FAILED:
foo [hourly] in datalake_dev
row_count same as bar [FAILED]
value: -12047
rowCount: 9
otherRowCount: 12056
Oops! 1 failures. 0 warnings. 0 errors. 0 pass.
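To illustrate the request, here is a minimal sketch that uses an in-memory SQLite database instead of Athena. The table contents, the `recent` column, and the `count` helper are hypothetical stand-ins: the real filter is the time-window predicate shown above, and the sketch assumes bar has a column the same filter can be applied to. It also assumes the check value is rowCount minus otherRowCount, which matches the log above (9 - 12056 = -12047).

```python
import sqlite3

# Hypothetical stand-ins for the foo and bar tables. The "recent" flag
# replaces the real time-window predicate from the issue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, recent INTEGER)")
conn.execute("CREATE TABLE bar (id INTEGER, recent INTEGER)")
rows = [(1, 1), (2, 1), (3, 0), (4, 0)]
conn.executemany("INSERT INTO foo VALUES (?, ?)", rows)
conn.executemany("INSERT INTO bar VALUES (?, ?)", rows)

FILTER = "recent = 1"  # assumed filter, valid for both tables


def count(table, where=None):
    """Return COUNT(*) for a table, optionally restricted by a WHERE clause."""
    sql = f"SELECT COUNT(*) FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return conn.execute(sql).fetchone()[0]


# Current behaviour: the filter is applied to foo only.
current_diff = count("foo", FILTER) - count("bar")

# Requested behaviour: the same filter is applied to both sides.
requested_diff = count("foo", FILTER) - count("bar", FILTER)

print(current_diff, requested_diff)
```

With identical data in both tables, only the second comparison reports no difference, which is the outcome an optional filter on the comparison side would allow.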
**Context**
As above
**OS**: macOS
**Python Version**: 3.8.9
**Soda SQL Version**: 3.0.4
**Warehouse Type**: AWS Athena
Sorry, I've opened this as a bug, but it's really a feature request.
Hi @paulskipprhudson, Soda SQL will soon be deprecated in favor of Soda Core. It seems to me your issue is actually related to Soda Core rather than to Soda SQL. May I ask you to move it to https://github.com/sodadata/soda-core/issues? Many thanks!