soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
When we still had the trial version of Soda Cloud, we ran the following checks and they took no longer than a minute. Now, without Soda Cloud, just using Soda...
I am using programmatic Soda scans to run checks against a Snowflake dataset. When the run has finished, I can see results in the terminal, but there is an...
Just wanted to flag that with v3.0.3, in the cloud I only see a straight line with 0 as the value every time, even though the output of the scans shows the...
My requirement is to measure counts for a given time period (e.g. 10am-11am) and compare them with the same time period over the last 7 weeks (this can be customised as well), and I think...
It would be great if it was possible to provide instructions for what to do when a failure occurs. For example ```yaml - row_count: warn: when between 10 and 100...
I would like to use [variables](https://docs.soda.io/soda-core/scan-reference.html#variables) in user-defined check queries, but it doesn't work. ```yaml # ${variable_name} doesn't work on user-defined checks queries variables: TS_START: start date TS_END: end date checks...
This config fails:

```yaml
filter retail_orders [daily]:
  where: DATE '${date}' = "order_date"

# Checks for retail_orders
checks for retail_orders [daily]:
#checks for retail_orders:
  - duplicate_count(order_id) = 0
```

With this...
Freshness does not work with a date datatype.
One common use case is to compare freshness across multiple datasets/sources in order to validate that an ETL job did not take too much time for example.
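To make the use case above concrete, here is a minimal sketch in plain Python of what a cross-dataset freshness comparison computes. It is not Soda's API: the function names `freshness_gap` and `etl_within_budget`, the timestamps, and the one-hour budget are all hypothetical, and in practice the two "latest" values would come from a `MAX(timestamp_column)` query on each dataset.

```python
from datetime import datetime, timedelta


def freshness_gap(source_latest: datetime, target_latest: datetime) -> timedelta:
    """Return how far the target dataset lags behind the source dataset.

    Both arguments are the most recent timestamp found in each dataset
    (hypothetical inputs; not produced by any Soda call here).
    """
    return source_latest - target_latest


def etl_within_budget(
    source_latest: datetime, target_latest: datetime, max_lag: timedelta
) -> bool:
    """Return True when the lag between the two datasets is within max_lag."""
    return freshness_gap(source_latest, target_latest) <= max_lag


# Illustrative values only: the target is 30 minutes behind the source.
source_latest = datetime(2022, 9, 1, 10, 0)
target_latest = datetime(2022, 9, 1, 9, 30)

print(freshness_gap(source_latest, target_latest))
print(etl_within_budget(source_latest, target_latest, timedelta(hours=1)))
```

Comparing the two maxima directly, rather than comparing each against the current time, keeps the result independent of when the check itself runs.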
I recently realised that we were only considering the most common data types in column profiling for most dbs. I've now added support to pretty much all numeric and text...