big query quality check query does not include project and dataset in the FROM clause
Testing my-contract-name.yaml
ERROR:soda.scan:[13:13:27] Query execution error in 10.bigquery.[my-table-name].aggregation[0]: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/[my-project-id]/queries?prettyPrint=false: Syntax error: Unexpected "[my-table-name]" at [9:6]
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-1 IS NULL THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) >= 2) THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) <= 2) THEN 1 END),
COUNT(CASE WHEN field-name-2 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-3 IS NULL THEN 1 END)
FROM [my-table-name]
the query is missing the project-id and dataset name in the table's qualified name.
my contract looks like the following
dataContractSpecification: 1.1.0
kind: DataContract
id: my-contract-id
info:
title: my-contract-title
version: 0.0.1
# Servers
servers:
[my-project-id]/[my-dataset-name]:
type: bigquery
project: [my-project-id]
dataset: [my-dataset]
models:
[my-table-name]:
type: table
fields:
date:
type: date
required: true
field-name-1:
type: string
required: true
maxLength: 2
minLength: 2
field-name-2:
type: float
required: true
reporting_time:
type: date
required: true
field-name-3:
type: string
required: true
```
Which version of CLI are you using?
@jochenchrist datacontract-cli==0.10.24
Sorry, I cannot reproduce this.
This works for me:
datacontract.yaml
dataContractSpecification: 1.1.0
id: bigquery
info:
title: bigquery
version: 0.0.1
owner: my-domain-team
servers:
my-dataproduct/bigquery:
type: bigquery
project: datameshexample-product
dataset: datacontract_cli
models:
complex_table:
type: table
fields:
some_string:
type: string
required: true
maxLength: 2
minLength: 2
some_record:
type: record
fields:
some_field_1:
type: string
some_field_2:
type: string
some_array_of_strings:
type: array
items:
type: string
some_array_of_records:
type: array
items:
type: record
fields:
some_other_field_1:
type: string
some_other_field_2:
type: string
some_json:
type: text
config:
bigqueryType: json
some_range_of_timestamp:
type: record
config:
bigqueryType: RANGE<TIMESTAMP>
Command
export DATACONTRACT_BIGQUERY_ACCOUNT_INFO_JSON_PATH=~/Temp/bigquery/datameshexample-product-b74999d7c78e.json
uvx --from 'datacontract-cli[bigquery]==0.10.24' datacontract test datacontract.yaml --logs
Do you have any special characters in your table or project name?
Do you have any special characters in your table or project name?
no, we just have dashes and numbers.
Next:
- Please double-check that your service account (as defined in the DATACONTRACT_BIGQUERY_ACCOUNT_INFO_JSON_PATH) has the correct permissions to access the tables
- Please run the soda command from the error message above directly in the BigQuery web console:
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-1 IS NULL THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) >= 2) THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) <= 2) THEN 1 END),
COUNT(CASE WHEN field-name-2 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-3 IS NULL THEN 1 END)
FROM [my-table-name]
@jochenchrist so here is what I did
i ran the following command:
datacontract import --format bigquery --bigquery-project data-sandbox-354716 --bigquery-dataset hausdemo --bigquery-table [MY-TABLE-NAME] --dialect bigquery
it returned the following
dataContractSpecification: 1.1.0
id: my-data-contract-id
info:
title: My Data Contract
version: 0.0.1
models:
[my-table-name]:
type: table
fields:
[date-field-1]:
type: date
required: false
country_code:
type: string
required: false
[numeric-field-name]:
type: float
required: false
[date-field-2]:
type: date
required: false
[string-field-1]:
type: string
required: false
I changed the required properties from false to true, and added the server block to it, and saved it to a file called my_table_name_minimal.yaml
dataContractSpecification: 1.1.0
id: my-data-contract-id
info:
title: My Data Contract
version: 0.0.1
servers:
data-sandbox-354716/hausdemo:
type: bigquery
project: data-sandbox-354716
dataset: hausdemo
models:
[my-table-name]:
type: table
fields:
[date-field-1]:
type: date
required: false
[string-field-1]:
type: string
required: true
[numeric-field-name]:
type: float
required: true
[date-field-2]:
type: date
required: true
[string-field-2]:
type: string
required: true
then I got the following when I ran datacontract test my_table_name_minimal.yaml
(venv) jack@Jack-Enloe-MacBook-Pro contracts % datacontract test my_table_name_minimal.yaml
Testing my_table_name_minimal.yaml
ERROR:soda.scan:[10:17:47] Query execution error in 2.bigquery.my_table_name.aggregation[0]: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/data-sandbox-354716/queries?prettyPrint=false: Syntax error: Unexpected "my_table_name" at [7:6]
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN string_col_1 IS NULL THEN 1 END),
COUNT(CASE WHEN number_col_1 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN string_col_1 IS NULL THEN 1 END)
FROM my_table_name
ERROR:soda.scan: | 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/data-sandbox-354716/queries?prettyPrint=false: Syntax error: Unexpected "my_table_name" at [7:6]
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(date) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(string_col_1) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(number_col_1) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(reporting_time) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(string_col_1) = 0'
ERROR:soda.scan:[10:17:48] Query execution error in 2.bigquery.my_table_name.aggregation[0]: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/data-sandbox-354716/queries?prettyPrint=false: Syntax error: Unexpected "my_table_name" at [7:6]
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN string_col_1 IS NULL THEN 1 END),
COUNT(CASE WHEN number_col_1 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN string_col_1 IS NULL THEN 1 END)
FROM my_table_name
ERROR:soda.scan: | 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/data-sandbox-354716/queries?prettyPrint=false: Syntax error: Unexpected "my_table_name" at [7:6]
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(date) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(string_col_1) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(number_col_1) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(reporting_time) = 0'
ERROR:soda.scan:[10:17:48] Metrics 'missing_count' were not computed for check 'missing_count(string_col_1) = 0'
WARNING:root:Engine soda-core has errors. See the logs for details.
╭─────────┬───────────────────────────────────────────────────────┬──────────────────────────────┬────────────────────────────────────────────────────╮
│ Result │ Check │ Field │ Details │
├─────────┼───────────────────────────────────────────────────────┼──────────────────────────────┼────────────────────────────────────────────────────┤
│ │ Check that field string_col_1 has no missing values │ my_table_name.string_col_1 │ │
│ │ Check that field string_col_1 has no missing values │ my_table_name.string_col_1 │ │
│ │ Check that field date has no missing values │ my_table_name.date │ │
│ │ Check that field reporting_time has no missing values │ my_table_name.reporting_time │ │
│ │ Check that field number_col_1 has no missing values │ my_table_name.number_col_1 │ │
│ passed │ Check that field 'string_col_1' is present │ my_table_name.string_col_1 │ │
│ passed │ Check that field string_col_1 has type STRING │ my_table_name.string_col_1 │ │
│ passed │ Check that field 'string_col_1' is present │ my_table_name.string_col_1 │ │
│ passed │ Check that field string_col_1 has type STRING │ my_table_name.string_col_1 │ │
│ passed │ Check that field 'date' is present │ my_table_name.date │ │
│ passed │ Check that field date has type DATE │ my_table_name.date │ │
│ passed │ Check that field 'reporting_time' is present │ my_table_name.reporting_time │ │
│ passed │ Check that field reporting_time has type DATE │ my_table_name.reporting_time │ │
│ passed │ Check that field 'number_col_1' is present │ my_table_name.number_col_1 │ │
│ passed │ Check that field number_col_1 has type FLOAT64 │ my_table_name.number_col_1 │ │
│ warning │ Data Contract Tests │ │ Engine soda-core has errors. See the logs for │
│ │ │ │ details. │
╰─────────┴───────────────────────────────────────────────────────┴──────────────────────────────┴────────────────────────────────────────────────────╯
🟠 data contract has warnings. Found the following warnings:
1) my_table_name.date Check that field date has no missing values: None
2) my_table_name.string_col_1 Check that field string_col_1 has no missing values: None
3) my_table_name.number_col_1 Check that field number_col_1 has no missing values: None
4) my_table_name.reporting_time Check that field reporting_time has no missing values: None
5) my_table_name.string_col_1 Check that field string_col_1 has no missing values: None
6) Data Contract Tests: Engine soda-core has errors. See the logs for details.
Should I try a previous version maybe?
my project requirements are the following
(venv) jack@Jack-Enloe-MacBook-Pro data-contracts-prototype % pip freeze
absl-py==2.2.2
aenum==3.1.15
agate==1.9.1
aiobotocore==2.19.0
aiofiles==24.1.0
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aioitertools==0.12.0
aiosignal==1.3.1
alembic==1.13.2
altair==4.2.2
annotated-types==0.7.0
ansicolors==1.1.8
antlr4-python3-runtime==4.11.1
anyio==4.4.0
appnope==0.1.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
arviz==0.21.0
asana==5.0.10
asn1crypto==1.5.1
astroid==3.3.9
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
Authlib==1.3.2
avro==1.12.0
azure-core==1.33.0
azure-identity==1.17.1
babel==2.17.0
backcall==0.2.0
backports.tarfile==1.2.0
beautifulsoup4==4.13.3
bleach==6.2.0
blinker==1.9.0
boto3==1.36.3
botocore==1.36.3
build==1.2.2.post1
CacheControl==0.14.0
cachetools==5.5.0
certifi==2024.8.30
cffi==1.17.0
cfgv==3.4.0
charset-normalizer==3.3.2
cleo==2.1.0
click==8.1.7
click-default-group==1.2.4
cloud-sql-python-connector==1.15.0
cloudevents==1.11.0
cloudpickle==2.2.1
cmdstanpy==1.2.4
colorama==0.4.6
comm==0.2.2
cons==0.4.6
contourpy==1.3.0
convertdate==2.4.0
coverage==7.6.10
cramjam==2.10.0
crashtest==0.4.1
crcmod==1.7
cryptography==43.0.1
curlify==2.2.1
customerlift==0.1.47
cycler==0.12.1
Cython==3.0.11
daff==1.3.46
databricks-sdk==0.49.0
databricks-sql-connector==4.0.2
datacontract-cli==0.10.24
db-dtypes==1.4.2
dbt-adapters==1.14.6
dbt-common==1.22.0
dbt-core==1.9.4
dbt-extractor==0.6.0
dbt-semantic-interfaces==0.7.4
debugpy==1.8.12
decorator==5.1.1
deepdiff==7.0.1
defusedxml==0.7.1
Deprecated==1.2.14
deprecation==2.1.0
dill==0.3.9
distlib==0.3.9
dnspython==2.6.1
docopt==0.6.2
docstring_parser==0.16
duckdb==1.0.0
dulwich==0.21.7
durationpy==0.9
dynaconf==3.2.7
email_validator==2.2.0
entrypoints==0.4
ephem==4.1.5
estimation==0.4.6
et_xmlfile==2.0.0
etuples==0.3.9
exceptiongroup==1.2.2
executing==2.1.0
facebook-business==10.0.1
fastapi==0.115.12
fastjsonschema==2.20.0
fastparquet==2024.11.0
filelock==3.15.4
firebase-admin==5.4.0
Flask==3.1.0
Flask-Cors==3.0.10
Flask-Migrate==4.0.7
Flask-SQLAlchemy==3.1.1
flask-talisman==1.1.0
flatbuffers==25.2.10
fonttools==4.53.1
forecasting==0.1.19
formulaic==1.0.2
fpdf==1.7.2
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2025.2.0
functions-framework==3.8.2
gast==0.6.0
gcloud==0.18.3
gcsfs==2024.10.0
geolift-metadata-store==1.3.21
google-api-core==2.19.2
google-api-python-client==2.143.0
google-auth==2.39.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.1
google-cloud==0.34.0
google-cloud-aiplatform==1.64.0
google-cloud-appengine-logging==1.4.5
google-cloud-audit-log==0.3.0
google-cloud-bigquery==3.31.0
google-cloud-bigquery-datatransfer==3.17.1
google-cloud-bigquery-storage==2.25.0
google-cloud-core==2.4.3
google-cloud-firestore==2.18.0
google-cloud-kms==2.24.2
google-cloud-logging==3.11.4
google-cloud-pubsub==2.26.1
google-cloud-resource-manager==1.12.5
google-cloud-secret-manager==2.22.1
google-cloud-storage==3.1.0
google-cloud-tasks==2.19.2
google-crc32c==1.6.0
google-pasta==0.2.0
google-resumable-media==2.7.2
googleapis-common-protos==1.65.0
great-expectations==0.18.22
grpc-google-iam-v1==0.14.2
grpcio==1.71.0
grpcio-status==1.62.3
grpcio-tools==1.71.0
gunicorn==23.0.0
h11==0.14.0
h5netcdf==1.3.0
h5py==3.11.0
haus-notifications==1.0.12
hijri-converter==2.3.1
holidays==0.18
httpcore==1.0.5
httplib2==0.22.0
httpx==0.27.2
hypothesis==6.125.2
identify==2.6.3
idna==3.8
importlib-metadata==6.11.0
inflect==7.5.0
iniconfig==2.0.0
installer==0.7.0
interface-meta==1.3.0
ip3country==0.3.0
ipykernel==6.29.5
ipython==8.12.3
ipywidgets==8.1.5
iso3166==2.1.1
isodate==0.6.1
isoduration==20.11.0
isort==6.0.1
itsdangerous==2.2.0
jaraco.classes==3.4.0
jaraco.context==6.0.1
jaraco.functools==4.1.0
jedi==0.19.1
Jinja2==3.1.5
jinja_partials==0.2.1
jmespath==1.0.1
joblib==1.4.2
json5==0.10.0
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter-events==0.12.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.15.0
jupyter_server_terminals==0.5.3
jupyterlab==4.4.0
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.13
keras==3.9.2
keyring==25.6.0
keyrings.google-artifactregistry-auth==1.1.2
kiwisolver==1.4.6
korean-lunar-calendar==0.3.1
kubernetes==32.0.1
leather==0.4.0
libclang==18.1.1
lightgbm==4.6.0
linearmodels==5.4
logical-unification==0.4.6
LunarCalendar==0.0.9
lz4==4.4.4
mailchimp-transactional==1.0.56
makefun==1.15.6
Mako==1.3.5
Markdown==3.8
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.26.1
mashumaro==3.14
matplotlib==3.10.1
matplotlib-inline==0.1.7
mccabe==0.7.0
mdurl==0.1.2
metaflow==2.15.7
metaflow-netflixext==1.2.3
miniKanren==1.0.3
mistune==3.1.1
ml_dtypes==0.5.1
mmh3==5.1.0
mockito==1.5.4
more-itertools==10.5.0
mpmath==1.3.0
msal==1.32.0
msal-extensions==1.3.1
msgpack==1.0.8
multidict==6.4.3
multimethod==1.10
multipledispatch==1.0.0
mypy-extensions==1.0.0
mysqlclient==2.2.4
namex==0.0.8
narwhals==1.35.0
nbclient==0.10.0
nbconvert==7.16.6
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.4.2
nodeenv==1.9.1
notebook==7.4.0
notebook_shim==0.2.4
notion-client==2.2.1
numpy==1.26.4
oauth2client==4.1.3
oauthlib==3.2.2
openpyxl==3.1.5
opentelemetry-api==1.32.1
opentelemetry-exporter-otlp-proto-common==1.32.1
opentelemetry-exporter-otlp-proto-http==1.32.1
opentelemetry-proto==1.32.1
opentelemetry-sdk==1.32.1
opentelemetry-semantic-conventions==0.53b1
opt_einsum==3.4.0
optree==0.15.0
ordered-set==4.1.0
overrides==7.7.0
packaging==24.2
pandas==2.2.3
pandas-gbq==0.19.2
pandera==0.18.3
pandocfilters==1.5.1
papermill==2.6.0
parsedatetime==2.6
parso==0.8.4
pathspec==0.12.1
patsy==0.5.6
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.4.0
pipreqs==0.5.0
pkginfo==1.12.0
platformdirs==4.2.2
plotly==6.0.1
pluggy==1.5.0
poetry==1.8.5
poetry-core==1.9.1
poetry-plugin-export==1.8.0
pre_commit==4.0.1
prometheus_client==0.21.1
prompt_toolkit==3.0.47
propcache==0.2.1
prophet==1.1.1
proto-plus==1.24.0
protobuf==5.29.3
psutil==6.1.1
psycopg2-binary==2.9.10
ptyprocess==0.7.0
pure_eval==0.2.3
py4j==0.10.9.7
pyactiveresource==2.2.2
pyarrow==19.0.1
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycountry==24.6.1
pycparser==2.22
pydantic==2.10.6
pydantic_core==2.27.2
pydata-google-auth==1.8.2
pydbml==1.2.0
Pygments==2.18.0
pyhaus-messaging==0.3.2
pyhdfe==0.2.0
pyiceberg==0.8.1
PyJWT==2.9.0
pylint==3.3.6
pymc==5.22.0
pymc-marketing==0.11.1
PyMeeus==0.5.12
PyMySQL==1.1.1
pyodbc==5.2.0
pyOpenSSL==24.3.0
pyparsing==3.1.4
pyproject_hooks==1.2.0
pyprojroot==0.3.0
pyspark==3.5.5
pytensor==2.30.3
pytest==8.3.2
pytest-cov==6.0.0
pytest-flask==1.3.0
pytest-mock==3.14.0
pytest-reverse==1.8.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==3.2.1
python-multipart==0.0.20
python-slugify==8.0.4
pytimeparse==1.1.8
pytz==2024.1
PyYAML==6.0.2
pyzmq==26.2.0
RapidFuzz==3.10.1
rdflib==7.0.0
redis==4.6.0
referencing==0.35.1
regiontest==15.10.18
requests==2.32.3
requests-mock==1.12.1
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.8.0
rpds-py==0.20.0
rsa==4.9
ruamel.yaml==0.17.40
ruamel.yaml.clib==0.2.12
s3fs==2025.2.0
s3transfer==0.11.3
scikit-learn==1.6.1
scipy==1.11.4
seaborn==0.13.2
Send2Trash==1.8.3
sentry-sdk==2.20.0
setuptools-git==1.2
setuptools-scm==8.1.0
shapely==2.0.6
shellingham==1.5.4
ShopifyAPI==12.6.0
six==1.16.0
sniffio==1.3.1
snowflake-connector-python==3.12.1
snowplow-tracker==1.1.0
soda-core==3.5.2
soda-core-bigquery==3.5.2
soda-core-contracts==3.5.2
soda-core-duckdb==3.5.2
soda-core-postgres==3.5.2
soda-core-snowflake==3.5.2
soda-core-spark==3.5.2
soda-core-spark-df==3.5.2
soda-core-sqlserver==3.5.2
soda-core-trino==3.5.2
sortedcontainers==2.4.0
soupsieve==2.6
SQLAlchemy==2.0.33
sqlglot==26.15.0
sqlparse==0.5.3
stack-data==0.6.3
stanio==0.5.1
starlette==0.46.2
statsig==0.41.0
statsmodels==0.14.4
strictyaml==1.7.3
sympy==1.13.1
tabulate==0.9.0
tenacity==8.5.0
tensorboard==2.19.0
tensorboard-data-server==0.7.2
tensorflow==2.19.0
tensorflow-io-gcs-filesystem==0.37.1
termcolor==2.4.0
terminado==0.18.1
text-unidecode==1.3
threadpoolctl==3.5.0
thrift==0.20.0
tinycss2==1.4.0
tomli==2.0.1
tomlkit==0.13.2
toolz==0.12.1
torch==2.6.0
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
trino==0.333.0
trove-classifiers==2024.10.21.16
typeguard==4.3.0
typer==0.15.2
types-python-dateutil==2.9.0.20241206
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
ua-parser==0.18.0
ua-parser-builtins==0.18.0.post1
uri-template==1.3.0
uritemplate==4.1.1
urllib3==2.2.2
uvicorn==0.34.0
virtualenv==20.28.0
watchdog==6.0.0
wcwidth==0.2.13
webcolors==24.11.1
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.1.3
widgetsnbextension==4.0.13
wrapt==1.16.0
xarray==2025.3.1
xarray-einstats==0.7.0
xattr==1.1.0
xgboost==3.0.0
yarg==0.1.9
yarl==1.9.7
zipp==3.20.1
zstandard==0.23.0
(venv) jack@Jack-Enloe-MacBook-Pro data-contracts-prototype %
@jochenchrist
this query, from soda will not work
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-1 IS NULL THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) >= 2) THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) <= 2) THEN 1 END),
COUNT(CASE WHEN field-name-2 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-3 IS NULL THEN 1 END)
FROM [my-table-name]
in order to get soda checks to work directly using soda, I needed to have a checks.yml of the following
checks for data-sandbox-354716.hausdemo.my_table_name:
- row_count > 0
- missing_count(date) = 0
- missing_count(string_col_1) = 0
- missing_count(number_col_1) = 0
- missing_count(reporting_time) = 0
- missing_count(string_col_2) = 0
- min(number_col_1) >= 0
- max_length(string_col_1) = 2
- min_length(string_col_1) = 2
the query produced by datacontract-cl will only work when I add the project, and dataset to the table name
SELECT
COUNT(CASE WHEN date IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-1 IS NULL THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) >= 2) THEN 1 END),
COUNT(CASE WHEN NOT (field-name-1 IS NULL) AND NOT (LENGTH(country_code) <= 2) THEN 1 END),
COUNT(CASE WHEN field-name-2 IS NULL THEN 1 END),
COUNT(CASE WHEN reporting_time IS NULL THEN 1 END),
COUNT(CASE WHEN field-name-3 IS NULL THEN 1 END)
FROM data-sandbox-354716.hausdemo.[my-table-name]
the service account referenced by DATACONTRACT_BIGQUERY_ACCOUNT_INFO_JSON_PATH is a bigquery admin
For reference: The issue was related to a table name, starting with a numeric value, e.g. "1abced", which must be escaped according to https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical
Solution would be to enclose all BigQuery identifiers with backticks.