Obsolete control-flow branch `if server.format == "json" and server.type != "kafka":`
https://github.com/datacontract/datacontract-cli/blob/252c4e17b7ffb99e7a823d42015a2a08fa595482/datacontract/engines/data_contract_test.py#L53
check_jsonschema doesn't support Azure or GCP, and it duplicates check_soda_execute for JSON, which is already compatible with Azure, S3, etc. This doesn't respect DRY.
The JSON Schema Check is meant for complex JSON structures (arrays and nested fields). The check_soda_execute will only work on top-level fields.
Would you be interested in contributing better Azure and GCP support here?
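To illustrate the difference between a top-level check and a nested check described above, here is a minimal sketch. It is hand-rolled for illustration only: the record, `top_level_check`, and `nested_check` are hypothetical and are not datacontract-cli, Soda, or JSON Schema code.

```python
# Illustrative only: hypothetical record and hand-rolled checks,
# not the actual check_soda_execute or check_jsonschema logic.
record = {
    "order_id": "A-1",
    "items": [
        {"sku": "X1", "quantity": 2},
        {"sku": "X2", "quantity": "two"},  # wrong type, hidden inside an array
    ],
}


def top_level_check(rec):
    """Check only the types of the top-level fields."""
    return isinstance(rec.get("order_id"), str) and isinstance(rec.get("items"), list)


def nested_check(rec):
    """Additionally check every element of the nested `items` array."""
    if not top_level_check(rec):
        return False
    return all(
        isinstance(item, dict)
        and isinstance(item.get("sku"), str)
        and isinstance(item.get("quantity"), int)
        for item in rec["items"]
    )


print(top_level_check(record))  # True: the top-level types look fine
print(nested_check(record))     # False: the second item has a bad quantity
```

A check that stops at the top level accepts this record, while a check that walks into the array rejects it, which is the gap the JSON Schema check is meant to cover.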
OK, I need to check whether JSON Lines (jsonl, newline-delimited) with gzip compression is supported (mypayload.jsonl.gz). I'll come back soon. I also see there is a PR to add Azure storage account support to jsonschema_check too.
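For reference, a small fixture of that shape can be produced and read back with the Python standard library alone. This is only a sketch of the file format (one JSON object per line, gzip-compressed); the helper names and sample rows are made up, and it says nothing about whether the CLI or DuckDB accepts such a file.

```python
import gzip
import json
import os
import tempfile


def write_jsonl_gz(path, rows):
    """Write one JSON object per line into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl_gz(path):
    """Read the file back, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical sample rows, only used to build a test fixture.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
path = os.path.join(tempfile.mkdtemp(), "mypayload.jsonl.gz")
write_jsonl_gz(path, rows)
print(read_jsonl_gz(path) == rows)  # True: the round trip preserves the rows
```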
@dmaresma any update?
@jochenchrist with the current version of DuckDB (1.0.0), DuckDB connectivity on Azure fails. The following snippet:
con.sql(f"""
CREATE SECRET azure_spn (
TYPE AZURE,
PROVIDER service_principal,
TENANT_ID '{tenant_id}',
CLIENT_ID '{client_id}',
CLIENT_SECRET '{client_secret}',
ACCOUNT_NAME '{storage_account}'
);
""")
ddl_query = """CREATE VIEW "product_dim" AS SELECT * FROM read_json('abfss://landing@<azurestorageaccountname>.dfs.core.windows.net/entity=products_uat/year=2025/month=06/day=10/*.jsonl.gz');"""
con.sql(ddl_query)
con.sql("SELECT * FROM product_dim")
returns the following error:
InvalidInputException: Invalid Input Error: Secret provider 'service_principal' not found for type 'azure'
I work around the issue by manually forcing an upgrade of duckdb (without any regression).
If the duckdb version could be upgraded, then YES, the `if server.format == "json"` branch is obsolete: it only supports S3 and not Azure (there is a PR for that, but it is not approved). As long as the duckdb version stays at 1.0.0, there is no support for JSON on Azure storage accounts. https://github.com/datacontract/datacontract-cli/pull/667 should be considered.