Windows-based Databricks CLI does not parse JSON correctly when trying to run a notebook job
When running a notebook job in Azure Databricks with custom parameters passed from the Databricks CLI as a JSON string, the JSON fails to parse on the Windows command line, throwing an error like the one below:
C:\Users\radu.gheorghiu>databricks jobs run-now --job-id 2969 --notebook-params '{"system_id":"991", "as_of_date":"2020-05-11", "from_date":"2020-05-01", "to_date":"2020-05-07"}'
Usage: databricks jobs run-now [OPTIONS]
Try 'databricks jobs run-now -h' for help.
Error: Got unexpected extra arguments (as_of_date:2020-05-11, from_date:2020-05-01, to_date:2020-05-07}')
The same call works fine in a UNIX shell, but it fails in the Windows command line with the above error.
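For what it's worth, cmd.exe does not treat single quotes as quoting characters, so the shell splits the JSON on its spaces before the CLI ever sees it. A double-quoted form with backslash-escaped inner quotes should reach the CLI as a single argument; a sketch reusing the values from the call above:
databricks jobs run-now --job-id 2969 --notebook-params "{\"system_id\":\"991\", \"as_of_date\":\"2020-05-11\", \"from_date\":\"2020-05-01\", \"to_date\":\"2020-05-07\"}"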
I'm using the latest version of the Databricks CLI:
C:\Users\radu.gheorghiu>databricks -v
Version 0.10.0
I'm facing the same issue with databricks-cli 0.9.1 on Windows.
The example parameters from the help text, {"name": "john doe", "age": 35}, don't work.
The only value that is accepted is an empty map: {}.
Exactly, that's the only scenario that works for me as well, but that isn't how job parameters are meant to be used. As a workaround, I've been calling the REST API directly to pass parameters and run jobs until this is fixed.
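For anyone taking the same route, here is a minimal sketch of that REST call in PowerShell, assuming the Jobs API 2.0 run-now endpoint, a placeholder workspace URL, and a personal access token in the DATABRICKS_TOKEN environment variable:
# Build the request body; ConvertTo-Json produces the quoting the shell mangles.
$body = @{
    job_id = 2969
    notebook_params = @{ system_id = "991"; as_of_date = "2020-05-11" }
} | ConvertTo-Json
Invoke-RestMethod -Method Post `
    -Uri "https://<workspace-url>/api/2.0/jobs/run-now" `
    -Headers @{ Authorization = "Bearer $env:DATABRICKS_TOKEN" } `
    -Body $body -ContentType "application/json"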
Same issue here, this time when parsing a job definition containing a cron expression, which necessarily includes spaces. Reproducible example, tested in Azure Databricks:
> databricks jobs create --json '"{\"name\":\"Nightly_model_training\",\"new_cluster\":{\"spark_version\":\"7.3.x-scala2.12\",\"node_type_id\":\"Standard_DS12_v2\",\"num_workers\":1},\"libraries\":[{\"jar\":\"dbfs:/my-jar.jar\"},{\"maven\":{\"coordinates\":\"org.jsoup:jsoup:1.7.2\"}}],\"max_retries\":1,\"spark_jar_task\":{\"main_class_name\":\"com.databricks.ComputeModels\"}}"'
{
"job_id": 12
}
> databricks jobs delete --job-id 12
> databricks jobs create --json '"{\"name\":\"Nightly_model_training\",\"new_cluster\":{\"spark_version\":\"7.3.x-scala2.12\",\"node_type_id\":\"Standard_DS12_v2\",\"num_workers\":1},\"libraries\":[{\"jar\":\"dbfs:/my-jar.jar\"},{\"maven\":{\"coordinates\":\"org.jsoup:jsoup:1.7.2\"}}],\"max_retries\":1,\"schedule\":{\"quartz_cron_expression\":\"0 15 22 ? * *\",\"timezone_id\":\"America/Los_Angeles\"},\"spark_jar_task\":{\"main_class_name\":\"com.databricks.ComputeModels\"}}"'
Usage: databricks jobs create [OPTIONS]
Try 'databricks jobs create -h' for help.
Error: Got unexpected extra arguments (15 22 ? * *","timezone_id":"America/Los_Angeles"},"spark_jar_task":{"main_class_name":"com.databricks.ComputeModels"}})
Using the Anaconda Prompt with PowerShell on Windows 10. Versions:
> conda --version
conda 4.9.2
> $PSVersionTable.PSVersion
Major Minor Build Revision
----- ----- ----- --------
5 1 19041 906
> databricks -v
Version 0.14.0
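One PowerShell-side mitigation that may be worth trying is the stop-parsing token --%, which makes PowerShell pass the rest of the line to the native command verbatim, so only cmd-style escaping applies. A sketch, with the JSON trimmed to the fields relevant to the quoting problem (a real job would also need the cluster and task fields from the full example above):
> databricks --% jobs create --json "{\"name\":\"Nightly_model_training\",\"schedule\":{\"quartz_cron_expression\":\"0 15 22 ? * *\",\"timezone_id\":\"America/Los_Angeles\"}}"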
My workaround is to use WSL together with PowerShell Core, and that works. (The deployment script has some extra steps in PowerShell.)
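For reference, the original single-quoted form works unchanged from a bash prompt inside WSL, assuming the Databricks CLI is also installed in the distribution (values reused from the first example):
$ databricks jobs run-now --job-id 2969 --notebook-params '{"system_id":"991", "as_of_date":"2020-05-11"}'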
I had the same symptoms. In my case the fix was to save the JSON to a file with ANSI (or UTF-8) encoding and then run "databricks jobs create" on the resulting file (with edits, see below). Oddly, the file was already UTF-8 when the error occurred, which should have worked but did not; re-saving with an explicit encoding may strip out whatever the CLI objects to.
In PowerShell:
# Fetch the existing job definition.
$settings = databricks jobs get --job-id 123456
# Re-save it with an explicit encoding.
$settings | Out-File -Encoding ASCII <filename>.json
# Note: you have to edit the output file by hand to remove the "job_id" field and the "settings" wrapper, keeping only the wrapper's contents at the top level.
databricks jobs create --json-file <filename>.json
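The hand edit can also be scripted; a minimal sketch, assuming the Jobs API 2.0 response shape in which the reusable definition sits under a top-level "settings" key (same placeholder job ID and filename as above):
# Out-String joins the CLI's multi-line output so ConvertFrom-Json sees one string.
$job = databricks jobs get --job-id 123456 | Out-String | ConvertFrom-Json
# Keeping only the inner definition drops "job_id" and the "settings" wrapper in one step.
$job.settings | ConvertTo-Json -Depth 10 | Out-File -Encoding ASCII <filename>.json
databricks jobs create --json-file <filename>.json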
Hope this helps someone!