Add-DatabricksCluster - need to pass single_user_name parameter
On Azure you can create a standard cluster with single user for passthrough authentication.
I am not able to pass this single_user_name to Add-DatabricksCluster.
I created a dummy cluster manually using the UI and used Get-DatabricksCluster to show that field ("single_user_name") in the output below.
{
  "cluster_id": "0406-193135-bikjc80o",
  "cluster_name": "mycluster-ml",
  "spark_version": "10.4.x-cpu-ml-scala2.12",
  "spark_conf": {
    "spark.databricks.delta.preview.enabled": "true",
    "spark.databricks.passthrough.enabled": "true",
    "spark.sql.session.timeZone": "America/Chicago"
  },
  "node_type_id": "Standard_E16ds_v4",
  "driver_node_type_id": "Standard_E16ds_v4",
  "custom_tags": { "ResourceClass": "Standard" },
  "spark_env_vars": { "PYSPARK_PYTHON": "/databricks/python3/bin/python3" },
  "autotermination_minutes": 45,
  "enable_elastic_disk": true,
  "disk_spec": {},
  "cluster_source": "API",
  "single_user_name": "[email protected]",
  "enable_local_disk_encryption": false,
  "azure_attributes": {
    "first_on_demand": 1,
    "availability": "ON_DEMAND_AZURE",
    "spot_bid_max_price": -1.0
  },
  "instance_source": { "node_type_id": "Standard_E16ds_v4" },
  "driver_instance_source": { "node_type_id": "Standard_E16ds_v4" },
  "effective_spark_version": "10.4.x-cpu-ml-scala2.12",
  "state": "PENDING",
  "state_message": "Finding instances for new nodes, acquiring more instances if necessary",
  "start_time": 1649273495838,
  "last_state_loss_time": 0,
  "last_restarted_time": 1649273495838,
  "autoscale": { "min_workers": 4, "max_workers": 10 },
  "default_tags": {
    "Vendor": "Databricks",
    "Creator": "[email protected]",
    "ClusterName": "mycluster-ml",
    "ClusterId": "0406-193135-bikjc80o",
    "application": "workspace",
    "environment": "dataengineering",
    "product": "data-platform"
  },
  "creator_user_name": "[email protected]",
  "init_scripts_safe_mode": false
}
You are right, there is no dedicated parameter for this, but I also think it does not make sense to add one for every property that exists now or may be added in the future.
Alternatively, if you have those more complex cluster requirements, I would recommend defining the cluster as a dictionary first and passing it to Add-DatabricksCluster as a cluster object or via the pipeline:
$my_new_cluster = @{
"cluster_name" = "my new cluster"
"num_workers" = 2
...
"single_user_name" = "[email protected]"
}
$my_new_cluster | Add-DatabricksCluster
This should work for all properties.
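For reference, the elided example above could be filled in as a complete sketch like the one below. The property values are taken from the cluster JSON earlier in this thread and are only placeholders for illustration:

$my_new_cluster = @{
    # Sketch: define the full cluster spec as a plain hashtable, including
    # properties that have no dedicated cmdlet parameter, such as
    # "single_user_name", then pipe it to Add-DatabricksCluster.
    "cluster_name"            = "my new cluster"
    "spark_version"           = "10.4.x-cpu-ml-scala2.12"
    "node_type_id"            = "Standard_E16ds_v4"
    "num_workers"             = 2
    "autotermination_minutes" = 45
    "spark_conf"              = @{
        "spark.databricks.passthrough.enabled" = "true"
    }
    "single_user_name"        = "[email protected]"
}

$my_new_cluster | Add-DatabricksCluster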
Thank you. What you suggested works. However, I noticed that if I put [ordered] in front of the hashtable, the code hangs on line 112 of ClustersAPI.ps1, as shown below. Not sure why this happens.
$my_new_cluster = [Ordered]@{
    "cluster_name" = "my new cluster"
    "num_workers" = 2
    ...
    "single_user_name" = @.***"
}
Here is the line:
$parameters = $ClusterObject | ConvertTo-Hashtable
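One plausible explanation (a guess based on common PowerShell patterns, not verified against the module source): [ordered]@{...} produces a System.Collections.Specialized.OrderedDictionary, not a [hashtable], so a type test like $InputObject -is [hashtable] inside a recursive ConvertTo-Hashtable would not match, and the object could fall into a branch that enumerates or recurses indefinitely. A converter that tests the IDictionary interface instead would handle both cases; here is a hypothetical sketch (this function is illustrative, not the module's actual implementation):

function ConvertTo-Hashtable {
    # Hypothetical sketch: convert any dictionary-like input (both @{...}
    # and [ordered]@{...}) into a plain [hashtable] by testing the
    # IDictionary interface rather than the concrete [hashtable] type.
    param([Parameter(ValueFromPipeline)] $InputObject)
    process {
        if ($InputObject -is [System.Collections.IDictionary]) {
            $ht = @{}
            foreach ($key in $InputObject.Keys) {
                $ht[$key] = ConvertTo-Hashtable $InputObject[$key]
            }
            $ht
        }
        else {
            $InputObject   # leave non-dictionary values unchanged
        }
    }
}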
also fixed this in v1.9.9.7
is this still an issue or can we close this?
Sorry for the delay on testing this.
I still see an error when I pass it an ordered hashtable, as shown below. If you take out "[ordered]", then it works:
$my_new_cluster = [ordered]@{
    "cluster_name" = "aldrous_spark"
    "cluster_mode" = "Standard"
    #"min_workers" = 1
    #"max_workers" = 2
    num_workers = 2
    "spark_version" = "10.4.x-cpu-ml-scala2.12"
    "node_type_id" = "Standard_E16ds_v4"
    "driver_node_type_id" = "Standard_E16ds_v4"
    "autotermination_minutes" = 20
}
$my_new_cluster | Add-DatabricksCluster -Verbose
Line |
 160 |  … arameters | Add-Property -Name "cluster_name" -Value $ClusterName -Fo …
     |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | The input object cannot be bound to any parameters for the command
     | either because the command does not take pipeline input or the input
     | and its properties do not match any of the parameters that take
     | pipeline input.
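As a workaround until a fix is published, the ordered dictionary can be copied into a plain hashtable before piping. This is a sketch, not part of the module, and note that key order is lost in the copy (which does not matter for the JSON request body):

# Workaround sketch: [ordered]@{...} yields an OrderedDictionary, which
# the cmdlet's pipeline binding does not accept. Copy it into a plain
# [hashtable] first, then pipe that.
$ordered = [ordered]@{
    "cluster_name"  = "my_new_cluster"
    "num_workers"   = 2
    "spark_version" = "10.4.x-cpu-ml-scala2.12"
    "node_type_id"  = "Standard_E16ds_v4"
}

$plain = @{}
foreach ($key in $ordered.Keys) { $plain[$key] = $ordered[$key] }

$plain | Add-DatabricksCluster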
Just published v1.9.9.17, which should finally fix this issue.
I just tested this in 1.9.9.17 and it didn't work for me. Having an ordered hashtable generates an error. If you take out "[ordered]", then it works.
Here is example code:
$my_new_cluster = [ordered]@{
    "cluster_name" = "my_new_cluster"
    "num_workers" = 2
    "spark_version" = "10.4.x-cpu-ml-scala2.12"
    "node_type_id" = "Standard_E16ds_v4"
}
$my_new_cluster | Add-DatabricksCluster
Add-Property: /Users/saldroubi/.local/share/powershell/Modules/DatabricksPS/1.9.9.14/Public/ClustersAPI.ps1:160:19
Line |
 160 |  … arameters | Add-Property -Name "cluster_name" -Value $ClusterName -Fo …
     |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | The input object cannot be bound to any parameters for the command
     | either because the command does not take pipeline input or the input
     | and its properties do not match any of the parameters that take
     | pipeline input.
I just ran your exact code and it is working on my machine, both on PowerShell and PowerShell Core. Can you please check again?
Seems like I did not publish the fix with 1.9.9.17 after all; will do so with 1.9.9.18 later today.
Ok, I'll test it again next week. Thank you.