[ISSUE] Issue with `databricks_metastore_data_access` resource. `is_default` is forcing replace on every `apply`

Open · amwill04 opened this issue 1 year ago · 6 comments

Configuration

resource "databricks_metastore_data_access" "this" {
  metastore_id = databricks_metastore.this.id
  name         = aws_iam_role.metastore_data_access.name
  aws_iam_role {
    role_arn = aws_iam_role.metastore_data_access.arn
  }
  is_default = true
}

Expected Behavior

Running `terraform apply` multiple times should not result in the `is_default` field causing the resource to be replaced.

Actual Behavior

Because the field `storage_root_credential_name` is not being returned by the API, the provider's default check always resolves to `false`, so Terraform sees `is_default` as changed and forces a replace.

Steps to Reproduce

Run `terraform apply` multiple times, even without any changes.

Terraform and provider versions

Terraform v1.9.5, Databricks provider 1.9.3

Is it a regression?

It seems to have started suddenly. However, it could potentially have been caused by changes in the SDK that this provider uses.

amwill04 · Sep 27 '24

A further note: I could not replicate this on a newly created metastore in a different AWS region. The affected metastore was created over a year ago.

amwill04 · Sep 27 '24

+1. I have the same issue. My last successful run was on 9/23, possibly with databricks v1.51.0 (I don't have logs to confirm it). When I try today, it fails with this issue, and the provider version is databricks v1.52.0, as we use the latest version in our job.

I tried restricting the version to 1.51 and 1.50, just to see whether rolling back to an older version could be a workaround, but no luck. Could you please fix this soon?

vsluc · Sep 27 '24

@vsluc simply add

  lifecycle {
    ignore_changes = [
      is_default
    ]
  }

And it will resolve it. I spent a day discovering that the `is_default` flag is somewhat meaningless in Databricks. From what I can guess, it is there for the case where you have more than one access control per metastore, though I am not really sure why you would want that. To that end I only have the one, and the above solves the issue, or is at the very least a workaround; it allowed us to deploy again. I even went through the hassle of manually creating and moving the credentials to reimport them, to no avail.
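
For clarity, here is the full resource from the issue description with the workaround applied; nothing else changes:

resource "databricks_metastore_data_access" "this" {
  metastore_id = databricks_metastore.this.id
  name         = aws_iam_role.metastore_data_access.name

  aws_iam_role {
    role_arn = aws_iam_role.metastore_data_access.arn
  }

  is_default = true

  lifecycle {
    # Exclude is_default from diffs so the missing API field no longer
    # forces a replace on every apply.
    ignore_changes = [
      is_default
    ]
  }
}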

I don't actually think the issue is with this Terraform provider. If you list all metastores on the account, each metastore includes the required field `storage_root_credential_name`. However, if you then do a GET on that specific metastore (which is what is happening here), the field is missing. Something seems to have changed within Databricks itself, but I am not sure beyond that.
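
If you want to observe the discrepancy from Terraform itself, here is a minimal sketch using the provider's `databricks_metastore` data source (which does the GET on a single metastore), assuming an account-level provider configuration; the `metastore_info[0]` attribute path is my assumption and may differ between provider versions:

data "databricks_metastore" "this" {
  metastore_id = databricks_metastore.this.id
}

output "storage_root_credential_name" {
  # Assumed attribute path; expected to come back empty for as long as
  # the GET response omits the field.
  value = data.databricks_metastore.this.metastore_info[0].storage_root_credential_name
}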

amwill04 · Sep 27 '24

Makes sense, and this is a good idea. I will try this workaround. Thanks @amwill04

vsluc · Sep 27 '24

I have the same problem on Azure.

Databricks TF provider version 1.52. There were no changes to the provider version, the TF code, or the Databricks resources themselves.

Last successful run: September 26, 2024 at 9:03:04 AM GMT+2
First failing run: September 27, 2024 at 3:00:52 AM GMT+2

However, we have "stages" on the same metastore, so we create two databricks_metastore_data_access resources (but with different names).

The first one is created when deploying to the "preprod" stage; it is NOT affected. The second one is deployed in the "prod" stage (which follows after a successful deployment to preprod), and it is the one that produces the error mentioned above.

Note: the deployment to both stages uses the exact same logic. The difference is only in a variable "stage" set to preprod or prod.

resource "databricks_metastore_data_access" "data_access_storage_credential" { provider = databricks.azure_account

depends_on = [databricks_metastore.metastore] metastore_id = databricks_metastore.metastore.id name = format("%s_default2_%s_sc", local.ms_name, var.stage) azure_managed_identity { access_connector_id = azapi_resource.access_connector[var.stage].id } is_default = true }
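
If the ignore_changes workaround from earlier in the thread applies here as well, it should just be a matter of adding the same block inside the resource above (a sketch, not yet verified on Azure):

  lifecycle {
    # Same workaround as above: stop diffing is_default entirely.
    ignore_changes = [
      is_default
    ]
  }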

HansjoergW · Sep 30 '24

We ran into this without a provider upgrade or code changes, i.e. the problem appeared without any changes (and because we use lockfiles, we know the source code was the same). My bet is that either Databricks rolled out a new API version that they thought was backwards compatible, or this provider has introduced a slight bug that makes it non-idempotent.

Terraform version: 1.8.5
Provider version: 1.52.0
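
For reference, a minimal sketch of pinning the provider version (using the version from the failing run and the registry's standard databricks/databricks source address), so that upgrades only happen deliberately:

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.52.0"
    }
  }
}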

apamildner · Oct 07 '24