ResourceModules icon indicating copy to clipboard operation
ResourceModules copied to clipboard

[Bug Report]: Dependencies pipeline fails on Disk Encryption Sets job

Open eriqua opened this issue 3 years ago • 8 comments

Describe the bug

The GH dependencies pipeline is failing on the job creating disk encryption set. The resource is leveraged by the AKS module. The job is not able to set access policies on the key vault used for the CMK. The key vault is created and set up in a previous job of the same pipeline. It looks like a timing issue, considering the disk encryption set module pipeline is succeeding.

  1. Further inspect the cause of the issue and fix.
  2. Once fixed, the job needs to be added to the ADO dependencies pipeline as well.

To reproduce

Run the GitHub dependencies workflow

Code snippet

No response

Relevant log output

Unable to access key vault resource 'https://adp-abcde-az-kv-x-001.vault.azure.net/keys/keyEncryptionKey/abcdefghijklmnopqrstuvwxyz' to enable encryption at rest. Please grant get, wrap and unwrap key permissions to disk encryption set 'adp-abcde-az-des-x-001'. Please visit https://aka.ms/keyvaultaccessssecmk for more information. (Code:KeyVaultAccessForbidden)

eriqua avatar Jul 31 '22 22:07 eriqua

After some discovery work on a clean slate environment. I observed the following behaviour:

  • Deleting the dependency DES and redeploying it again is successful. As it is able to perform the 'KV access policy deployment' once the DES deployment completes.
  • However, triggering the dependency pipeline, and when the 'key vault' job runs first. The access policy array does not have the DES access policy requirement; hence it will delete the existing one created in the previous deployment.
  • Now seems that there is an issue with DES where if you revoke the permissions and then attempt to rerun the deployment again. The DES does not function anymore, and the DES template fails, which doesn't even allow it to get to the KV access policy nested deployment.

So what is the solution options?

  1. DES dependency moves into a new Key vault that uses AAD role assignments instead of access policy, so we need to work on the DES module as we are only supporting the model that uses KV access policy. I am not sure yet if RBAC is supported for DES and KV so we need to validate how we can provide both experiences for the DES module.

  2. we move the module that uses this dependency DES into the new dependency approach to get rid of the issue. And retire this dependency problem.

  3. Modify the dependency pipeline and figure out how we can deploy the key vault access policies outside of the KV module. Although passing an empty array to the KV removes existing access policies. Probably least favourable option given the uncertainties and the complexity of introducing more into the dependency pipeline that are trying to reduce.

CC @MrMCake @eriqua @MariusStorhaug

ahmadabdalla avatar Aug 23 '22 08:08 ahmadabdalla

Thanks @ahmadabdalla for this detailed analysis!

I confirm the dependencies pipeline completes successfully after deleting the des dependency. This means that the problem does not occur the first time users will deploy the dependencies pipeline, but only on next runs. This also means that the issue won't occur once moving to the new dependencies approach. I agree then that it wouldn't be worth the effort to fix the issue since we're planning to move to the new dependencies approach anyway.

So I support option 2.

However, since the new dependency approach will not be included into the next release, I'd suggest to briefly describe the problem and related workaround in the known issues wiki page.

eriqua avatar Sep 02 '22 01:09 eriqua

Thanks @eriqua! I was also going to do some research in seeing if we can use rbac instead of access policies for a DES managed identity with Key vault and test it out. If that works, maybe we'll solve this issue https://github.com/Azure/ResourceModules/issues/1421 as well.

ahmadabdalla avatar Sep 05 '22 10:09 ahmadabdalla

Sounds good. For the upcoming release I'm going to add a note in the Known Issues section of the Wiki for users downloading the release and stil using the current dependencies approach. Opening a separated issue #2022 for that.

eriqua avatar Sep 13 '22 19:09 eriqua

The bug and related workaround have been added to the known issues wiki page as per issue #2022 . Moving this task out of release 0.7 milestone.

eriqua avatar Sep 17 '22 17:09 eriqua

Thanks @eriqua . I've uplifted the DES module yesterday to allow for KV RBAC. And I'm doing some tests. It's not looking good as far as the DES still complains about not having access to KV and can't use features like Auto rotate keys unless I explicitly click on the error on the Portal . I also couldn't find documentation that references this being a supported pattern. Can you check this comment please regarding the module uplift: https://github.com/Azure/ResourceModules/issues/1421#issuecomment-1250146883

@MrMCake fyi

ahmadabdalla avatar Sep 17 '22 21:09 ahmadabdalla

@eriqua do you think we should already remove this implementation from the dependency pipeline, now that the ComputeImage module leverages the new dependencies approach?

AlexanderSehr avatar Oct 08 '22 17:10 AlexanderSehr

@MrMCake I believe DES is only used by AKS, which is also merged. We could, although removing the entire dependencies pipeline once the conversion will be concluded would be easier than continuously updating it. I didn't test it, but I'd expect this same problem would occur with the new dependencies approach as well if we skipped the removal and redeploy, wouldn't it?

eriqua avatar Oct 10 '22 22:10 eriqua

@ahmadabdalla @AlexanderSehr, the cleanup of the old dependencies approach will be done as part of issue #2291. I removed the high-prio label from this issue, but wondering: should we close it as not planned or is there work left on this?

eriqua avatar Nov 07 '22 22:11 eriqua

@eriqua i don't believe any work needs to happen on this issue if we've remediated it via the dependency approach and the cleanup you mentioned. So I vote to close it

ahmadabdalla avatar Nov 07 '22 23:11 ahmadabdalla

Closing issue as the dependency pipeline is removed in main and will remain removed for the upcoming release

AlexanderSehr avatar Nov 27 '22 10:11 AlexanderSehr