Destroy job fails for mushop stack

Open ddevadat opened this issue 3 years ago • 7 comments

I deployed the complete MuShop stack using https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-quickstart/oci-cloudnative/releases/latest/download/mushop-stack-latest.zip

A few things I noticed: once the stack is applied successfully, running a plan operation gives errors like

Error: Get "http://localhost/apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/rolebindings/cluster-autoscaler": dial tcp [::1]:80: connect: connection refused

Then I tried to destroy the stack with the refresh option disabled, but the job failed with the error

oci_objectstorage_bucket.mushop_catalogue_bucket[0]: Destruction complete after 2s
oci_identity_group.oci_service_user[0]: Destruction complete after 4s
helm_release.cert_manager[0]: Destruction complete after 2s
Error: Failed to delete Job! API error: jobs.batch "wallet-extractor-job" not found

I downloaded the Terraform state file, removed the entry for wallet-extractor-job, imported the state file back, and then re-ran the destroy job.
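For anyone repeating the manual state edit described above, here is a minimal sketch of it in Python. It assumes the downloaded file uses Terraform state format version 4 (a top-level `resources` list and a `serial` counter) and that the job is tracked as a `kubernetes_job` resource whose Kubernetes name sits under `attributes.metadata[0].name`. The function name `drop_job_from_state` and the trimmed `example_state` are made up for illustration and are not taken from the MuShop stack.

```python
import json


def drop_job_from_state(state: dict, k8s_job_name: str) -> dict:
    """Return a copy of a Terraform state dict without the kubernetes_job
    resource whose Kubernetes metadata name equals k8s_job_name."""

    def is_target(resource: dict) -> bool:
        # Assumption: the job is stored as a "kubernetes_job" resource.
        if resource.get("type") != "kubernetes_job":
            return False
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            metadata = attributes.get("metadata") or [{}]
            if metadata[0].get("name") == k8s_job_name:
                return True
        return False

    original = state.get("resources", [])
    kept = [r for r in original if not is_target(r)]
    new_state = dict(state, resources=kept)
    if len(kept) != len(original):
        # Bump the serial so the edited state counts as a newer revision.
        new_state["serial"] = state.get("serial", 0) + 1
    return new_state


# Hypothetical, heavily trimmed state used only to exercise the function.
example_state = {
    "version": 4,
    "serial": 7,
    "resources": [
        {
            "mode": "managed",
            "type": "kubernetes_job",
            "name": "wallet_extractor_job",
            "instances": [
                {"attributes": {"metadata": [{"name": "wallet-extractor-job"}]}}
            ],
        },
        {
            "mode": "managed",
            "type": "helm_release",
            "name": "cert_manager",
            "instances": [{"attributes": {"name": "cert-manager"}}],
        },
    ],
}

cleaned = drop_job_from_state(example_state, "wallet-extractor-job")
print(json.dumps([r["type"] for r in cleaned["resources"]]))  # ["helm_release"]
```

With a local Terraform setup, `terraform state rm <resource address>` removes a resource from the state without hand-editing the file, which is usually the safer route when it is available.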

Is there a better way to do this, or did I miss something?

ddevadat avatar Feb 22 '22 13:02 ddevadat

That's an issue with the Terraform Kubernetes Provider working with the Terraform OCI provider on the OCI Resource Manager.

You need to set refresh=false.

If using local Terraform, you can destroy by using terraform destroy -refresh=false.

If using OCI Resource Manager, when destroying, open "Show Advanced Options" and uncheck the refresh resources option, as shown in the image below: [image]

junior avatar Feb 24 '22 17:02 junior

I did that by disabling refresh, but the destroy still fails:

oci_objectstorage_bucket.mushop_catalogue_bucket[0]: Destruction complete after 2s
oci_identity_group.oci_service_user[0]: Destruction complete after 4s
helm_release.cert_manager[0]: Destruction complete after 2s
Error: Failed to delete Job! API error: jobs.batch "wallet-extractor-job" not found

ddevadat avatar Feb 25 '22 04:02 ddevadat

On newer versions of Kubernetes, the job is finally removed once it finishes, as expected. But because Terraform does not refresh, it never picks up that change. This probably only happens with Kubernetes 1.21+, where the TTL for finished jobs is out of alpha.

Because of that, I need to update the TF scripts to handle the lifecycle. This will be in a new release.

junior avatar Feb 25 '22 18:02 junior

@ddevadat The latest stack has improvements for this. Did you test it?

junior avatar Mar 01 '22 00:03 junior

I did use the latest one from https://github.com/oracle-quickstart/oci-cloudnative/releases but that was a week ago.

Is this the latest one: https://github.com/oracle-quickstart/oci-cloudnative/releases/latest/download/mushop-stack-v3.1.1.zip ?

ddevadat avatar Mar 01 '22 08:03 ddevadat

Yes, that's the latest one. No issues destroying (with refresh set to false).

junior avatar Mar 01 '22 16:03 junior

Got the same issue with https://github.com/oracle-quickstart/oci-cloudnative/releases/latest/download/mushop-stack-v3.1.1.zip

Error: Failed to delete Job! API error: jobs.batch "wallet-extractor-job" not found

I believe this job no longer exists once it has done its work, but it is still in the state file. If refresh is disabled during destroy, in theory Terraform will still try to delete this job because it is there in the state file.
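The theory above can be illustrated with a toy model. This is not how Terraform or the Kubernetes provider is implemented; `destroy`, `delete_job`, and `NotFound` are invented names, the cluster is just a Python set, and the state is a list of job names. Note also that in this thread refresh itself failed with the localhost connection error, so the model only shows the state mismatch, not a workaround.

```python
class NotFound(Exception):
    """Stands in for the API error returned when an object is missing."""


def delete_job(cluster: set, name: str) -> None:
    # Deleting something that is already gone is reported as an error.
    if name not in cluster:
        raise NotFound(f'jobs.batch "{name}" not found')
    cluster.remove(name)


def destroy(state: list, cluster: set, refresh: bool) -> list:
    """Delete every job recorded in state; return what is left in state."""
    pending = list(state)
    if refresh:
        # A refresh drops state entries whose real object no longer exists.
        pending = [name for name in pending if name in cluster]
    remaining = list(pending)
    for name in pending:
        delete_job(cluster, name)
        remaining.remove(name)
    return remaining


# The job already finished and was cleaned up, so the cluster is empty,
# but the state still remembers it.
state = ["wallet-extractor-job"]

try:
    destroy(state, set(), refresh=False)
    outcome = "destroyed"
except NotFound as err:
    outcome = f"Error: {err}"

print(outcome)  # Error: jobs.batch "wallet-extractor-job" not found
print(destroy(state, set(), refresh=True))  # []
```

In this model the stale entry is only a problem when refresh is skipped, which matches the behavior reported in this issue.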

ddevadat avatar Mar 03 '22 03:03 ddevadat

Issue fixed with Stack version 3.2.0. All updated and tested up to OKE 1.29.

junior avatar Mar 31 '24 04:03 junior