terraform-provider-stackit Changing SKE NodePool from eu01-1 -> eu01-m fails

How to reproduce:

Create SKE-Cluster, change NodePool availability_zone eu01-1 and change it to eu01-m. You will get an error: changing the order of AvailabilityZones it not allowed

Expected behaviour:

Node Pool gets replaced

May 31 '24 10:05 roberth1988

Hey Robert,

Thank you for reporting this issue and in general for being active in doing so!

I'll be investigating this problem and will provide an update soon.

May 31 '24 15:05 DiogoFerrao

Hey @roberth1988, I have looked into it and it seems to be a case for adding a RequiresReplace modifier to the attribute.

Unless I'm missing something, I don't think we can replace the individual NodePool with the SKE API, so we would need to replace the entire cluster.

I will create a PR for this today, but if you have more information on the Node Pool replacement please reach out!

Jun 03 '24 08:06 DiogoFerrao

Hi @DiogoFerrao I don't think replacing the whole cluster is the way. Please check out for example Azure Terraform / AKS.

They have a temporary_name_for_rotation... Ref: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster

Changing certain properties of the default_node_pool is done by cycling the system node pool of the cluster. 
When cycling the system node pool, it doesn't perform cordon and drain, and it will disrupt rescheduling pods currently running on the previous system node pool.temporary_name_for_rotation must be specified when changing any of the following properties: enable_host_encryption, enable_node_public_ip, fips_enabled, kubelet_config, linux_os_config, max_pods, only_critical_addons_enabled, os_disk_size_gb, os_disk_type, os_sku, pod_subnet_id, snapshot_id, ultra_ssd_enabled, vnet_subnet_id, vm_size, zones.

So they rotate the Node-Pool. That's something you could do too ... or by default add a suffix on the node pool name(s) So you can replace them by creating first a new one ... then deleting the old one.

Replacing the cluster in such a case is a too huge interruption.

Jun 03 '24 10:06 roberth1988

You have a good point there and thank you for the suggestion. I have looked at the implementation in the Azure provider, and it seems that it will work with SKE as well.

In general, it has some issues:

It breaks volumes that may be attached to the nodes (since they are region specific)
It may break node pool name specific configuration, such as kubernetes scheduling contraints or others

I tend to agree that having this is better than throwing an error, as long as its properly documented and warnings are shown. We will investigate this further and work on a solution for the problem.

In the meantime, I think we can temporarily replace the entire cluster with the plan modifier, which at least allows users to perform this operation. Once we implement the Node Pool rotation we would remove this.

Jun 03 '24 16:06 DiogoFerrao

Hi @DiogoFerrao you break even more when replacing the whole cluster when we think about volumes. When a SKE instance is replaced, it looses also completely the knowledge / information about PVC, so you need to create them manually. Further you loose all ConfigMaps, Secrets etc. etc.

So replacing on a NodePool change the whole cluster is even more harmful in the way of how Kubernetes should work. I heavily recommend moving on to implement the rotation logic, so that's more how cloud works :)

Jun 03 '24 18:06 roberth1988

I completely agree @roberth1988, I was suggesting implementing a stop gap solution while we plan the implementation of your recomendation in our backlog.

However, you raised important concerns about this temporary approach, so we will leave it with the error for the time being and report back to this issue when we move to implement a better solution, following your suggestion.

Jun 03 '24 22:06 DiogoFerrao