
Results 20 comments of mrog

> Maybe I'm missing something from the issue description, but it seems to me that what you're describing can be achieved by MHC's nodeStartupTimeout. Am I wrong? Even...
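For reference, `nodeStartupTimeout` is set on the MachineHealthCheck spec. A minimal sketch of such a manifest is below; the cluster name `mark-mgmt` is taken from the scenario described in these comments, while the MHC name, namespace, label selector, and timeout values are placeholders, not the actual configuration used here.

```yaml
# Sketch of a CAPI MachineHealthCheck (v1beta1). Names, namespace, and
# timeouts are illustrative placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: mark-mgmt-mhc
  namespace: eksa-system
spec:
  clusterName: mark-mgmt
  # If a Machine's Node never registers within this window,
  # the machine is considered failed and remediated.
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: mark-mgmt
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 5m
    - type: Ready
      status: "False"
      timeout: 5m
```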

I reproduced the issue again using unmodified versions of EKS-A and all the k8s tools. The management cluster is named `mark-mgmt`, and the workload cluster that's being created is named...

Here's another example, again using unmodified code. The setup is the same as the last one, except this time I disabled the first CP machine as it was being added...

In both scenarios, the MHC was only attached to the management cluster. In the scenario where I disabled one of the MD...

In my last comment, I said that the CAPI controller and CRDs weren't installed on the workload cluster at that point in time. That suggests that they would be installed...

Thanks for the clarification. MHC on the management cluster isn't able to connect to the workload cluster until Cilium and kube-vip are added to it, and that doesn't happen until...
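Until the MHC can reach the workload cluster, no Node ever registers for the new machines, which is exactly the case `nodeStartupTimeout` is meant to catch. Here is a simplified sketch of that decision, written for illustration only; it is not CAPI's actual implementation, and the function and parameter names are invented.

```python
from datetime import datetime, timedelta, timezone

def needs_remediation(machine_created_at, node_exists, now, node_startup_timeout):
    """Sketch of the nodeStartupTimeout check: a Machine whose Node has
    never registered is considered failed once the timeout elapses."""
    if node_exists:
        # The real controller would go on to evaluate unhealthyConditions here.
        return False
    return now - machine_created_at > node_startup_timeout

created = datetime(2023, 1, 1, tzinfo=timezone.utc)
timeout = timedelta(minutes=10)

# Node never joined (e.g. no CNI yet): remediated once 10m have passed.
print(needs_remediation(created, False, created + timedelta(minutes=15), timeout))  # True
# Node joined in time: no remediation on this path.
print(needs_remediation(created, True, created + timedelta(minutes=5), timeout))    # False
```

This is why installing the CNI late matters: machines that can never become Ready are torn down and recreated on a timer rather than waiting for connectivity.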

The conversation in https://github.com/kubernetes-sigs/cluster-api/issues/1205 suggests that remediation should be done by CAPI, and I agree with that. My own experiments showed that remediation outside of CAPI (by deleting the CAPI...

I tried another experiment. This time, I added MHC and CNI after the first CP machine was running, and before the other machines were provisioned. Then I disabled the network...

I discovered a mistake in my experiment from 2 days ago. The EKS-A changes I made weren't always working, so the CNI might not have been installed in time. I fixed...

This looks like a combination of at least two issues. I can partially fix it in EKS-A by adding the machine health checks and CNI earlier in the cluster creation...