Christopher Farrenden
This would be great to have. We use cluster autoscaling to request GPU nodes on GKE on demand and scale down when there are no requests. Having one...
Hey @cristim, I've been trying out the code from feat/event-based-instance-replacement and it looks like this is not an issue. I've sent a bunch of termination events to the function and it seems...
As an aside, perhaps we could catch the error if an instance no longer exists? ``` 2019/08/14 01:47:28 Triggered by EC2 Spot Instance Interruption Warning 2019/08/14 01:47:28 spot_termination.go:38: Connection to...
@cristim I don't do golang too much but maybe something like this `spot_termination.go` ``` func (s *SpotTermination) getAsgName(instanceID *string) (string, error) { asParams := autoscaling.DescribeAutoScalingInstancesInput{ InstanceIds: []*string{instanceID}, } result, err...
This would be great for us! We used to use the `latest` tag but unfortunately it broke a while back and we needed to version lock.
For anyone who is watching this, I have built the `1.0.0` release as `cfarrend/flower:v1.0.0`. I'm basing it off the same code except with one slight Dockerfile modification. If it works...
I noticed that their README (updated 6 months ago) references a "vanilla helm chart" that doesn't use the MinIO Operator, linked [here](https://github.com/minio/minio/tree/master/helm/minio). Could this...
+1 Enabling this would mean almost every GitHub notification related to pull requests could be automatically enabled in Slack as the assignee (PR author) will be notified on comments in...
Logs before pausing ``` 2025-09-24T03:00:08.232Z {"message":"deleted node","commit":"4ff8cfe","controller":"node.termination","controllerGroup":"","controllerKind":"Node","Node":{"name":"ip-10-102-10-221.us-west-2.compute.internal"},"namespace":"","name":"ip-10-102-10-221.us-west-2.compute.internal","reconcileID":"8eac043e-da70-42c5-86c9-9d5bfe89a752","NodeClaim":{"name":"on-demand-q4sw5"}} 2025-09-24T03:00:08.806Z {"message":"deleted nodeclaim","commit":"4ff8cfe","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"on-demand-q4sw5"},"namespace":"","name":"on-demand-q4sw5","reconcileID":"ea178233-732c-4448-9267-7bed3d01ba82","provider-id":"aws:///us-west-2a/i-0e8544772577810dd","Node":{"name":"ip-10-102-10-221.us-west-2.compute.internal"}} 2025-09-24T03:00:21.717Z {"message":"disrupting node(s)","commit":"4ff8cfe","controller":"disruption","namespace":"","name":"","reconcileID":"725b1c59-d9d6-4a55-a98b-505a9148db9c","command-id":"89d6df46-da57-4799-959c-ff56a1f56fad","reason":"underutilized","decision":"replace","disrupted-node-count":1,"replacement-node-count":1,"pod-count":9,"disrupted-nodes":[{"Node":{"name":"ip-10-102-13-248.us-west-2.compute.internal"},"NodeClaim":{"name":"on-demand-hwmdn"},"capacity-type":"on-demand","instance-type":"r5a.xlarge"}],"replacement-nodes":[{"capacity-type":"on-demand","instance-types":"c6a.xlarge, c5a.xlarge, c7i-flex.xlarge, c6i.xlarge, c5.xlarge and 15 other(s)"}]} 2025-09-24T03:00:21.764Z {"message":"created nodeclaim","commit":"4ff8cfe","controller":"disruption","namespace":"","name":"","reconcileID":"725b1c59-d9d6-4a55-a98b-505a9148db9c","NodePool":{"name":"on-demand"},"NodeClaim":{"name":"on-demand-xw599"},"requests":{"cpu":"3180m","memory":"3144Mi","pods":"9"},"instance-types":"c5.xlarge, c5a.xlarge, c5ad.xlarge, c5d.xlarge, c5n.xlarge and 15 other(s)"}...
Logs after pausing ``` 2025-09-24T03:47:00.180Z {"message":"disrupting node(s)","commit":"4ff8cfe","controller":"disruption","namespace":"","name":"","reconcileID":"124825f6-dc67-4b92-a93e-5f8e2cf7d876","command-id":"f35a6d53-7090-493d-a81d-69f2273e461d","reason":"underutilized","decision":"replace","disrupted-node-count":1,"replacement-node-count":1,"pod-count":4,"disrupted-nodes":[{"Node":{"name":"ip-10-102-39-59.us-west-2.compute.internal"},"NodeClaim":{"name":"on-demand-g8lmr"},"capacity-type":"on-demand","instance-type":"c6a.4xlarge"}],"replacement-nodes":[{"capacity-type":"on-demand","instance-types":"c6a.2xlarge, c5a.2xlarge, c7i-flex.2xlarge, c5.2xlarge, c6i.2xlarge and 35 other(s)"}]} 2025-09-24T03:47:00.242Z {"message":"created nodeclaim","commit":"4ff8cfe","controller":"disruption","namespace":"","name":"","reconcileID":"124825f6-dc67-4b92-a93e-5f8e2cf7d876","NodePool":{"name":"on-demand"},"NodeClaim":{"name":"on-demand-2zjbf"},"requests":{"cpu":"5680m","memory":"10520Mi","pods":"10"},"instance-types":"c5.2xlarge, c5a.2xlarge, c5ad.2xlarge, c5d.2xlarge, c5n.2xlarge and 35 other(s)"} 2025-09-24T03:47:02.610Z {"message":"launched nodeclaim","commit":"4ff8cfe","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"on-demand-2zjbf"},"namespace":"","name":"on-demand-2zjbf","reconcileID":"2e770ca8-0bb5-470b-9203-f46db9e67365","provider-id":"aws:///us-west-2c/i-06ab2a324325df78c","instance-type":"c6a.2xlarge","zone":"us-west-2c","capacity-type":"on-demand","allocatable":{"cpu":"7910m","ephemeral-storage":"89Gi","memory":"14968076Ki","pods":"58","vpc.amazonaws.com/pod-eni":"38"}} 2025-09-24T03:47:24.439Z {"message":"registered nodeclaim","commit":"4ff8cfe","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"on-demand-2zjbf"},"namespace":"","name":"on-demand-2zjbf","reconcileID":"1684b7ad-5e9c-4e91-961a-7a7556df6085","provider-id":"aws:///us-west-2c/i-06ab2a324325df78c","Node":{"name":"ip-10-102-65-222.us-west-2.compute.internal"}}...