Configuration via annotations
Resolves #44.
Rule configuration may be overridden by annotations on individual pods. For single-value rules, the configured rule value will be replaced by the annotation value. For multi-value rules, annotations will be added to the configured rule values. See Implemented Rules for available annotations.
For backwards compatibility and minimum possible side effects, I opted to load configured rules as usual. I added a RULES environment variable to load rules explicitly without a configuration. See RULES. I'm open to other ideas for how to load rules, but this was the quickest way forward. There was some discussion in https://github.com/target/pod-reaper/issues/44#issuecomment-585992088 on this.
Example:
This configuration will load the Unready rule with 5m duration and the Duration rule with no configuration. Only pods decorated with pod-reaper/max-duration will be reaped by the Duration rule.
MAX_UNREADY=5m
RULES=duration,unready
I'm looking at #45 and wondering why I didn't merge that... I think it was from lack of a code review. That change is definitely less simple, but after it's made... I think there could be a really clean way forward with this. It'd be an alternate approach (basically skipping loading all together).
Every way I look at this I see crazy edge cases, which makes me really apprehensive. But I think I'm going to take a stab at what it might look like off of #45 (I try stuff like this a lot, and most of it I just throw out!)
I like where #45 was going. I had hoped to start with it, but I had an immediate need, so I started with master. I'd be happy to refactor this.
It would be nice to merge #50 as well. I think automated end-to-end tests would really increase the confidence when merging large changes.
pushed up a container image: brianberzins/pod-reaper:alpha-annotation-0121-1
This was based off #45 and has an environment variable called HONOR_ANNOTATIONS that, when set to true (or anything https://golang.org/pkg/strconv/#ParseBool accepts as true) will attempt to honor annotations.
I have some higher level functional tests that I still need to run, but I might not be able to get to that tonight.
The HONOR_ANNOTATIONS setting is unnecessary. Any rule loaded with a default will use the default if the pod is not annotated. In other words, behavior will be exactly the same as it is today.
I think I figured out why I'm having a hard time with this one... Not from a difficulty to implement perspective, but from a product perspective: the complexity of this feature is higher than anything I every thought pod-reaper would do...
Each time I try to implement this, I end up with something that works for most cases, but there always seems to be one that doesn't fit.
- What does this change mean for someone that is managing the pod-reaper?
- What does this change mean for someone managing a pod that is within the sight of the reaper?
- What if those people are the same, different, or completely unaware of each-other?
If pod annotations aren't opt-in, then the manager of the reaper won't know (outside of logs and a lot of manual comparing) what's being overridden. This would be true for the very first use case of pod-reaper. It was originally designed to cleanup CI jobs running on a shared cluster when they were taking way to long or got stuck in an crash loop. In that case, we did not want anyone to every be able to override it.
But I think there's a more fundamental disagreement with allowing annotations to override. It just splits who's responsible for what. In some cases, the configuration is annotations, the other it's in the environment of the reaper. So when we're using annotations, there are the following things that have to considered for at least 1 (but probably more) rules:
- the configuration of the reaper
- the configuration of the pod
- which one should take precedence (because there are use cases for both)
- whether or not the pod actually meets the criteria.
I guess, in my head, this is the feature that crossed the line of "simple" to "complex" -- which brings me to another... not great point. It very much looks like I'm the only one maintaining this... and I honestly don't use it my day to day anymore. Changes in jobs has me predominantly working with teams that are behind on the kubernetes adoption train. That's leaving me feeling more and more like I don't have "skin in this game" and shouldn't be making the product level decisions anyway :(
Even though annotations are always checked, I would argue that they are still opt-in.
- Pod owner opts in by annotation a pod.
- pod-reaper owner opts in by specifying a rule in
RULESto load without default. - If neither 1 or 2, pod-reaper behavior is the same as is today.
The case where the pod and pod-reaper owners are different and the pod-reaper owner does not want to allow override is interesting. If you are using pod reaper with multiple rules activated, then an annotation could allow the pod owner to evade getting killed. In my opinion, the pod owner should get the final say, but I guess that's ultimately a decision for the product owner.
We are running our own fork of pod-reaper with the annotations already compiled in. If you want to leave this open until someone else adopts it, you're not holding us back. If the project is no longer actively being maintained, we may continue adding features in our fork without pushing them back upstream.