Limit Safemode List Digest
What does this PR do?
- [ ] Adds new functionality
- [ ] Alters existing functionality
- [X] Fixes a bug
- [ ] Improves documentation or testing
Please briefly describe your changes as well as the motivation behind them:
- We found that the safemode feature, which checks cluster information to determine the blast radius, was using a lot of memory and causing pods to be OOM-killed. This PR is a potential fix: it limits how many objects are digested at a time, so that the full object list should no longer fill up memory.
Regarding the branch name containing etcd: the original idea was to use etcd for the object count lookup, but that fell through. We now use the limiting functionality of the client API instead.
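For reviewers less familiar with limit-based listing, here is a minimal sketch of the general idea, not the code in this PR. It counts objects one page at a time so that only a single page is held in memory. The `page` and `listFunc` types and the `fakeList` helper are hypothetical stand-ins for the real client call. In client-go, the corresponding knobs are the `Limit` and `Continue` fields of `metav1.ListOptions`.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// page is a hypothetical stand-in for one chunk of a list response.
type page struct {
	Items    []string // object names in this chunk
	Continue string   // token for the next chunk; empty on the last chunk
}

// listFunc is a hypothetical stand-in for a client list call that
// honours a page size limit and a continue token.
type listFunc func(limit int64, continueToken string) (page, error)

// countInPages walks the list in chunks of at most limit items and keeps
// only a running total, so no more than one chunk is held at a time.
func countInPages(list listFunc, limit int64) (int, error) {
	if limit <= 0 {
		return 0, errors.New("limit must be positive")
	}
	total := 0
	token := ""
	for {
		p, err := list(limit, token)
		if err != nil {
			return 0, fmt.Errorf("listing page: %w", err)
		}
		total += len(p.Items)
		if p.Continue == "" {
			return total, nil
		}
		token = p.Continue
	}
}

// fakeList serves names from memory and uses the next offset as the
// continue token. It exists only to make the sketch runnable.
func fakeList(names []string) listFunc {
	return func(limit int64, continueToken string) (page, error) {
		start := 0
		if continueToken != "" {
			n, err := strconv.Atoi(continueToken)
			if err != nil || n < 0 || n > len(names) {
				return page{}, fmt.Errorf("invalid continue token %q", continueToken)
			}
			start = n
		}
		end := start + int(limit)
		if end >= len(names) {
			return page{Items: names[start:]}, nil
		}
		return page{Items: names[start:end], Continue: strconv.Itoa(end)}, nil
	}
}

// makeNames builds n placeholder object names.
func makeNames(n int) []string {
	names := make([]string, 0, n)
	for i := 0; i < n; i++ {
		names = append(names, fmt.Sprintf("pod-%d", i))
	}
	return names
}

func main() {
	total, err := countInPages(fakeList(makeNames(25)), 10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(total) // prints 25
}
```

The trade-off is more round trips to the API server in exchange for a bounded amount of memory per call. The page size of 10 above is arbitrary and chosen only for illustration.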
Code Quality Checklist
- [X] The documentation is up to date.
- [X] My code is sufficiently commented and passes continuous integration checks.
- [X] I have signed my commit (see Contributing Docs).
Testing
- When this bug was first observed, the injector pods were OOMing or timing out. I tested this branch in an environment with 10k+ pods and saw no OOMs or timeouts.
I had hoped there was a way to query the API for just the total count rather than a full list, but that seems to be lacking.
From my understanding and research, there is a way to query for the count, but it requires specific configuration of the Kubernetes environment, which not everyone may have. Those who do have it configured may also have it locked behind administrator permissions. So it seems like a safe bet to use the generic client, which should be generally accessible.