controller-runtime Add a new metric to expose the size of locally cached items

Background In controller-runtime, the default behaviour is for client.Get to read from a local cache, unless you use a read-only client or a custom cache policy, e.g., ClientDisableCacheFor or NewCache. By default that cache holds every object of the requested type, not just the one being read. Therefore, calling client.Get on a specific pod in a large cluster with many pods may lead to high memory usage.

Proposal Add a new metric that exposes the size of the locally cached items. We could then use it to analyze high memory usage caused by a misused cache policy.
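To make the proposal concrete, here is a minimal, stdlib-only sketch of what such a metric could look like. It is not controller-runtime code: the `cacheSizeGauge` type, its `OnAdd`/`OnDelete` methods, and the metric name `example_cache_items` are all hypothetical, and it assumes "size" means the number of cached objects per kind rather than bytes. A real implementation would presumably hook into the informer store and register a Prometheus gauge instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// cacheSizeGauge is a hypothetical stand-in for a Prometheus gauge vector.
// It tracks how many objects the local cache holds for each kind.
type cacheSizeGauge struct {
	mu    sync.Mutex
	items map[string]int // kind -> number of cached objects
}

func newCacheSizeGauge() *cacheSizeGauge {
	return &cacheSizeGauge{items: make(map[string]int)}
}

// OnAdd records that an object of the given kind entered the cache.
// (Assumption: it would be called from an informer-style add callback.)
func (g *cacheSizeGauge) OnAdd(kind string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.items[kind]++
}

// OnDelete records that an object of the given kind left the cache.
// Deletes for kinds that were never added are ignored.
func (g *cacheSizeGauge) OnDelete(kind string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.items[kind] > 0 {
		g.items[kind]--
	}
}

// Count returns the current number of cached objects for the given kind.
func (g *cacheSizeGauge) Count(kind string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.items[kind]
}

// Render returns the gauge in Prometheus text exposition format,
// with one sample per kind, sorted by kind for stable output.
func (g *cacheSizeGauge) Render() string {
	g.mu.Lock()
	defer g.mu.Unlock()

	kinds := make([]string, 0, len(g.items))
	for k := range g.items {
		kinds = append(kinds, k)
	}
	sort.Strings(kinds)

	var b strings.Builder
	b.WriteString("# TYPE example_cache_items gauge\n")
	for _, k := range kinds {
		fmt.Fprintf(&b, "example_cache_items{kind=%q} %d\n", k, g.items[k])
	}
	return b.String()
}

func main() {
	g := newCacheSizeGauge()
	for i := 0; i < 3; i++ {
		g.OnAdd("Pod")
	}
	g.OnAdd("ConfigMap")
	g.OnDelete("Pod")
	fmt.Print(g.Render())
}
```

With a per-kind label like this, an unexpectedly large Pod count would point straight at a controller that reads single pods through the cached client, while the number of time series stays bounded by the number of cached kinds.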

Mar 28 '25 03:03 halfcrazy

#3054

Apr 01 '25 01:04 halfcrazy

@halfcrazy, looks like this will be done by https://github.com/kubernetes-sigs/controller-runtime/issues/3202

May 09 '25 09:05 krisztianfekete

@halfcrazy, looks like this will be done by #3202

I'm afraid #3202 cannot resolve this issue. AFAIK, it stays unresolved until https://github.com/kubernetes/kubernetes/pull/129160 is merged, and we need to register the informer metric too.

May 09 '25 10:05 halfcrazy

/cc @sbueringer

May 09 '25 11:05 xigang

Yeah, we're going to wait for this to be implemented in k/k. Then we can probably pick it up (if it's safe from a metric cardinality point of view)

May 09 '25 11:05 sbueringer

Thanks, @sbueringer! Could you help review the PR https://github.com/kubernetes/kubernetes/pull/129160 ? It would be great to keep things moving forward.

May 09 '25 11:05 xigang

Huge backlog unfortunately at the moment. I'll try to get to it, but I can't promise it.

May 09 '25 12:05 sbueringer

@sbueringer Thank you.

May 09 '25 13:05 xigang

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Aug 07 '25 13:08 k8s-triage-robot

/remove-lifecycle stale

Aug 07 '25 14:08 halfcrazy

/lifecycle stale

Nov 05 '25 14:11 k8s-triage-robot

/remove-lifecycle stale

Nov 06 '25 05:11 halfcrazy