Add feature gate for StorageNamespaceIndex in v1.30

Open tengqm opened this issue 1 year ago • 3 comments

tengqm avatar Jul 24 '24 11:07 tengqm

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please ask for approval from tengqm. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Jul 24 '24 11:07 k8s-ci-robot

Pull request preview available for checking

Name Link
Latest commit 2d5cd443c531e778f0ebedc78d8c61c7a6141cdf
Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/66a0e6e80852360008fd3ac8
Deploy Preview https://deploy-preview-47259--kubernetes-io-main-staging.netlify.app

To edit notification comments on pull requests, go to your Netlify site configuration.

netlify[bot] avatar Jul 24 '24 11:07 netlify[bot]

/sig api-machinery

cc: @ahutsunshine @kubernetes/sig-api-machinery-misc

Can we please get some help contributing some detail to this feature gate doc? We appreciate it, thanks!

drewhagen avatar Jul 25 '24 16:07 drewhagen

@drewhagen I'm sorry for the late response. My account @ahutsunshine has some issues, so I registered the @ahutsunshine1 account to answer this question. We previously had a long discussion and ran a benchmark for the namespace indexer feature. The benchmark showed excellent performance in terms of latency, CPU, and memory in large clusters (for example, with 100k+ pods) when listing pods within a single namespace using namespace indexers. This implementation significantly reduces CPU and memory consumption, especially when the namespace contains only a small number of pods.

Discussion:

  • https://github.com/kubernetes/kubernetes/issues/120778

Benchmark doc:

  • https://docs.google.com/document/d/1bTxGqlH0tqrpH16NxC2SIUTGMN3_1zNKRiRdIW9Y5VU/edit?usp=sharing

We, the eBay Cloud team, have already applied the namespace indexer feature in production, and it performs well.

ahutsunshine1 avatar Jul 31 '24 12:07 ahutsunshine1

@ahutsunshine1: Reiterating the mentions to trigger a notification: @kubernetes/sig-api-machinery-misc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Jul 31 '24 12:07 k8s-ci-robot

Can you explain what enabling the feature gate means @ahutsunshine1 ?

You're explaining this to an audience that has never looked at one line of Kubernetes code, doesn't know Go syntax, and who aren't familiar with the performance challenges from not applying the fix.

sftim avatar Jul 31 '24 12:07 sftim

@sftim Enabling the feature gate benefits namespace-scoped LIST requests for pods, such as /api/v1/namespaces/default/pods?resourceVersion=0. In short:

  1. Before enabling, /api/v1/namespaces/default/pods?resourceVersion=0 searches through all the pods from all namespaces in the cache and then filters the results by the namespace default. In large clusters with 100k+ pods, a high rate of such LIST pods requests consumes a lot of CPU and memory, even when the pod count in the default namespace is small.
  2. After enabling, /api/v1/namespaces/default/pods?resourceVersion=0 gets pods directly from the namespace indexer for default. This avoids the unnecessary filtering, which reduces CPU and memory usage.
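
To make the two cases above concrete, here is a minimal Go sketch of the idea. It is only an illustration and not the kube-apiserver code: the Pod type and the functions listByScan, buildNamespaceIndex, and listByIndex are hypothetical names invented for this example, and the real cache holds full API objects rather than two strings.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a cached pod object.
type Pod struct {
	Namespace string
	Name      string
}

// listByScan models case 1 (gate disabled): visit every cached pod from every
// namespace and keep only the matches. The work grows with the total number
// of pods in the cluster, not with the size of the requested namespace.
func listByScan(cache []Pod, namespace string) []Pod {
	var out []Pod
	for _, p := range cache {
		if p.Namespace == namespace {
			out = append(out, p)
		}
	}
	return out
}

// buildNamespaceIndex groups the cached pods by namespace once, up front.
func buildNamespaceIndex(cache []Pod) map[string][]Pod {
	index := make(map[string][]Pod)
	for _, p := range cache {
		index[p.Namespace] = append(index[p.Namespace], p)
	}
	return index
}

// listByIndex models case 2 (gate enabled): a single lookup returns only the
// pods of the requested namespace, without touching any other namespace.
func listByIndex(index map[string][]Pod, namespace string) []Pod {
	return index[namespace]
}

func main() {
	cache := []Pod{
		{"default", "web-0"},
		{"kube-system", "coredns-0"},
		{"default", "web-1"},
		{"batch", "job-0"},
	}
	index := buildNamespaceIndex(cache)

	// Both paths return the same pods; only the amount of work differs.
	fmt.Println(len(listByScan(cache, "default")), len(listByIndex(index, "default")))
}
```

Both functions return the same result. The difference is that the scan walks the whole cache on every request, while the index lookup only touches the pods of the one namespace that was asked for.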

If you want to know more, I recommend reading the benchmark doc for the details: https://docs.google.com/document/d/1bTxGqlH0tqrpH16NxC2SIUTGMN3_1zNKRiRdIW9Y5VU/edit. I'm not sure whether I have explained this properly. Feel free to leave more comments.

ahutsunshine1 avatar Jul 31 '24 12:07 ahutsunshine1

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 29 '24 13:10 k8s-triage-robot