feat: rate limiting based on upstream replicas
Description: It would be incredibly beneficial if rate limits could be dynamically adjusted based on the number of replicas of the upstream service/application. This would be a powerful approach for applications that combine autoscaling through KEDA (https://keda.sh/) with rate limiting.
E.g.: the App runs as a Deployment with 3 replicas, and the rate limit for the App is 6 req/sec (e.g. per client). When I (or KEDA) change the number of replicas of the App from 3 to 6, it would be great if the rate limiting pod reflected this change and raised the limit to 12 req/sec (the default rate limit scaled with the number of deployment replicas).
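To make the arithmetic in the example explicit, here is a minimal sketch of the proportional scaling being asked for. The function name and parameters are hypothetical and are not part of Envoy Gateway or KEDA; it assumes the configured limit corresponds to a known baseline replica count and is scaled linearly from there, rounding down.

```python
def scaled_limit(base_limit: int, base_replicas: int, current_replicas: int) -> int:
    """Scale a rate limit linearly with the upstream replica count.

    Hypothetical helper: base_limit is the limit configured for
    base_replicas replicas; the result is rounded down to a whole request.
    """
    if base_limit < 0:
        raise ValueError("base_limit must be non-negative")
    if base_replicas <= 0:
        raise ValueError("base_replicas must be positive")
    if current_replicas < 0:
        raise ValueError("current_replicas must be non-negative")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (base_limit * current_replicas) // base_replicas


if __name__ == "__main__":
    # The example from this issue: 6 req/sec at 3 replicas, scaled to 6 replicas.
    print(scaled_limit(base_limit=6, base_replicas=3, current_replicas=6))
```

With the numbers from the example (6 req/sec at 3 replicas), scaling to 6 replicas gives 12 req/sec. Note that scaling to 0 replicas gives a limit of 0, so a real implementation would probably want a floor.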
@s0uky The relationship between the number of application replicas and the limit bucket seems more like application-level logic. Would it be possible to create a "controller" that modifies the limit in the BackendTrafficPolicy to reflect the application scaling?
@zhaohuabing I'm having difficulty determining the appropriate location for this controller. It seems likely that this controller could be a component of the Gateway itself. In that case, it would be responsible for defining the name of the deployment to monitor for replicas and for modifying the limit dynamically in the BackendTrafficPolicy.
the use case sounds like Local RateLimit https://gateway.envoyproxy.io/docs/tasks/traffic/local-rate-limit/
I don't think so. Local rate limit is based on the replicas of the gateway; what @s0uky wants is a limit based on the upstream's replicas.
ah thanks for the clarification
@zhaohuabing I'm having difficulty determining the appropriate location for this controller.
Hi @s0uky, you can deploy your controller alongside EG to handle this.
@zhaohuabing We have to decide whether to create this controller. If we do, can we submit a pull request to the upstream Envoy Gateway project? Or does that not make sense to you?
@s0uky While I agree that dynamically adjusting the rate limit based on the number of replicas of the upstream service/application is a valuable feature, I'm not entirely sure it should be implemented directly in Envoy Gateway. Beyond the replica count, there are also scenarios where rate limiting could be driven by other application-level metrics, such as:
- Resource scaling up (CPU, Memory, etc.)
- Request latency
- Failed request ratio
- ....
I would suggest creating a controller that runs alongside EG and adjusts the limit of the BTP based on those application metrics. @envoyproxy/gateway-maintainers please chime in.
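As a rough illustration of what the core of such a controller could look like, here is a minimal sketch of one reconcile step for the replica-count case. Everything in it is an assumption for illustration: the function names, the per-replica limit, and the floor of 1 request are hypothetical, and the field path assumes a BackendTrafficPolicy that uses a Global rate limit with the limit under `spec.rateLimit.global.rules[i].limit.requests`, which should be checked against the installed CRD version. The sketch only computes a JSON Patch (RFC 6902) document; watching the Deployment and sending the patch to the API server are left out.

```python
from typing import Optional


def desired_requests(per_replica_limit: int, replicas: int, floor: int = 1) -> int:
    """Hypothetical policy: limit grows linearly with replicas, never below `floor`."""
    return max(floor, per_replica_limit * replicas)


def reconcile(policy: dict, replicas: int, per_replica_limit: int,
              rule_index: int = 0) -> Optional[list]:
    """Return a JSON Patch that updates the limit, or None if it is already correct.

    `policy` is the BackendTrafficPolicy as a plain dict. The field layout
    used here is an assumption; verify it against your CRD.
    """
    rules = policy["spec"]["rateLimit"]["global"]["rules"]
    current = rules[rule_index]["limit"]["requests"]
    desired = desired_requests(per_replica_limit, replicas)
    if current == desired:
        return None  # nothing to do, avoid a needless write
    return [{
        "op": "replace",
        "path": f"/spec/rateLimit/global/rules/{rule_index}/limit/requests",
        "value": desired,
    }]


if __name__ == "__main__":
    # Example policy fragment with a limit of 6 req/sec (3 replicas x 2 req/sec).
    btp = {"spec": {"rateLimit": {"type": "Global", "global": {"rules": [
        {"limit": {"requests": 6, "unit": "Second"}},
    ]}}}}
    print(reconcile(btp, replicas=6, per_replica_limit=2))
```

The same reconcile shape would work for the other metrics listed above by swapping `desired_requests` for a function of latency, error ratio, and so on. Returning `None` when the limit is unchanged keeps the controller from rewriting the policy on every event.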
Agree with @zhaohuabing, this is hard to generalize within EG since it's a use-case-specific combination of dynamic rate limiting and load balancing settings based on upstream replicas.
This issue has been automatically marked as stale because it has not had activity in the last 30 days.