Feature request: Request limiter
Hi,
We were investigating API gateways for Kafka and found Zilla. Thanks for the detailed examples, which get straight to the point about what Zilla can and cannot do.
We are considering using Zilla in one of our POC projects where a rate limiter is an important part.
As a feature suggestion, it would be great to have a rate limiter that works per IP or per some value in the JWT token, similar to how the "echo:stream" scope works in the https://github.com/aklivity/zilla-examples/tree/main/http.echo.jwt example.
Another good improvement would be shared state about current limits between Zilla peers, so that requests are blocked on all Zilla entry points at the same time. This could be implemented using a separate events topic in Kafka. The flow below demonstrates the idea:

                                                    [Zilla_1]
    [http client] -> [round robin load balancer] -> [Zilla_2] -> [Kafka]
                                                    [Zilla_3]
@PGoski many thanks for filing this feature request; it is definitely useful functionality to have in Zilla.
We tend to think of everything on the data path in Zilla as a stream, so for your HTTP request rate limiting scenario we would approach it as limiting the number of new streams coming out of the http binding, where each new stream represents a new request.
We also consider throughput throttling in the context of a stream, based on total bytes per unit time, which may be useful to you and others for APIs that are more streaming in nature.
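To illustrate the byte-oriented case, here is a minimal sketch of a throughput throttle as a token bucket measured in bytes. It is purely an illustration of the concept in Python, not Zilla code; the class and parameter names are made up.

```python
import time

class ByteRateThrottle:
    """Token bucket measured in bytes: on average a stream may forward at
    most rate_bytes_per_sec, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_sec: float, burst_bytes: float):
        self.rate = rate_bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, payload_len: int) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed interval, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if payload_len <= self.tokens:
            self.tokens -= payload_len
            return True
        return False  # caller would back-pressure and retry later
```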
Our first checkpoint for rate limiting functionality would be a general limit spanning all clients for a given binding, within the context of an instance of the Zilla engine.
Next we need to identify the uniquely named buckets used to measure these limits, so that the counts can be isolated and enforced separately, still within the context of an instance of the Zilla engine. Client-provided identifiers, such as the client IP address (or network mask), the client certificate subject name, or the JWT token sub claim, are all good candidates for naming the bucket.
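To make the bucket idea concrete, here is a minimal sketch of a per-bucket request limiter in Python. It is purely illustrative, not Zilla code; the fixed-window approach, class name, and parameters are all assumptions.

```python
import time
from collections import defaultdict

class KeyedRequestLimiter:
    """Fixed-window request limiter with one counter per bucket. The bucket
    key could be the client IP, certificate subject name, or JWT sub claim."""

    def __init__(self, max_requests: int, window_secs: float):
        self.max_requests = max_requests
        self.window = window_secs
        self.counts = defaultdict(int)
        self.window_start = time.monotonic()

    def allow(self, bucket_key: str) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.counts.clear()          # start a fresh window
            self.window_start = now
        if self.counts[bucket_key] < self.max_requests:
            self.counts[bucket_key] += 1
            return True
        return False                     # reject or delay this new stream

# e.g. limiter.allow(client_ip) or limiter.allow(jwt_claims["sub"])
```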
As you say, sharing the state across multiple Zilla engines in parallel is needed to enforce the limits across an auto-scaling group in the same region, or even more globally across regions. We are in the process of deciding how best to approach state sharing for these different use cases, and Kafka is definitely one of the choices under consideration.
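As one possible shape for the Kafka-based option (a rough sketch only, not a committed design): each engine instance could publish its local bucket counts to a shared topic and fold in the counts reported by its peers. The topic name, message format, and use of the kafka-python client below are assumptions for illustration.

```python
import json
from kafka import KafkaProducer, KafkaConsumer  # kafka-python, assumed available

TOPIC = "rate-limit-events"  # hypothetical shared topic

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def publish_local_count(instance_id: str, bucket_key: str, count: int) -> None:
    # Keyed by bucket so peers can aggregate usage per bucket.
    producer.send(TOPIC, key=bucket_key.encode(),
                  value={"instance": instance_id, "count": count})

def aggregate_peer_counts() -> dict:
    # Sum the most recent count reported by each instance for each bucket.
    consumer = KafkaConsumer(TOPIC, bootstrap_servers="kafka:9092",
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=1000,
                             value_deserializer=lambda v: json.loads(v.decode()))
    latest = {}   # (bucket, instance) -> most recent reported count
    for record in consumer:
        bucket = record.key.decode()
        latest[(bucket, record.value["instance"])] = record.value["count"]
    totals = {}
    for (bucket, _instance), count in latest.items():
        totals[bucket] = totals.get(bucket, 0) + count
    return totals
```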
We would love to learn about your use case. Feel free to join our Slack community to let us know more.
This is the most crucial feature to have. One common scenario is that you want to rate limit a Kafka producer. That way, you can provide a quota so that users do not overwhelm you with a high rate of requests. Kafka has very weak controls over this, so it would make Zilla a winning use case. Look, for example, at how https://www.gravitee.io/ does that; it would be nice to have it open source.
Thanks for your comment, @ialexivy.
The internals of Zilla use efficient flow control with back pressure, even across CPU cores, to smooth the data flows and eliminate unnecessary buffering at intermediate stages in the pipeline.
In the Zilla kafka cache_client binding, multiple clients producing to the same Kafka topic partition share an inbound queue with back pressure and fairness. The kafka cache_server binding then batches messages as appropriate to send to the Kafka broker elected as the topic partition leader.
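To illustrate the general queueing pattern (this is only a sketch of the idea in Python, not Zilla's actual internals): a bounded queue gives producers back pressure when it fills up, and the consumer drains it in batches before forwarding. The forward_to_broker callback is a placeholder.

```python
import queue

# Bounded queue shared by multiple producing clients; put() blocks once
# maxsize is reached, which is the back-pressure signal.
inbound = queue.Queue(maxsize=1024)

def produce(message: bytes) -> None:
    inbound.put(message)  # blocks when the queue is full -> back pressure

def drain_batches(forward_to_broker, max_batch: int = 100) -> None:
    """Drain the shared queue and forward messages in batches."""
    while True:
        batch = [inbound.get()]              # wait for at least one message
        while len(batch) < max_batch:
            try:
                batch.append(inbound.get_nowait())
            except queue.Empty:
                break
        forward_to_broker(batch)
```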
Given that flow control is built so deeply into the Zilla internals, our general approach to rate limiting would take advantage of it, managing the timing of flow control credits to honor the intended maximum rate limit.
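Sketching that idea in Python (hypothetical names, not Zilla's actual credit mechanism): if credits are granted in fixed ticks, the sender's average throughput stays bounded because it only ever holds enough window to cover the current tick.

```python
import time

def pace_credits(send_window_update, rate_bytes_per_sec: int,
                 interval_secs: float = 0.1) -> None:
    """Grant flow-control credits in fixed ticks so the sender's average
    throughput is bounded by rate_bytes_per_sec.

    send_window_update(n) is a placeholder for whatever grants the sender
    permission to transmit n more bytes."""
    credit_per_tick = int(rate_bytes_per_sec * interval_secs)
    while True:
        send_window_update(credit_per_tick)  # top up the sender's window
        time.sleep(interval_secs)            # next credits only after the tick
```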
Assuming we implemented this much as described in the comment above, we would like to understand if there are additional controls needed for your use case.