ratelimit icon indicating copy to clipboard operation
ratelimit copied to clipboard

Memory leak when RateLimitOverride is set with uniq ip on every call to ShouldRateLimit method

Open zdmytriv opened this issue 4 years ago • 6 comments

In our service we use envoyproxy/ratelimit as sidecar and are making grpc calls to ShouldRateLimit method explicitly without envoy proxy.

Simplified config looks like

---
domain: app
descriptors:
  - key: ip
    rate_limit:
      unit: minute
      requests_per_unit: 1000

In our request we specify Limit (limit override) - https://github.com/envoyproxy/java-control-plane/blob/main/api/src/main/proto/envoy/extensions/common/ratelimit/v3/ratelimit.proto#L93 on every request which lead to this codepath getting executed - https://github.com/envoyproxy/ratelimit/blob/main/src/config/config_impl.go#L252-L261

In that if statement when unique IP is getting used on every request we are getting unique rateLimitKey and also as a result are getting new set of stats Counters this.statsManager.NewStats(rateLimitKey),. So number of stats Counters and keys always grows over time.

  • Stats should not be collected at all if USE_STATSD is set to false.
  • Stats should expire and stats cache should get cleaned up over time

UPDATE:

our simplified ratelimit request to looks like this:

{
    "domain": "app",
    "descriptors": [
        {
            "entries": {
            	"key": "our_entry_key",
            	"value": "customer1234_1.2.3.4"
            },
            "limit": {
            	"requests_per_unit": 1000,
            	"unit": "MINUTE"
            }
        }
    ],
    "hits_addend": 1
}

^ note 2 things

  • we have limit defined in each request and is different based on customer. In this case customer is customer1234
  • key is built by <customer_identifier>_<ip>

Because limit is defined this code path is getting executed https://github.com/envoyproxy/ratelimit/blob/main/src/config/config_impl.go#L252-L261 which executes this.statsManager.NewStats(rateLimitKey) which will build whole bunch of new stats counters based on key (customer1234_1.2.3.4)

zdmytriv avatar Aug 02 '21 08:08 zdmytriv

At one point we did not collect stats on a per key basis for this very reason. Has this regressed? Or do you have per-key/detailed stats enabled?

mattklein123 avatar Aug 02 '21 15:08 mattklein123

It looks like regression. There is no settings (settings.go) like ENABLE_PER_KEY_STATS but it would be great to have one.

zdmytriv avatar Aug 03 '21 11:08 zdmytriv

Hmm, now I'm confused and I don't remember the history. I was thinking about https://github.com/envoyproxy/ratelimit/issues/181 which is not yet implemented. Is this a regression from https://github.com/envoyproxy/ratelimit/pull/242? We definitely should not be making a new metric for every key/value pair and we need to figure out when that was introduced and revert it.

mattklein123 avatar Aug 03 '21 15:08 mattklein123

issue has been introduced here in https://github.com/envoyproxy/ratelimit/pull/158

GetLimits --> NewRateLimit --> newRateLimitStats

zdmytriv avatar Aug 04 '21 12:08 zdmytriv

OK, thanks for tracking down. cc @Pchelolo.

We definitely should not be creating a stat per key/value by default. @zdmytriv are you interested in fixing? "Detailed stats" should be opt-in only per https://github.com/envoyproxy/ratelimit/issues/181.

cc @ysawa0

mattklein123 avatar Aug 04 '21 15:08 mattklein123

are you interested in fixing?

I'm new to GO so probably not the best candidate

zdmytriv avatar Aug 09 '21 12:08 zdmytriv