Connection: support timeout to close connections stuck above buffer high watermark
Commit Message: Add timeout guard for connections stuck above buffer high watermark
- extend listener and cluster APIs with
per_connection_buffer_high_watermark_timeout, allowing operators to cap how long downstream or upstream connections can remain at/above their configured buffer high watermark - close connections when their buffers stay at/above the watermark past the configured timeout
Background: In our large (internal) multi-tenant use case where Envoy runs as a host-agent shared by all the pods on the host, a buggy pods can accept connections but fail to drain them, leaving Envoy buffers full until overload-manager kicks in or OOMs. This change clamps such stalled connections once they remain fully buffered for the configured duration.
Additional Description: N/A Risk Level: Low Testing: Unit tests Docs Changes: N/A Release Notes: N/A Platform-Specific Features: N/A
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).
Assigning @botengyao for senior-maintainer review /assign @botengyao
/lgtm api