envoy ext_proc: Refactor the management of sidestream response in stream mode.

A huge response from the ext_proc server (either a single large response or a large number of smaller responses in a short period) can lead to OOM.

A simple and safe solution is proposed here: leverage the HCM buffering/watermark to handle the ext_proc server response. When the response causes the high watermark to be hit, a local reply is triggered to avoid OOM. Note: the request to the ext_proc server is still streamed out.
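To make the intended behavior concrete, here is a minimal toy model in Python. It is only a sketch and not Envoy's implementation: the class `SidestreamBuffer`, its fields, and the choice to trip when the buffered size goes strictly above the limit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SidestreamBuffer:
    """Toy model of a watermark-limited buffer (hypothetical, not Envoy's API)."""

    high_watermark: int            # assumed limit, in bytes
    buffered: int = 0              # bytes currently held for the filter chain
    local_reply_sent: bool = False  # set once the limit has been exceeded

    def on_ext_proc_response(self, chunk: bytes) -> bool:
        """Buffer one chunk from the ext_proc server.

        Returns True if the chunk was accepted, False if a local reply
        was (or already had been) triggered instead.
        """
        if self.local_reply_sent:
            return False  # stream already terminated by a local reply
        self.buffered += len(chunk)
        if self.buffered > self.high_watermark:
            # Assumed behavior: fail the stream rather than keep buffering.
            self.local_reply_sent = True
            self.buffered = 0
            return False
        return True

    def on_drained(self, n: int) -> None:
        """Record that n buffered bytes were written out."""
        self.buffered = max(0, self.buffered - n)
```

In this model a slow consumer (few `on_drained` calls) combined with a fast ext_proc server ends in a local reply instead of unbounded memory growth.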

The potentially optimal solution, and the next step: upstream/downstream applies back pressure to the sidestream (the connection to the ext_proc server). This is being actively explored and developed, but it is a complex solution that demands significant effort and testing.

Feb 23 '24 05:02 tyxia

As a reminder, PRs marked as draft will not be automatically assigned reviewers, or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/32536 was opened by tyxia.

Feb 23 '24 05:02 repokitteh-read-only[bot]

/assign @htuch @yanavlasov

PTAL, Thanks!

Feb 26 '24 00:02 tyxia

As discussed last week, I'm a little worried about the lack of predictability of errors with this solution. I like the fact that this approach protects Envoy, in particular in a multi-tenant scenario. But basically arbitrary upstream slowness, which would usually trigger proper flow control, can now cause error codes.

There might be a way to salvage this though. If we can document some strong guarantees, e.g. "the ext_proc server never sends more than some fixed excess of bytes to upstream, e.g. 10% more than the bytes the client has sent and the ext_proc server has observed", then we can allow ext_proc services to reason about safe mutations that will work within existing flow control expectations and not error out.
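As a rough illustration of such a guarantee, the check could look like the Python sketch below. The function name `within_flow_control_budget`, its parameters, and the 10% default are hypothetical and taken only from the example figure above; nothing here is an existing Envoy or ext_proc API.

```python
def within_flow_control_budget(
    observed_bytes: int,
    sent_upstream_bytes: int,
    excess_ratio: float = 0.10,
) -> bool:
    """Return True if the ext_proc server stayed within its byte budget.

    observed_bytes: bytes the client sent that the ext_proc server has seen.
    sent_upstream_bytes: bytes the ext_proc server has sent toward upstream.
    excess_ratio: assumed fixed allowance on top of observed_bytes.
    """
    allowed = observed_bytes + observed_bytes * excess_ratio
    return sent_upstream_bytes <= allowed
```

An ext_proc service that keeps this predicate true for the lifetime of the stream would, under the proposed guarantee, never be the cause of the watermark-triggered local reply.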

Mar 01 '24 18:03 htuch