varnish-cache icon indicating copy to clipboard operation
varnish-cache copied to clipboard

Magic 0.5 second HTC_RxStuff() timeout in h2_rxframe()

Open nigoroll opened this issue 11 months ago • 4 comments

There is a seemingly magic 0.5 seconds timeout here, added in 41ef373af53571a94ea8f73f0538322270799a84. @dridi can you please help my understand why it has this particular value?

https://github.com/varnishcache/varnish-cache/blob/37a8bebf0f7fb0a68af9669af57ff094b8114ee0/bin/varnishd/http2/cache_http2_proto.c#L1480-L1481

edit: I had been asked why I said this would be "busy looping" (waking up every 0.5 seconds), here's an (accidental) exhibit: https://github.com/varnishcache/varnish-cache/pull/4284#issuecomment-2694935012 I really think we should calculate the proper timeout and use that, rather than "opportunistically" return from the receive just to check timeouts.

nigoroll avatar Feb 21 '25 08:02 nigoroll

bugwash: @bsdphk asked if this timeout should not be rather passed as ti instead of tn ? Pinging @dridi for when he is back

walid-git avatar Feb 24 '25 15:02 walid-git

Apologies for not replying earlier, I meant to reply something before being away for a while.

The reason for this arbitrary 500ms timeout is h2_window_timeout. When a stream is waiting for window credits, we need to eventually trigger a timeout. Unfortunately, when the h2_sess thread when it is stuck in a blocking read, it can no longer coordinate with h2_req threads.

The fact that one h2_sess thread is in charge of reads while h2_req threads are in charge of writes complicates the picture. Overall, the current mix of IO and locking composes poorly.

We are currently evaluating different design ideas to improve the dynamics between h2 sessions and streams.

dridi avatar Mar 05 '25 09:03 dridi

We are currently evaluating different design ideas to improve the dynamics between h2 sessions and streams.

I think we should use the new request disembark and handle all h2 read and write in a single thread.

nigoroll avatar Mar 05 '25 16:03 nigoroll

Irrespective of what I think a proper solution is, I would like to understand what you are saying, @dridi

The reason for this arbitrary 500ms timeout is h2_window_timeout. When a stream is waiting for window credits, we need to eventually trigger a timeout.

Can't this be done in a better way even with the current model? Shouldn't the stream thread implement the timeout? Shouldn't the session thread use as a timeout the minimum of (remaining) window timeouts and the session timeout?

nigoroll avatar Mar 05 '25 16:03 nigoroll