Backend connection queue
This patch allows a task to be queued when a backend reaches its `max_connections`: instead of failing immediately, the task waits on the backend for a connection to become available. The capability is off by default and must be enabled with the new parameters below.
The following parameters have been added:
- `backend_wait_timeout`: how long a task will wait for a connection (default 0.0 seconds).
- `backend_wait_limit`: the maximum number of tasks that can wait (default 0).
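For illustration, enabling the feature globally at runtime could look like this with varnishadm (the parameter names are the ones this patch adds; the values are arbitrary):

varnishadm param.set backend_wait_timeout 3.0
varnishadm param.set backend_wait_limit 10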
The two parameters can also be overridden for individual backends in VCL:
backend foo {
    .host = "bar.com";
    .wait_timeout = 3s;
    .wait_limit = 10;
}
Authored by: @drodden (with minor rearrangements)
With this change, it seems that the default value of thread_pool_stack on 32-bit systems is no longer sufficient (which is why e00029.vtc fails on ubuntu_bionic).
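If the regression is simply that the connection-wait handling enlarges worker stack usage, one mitigation to consider (a sketch, not a committed fix) would be to raise the parameter on the affected 32-bit platforms:

varnishadm param.set thread_pool_stack 80k

The 80k value is purely illustrative; the right number would come from measuring the actual stack high-water mark.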
As an initial matter, I would prefer `queue_length` and `queue_timeout`, which I think are almost self-explanatory.
I'm not sure the feature itself is a good idea, however, but having it off by default mitigates that.
FWIW this is a partial implementation of what we agreed on for VIP 31.
https://github.com/varnishcache/varnish-cache/wiki/VIP31%3A-Backend-connection-queue
There are two loose ends not covered: disembarking and health status. Disembarking fetch tasks is a large project, one we can disregard or move to its own VIP. Handling saturation of both `.max_connections` and `.wait_limit`/`backend_wait_limit` is probably reasonable to address before closing VIP 31.
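Until the double-saturation case is addressed, a full queue surfaces as an ordinary fetch failure, so operators can already fall back in VCL. A hedged sketch (hosts and numbers illustrative) that retries a failed fetch once against a second backend:

vcl 4.1;

backend primary {
    .host = "192.0.2.10";
    .max_connections = 100;
    .wait_timeout = 3s;
    .wait_limit = 10;
}

backend fallback {
    .host = "192.0.2.11";
}

sub vcl_backend_fetch {
    if (bereq.retries > 0) {
        # Second attempt: avoid the saturated primary.
        set bereq.backend = fallback;
    } else {
        set bereq.backend = primary;
    }
}

sub vcl_backend_error {
    # A full queue (or an expired wait_timeout) lands here as a failed fetch.
    if (bereq.retries == 0) {
        return (retry);
    }
}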
Thank you, this looks good to me overall.
I would still have some suggestions, but would also be OK to polish after merge, if you agree:
- `struct backend_cw` should have a magic value which we check for access from other threads. While I did suggest to move it to the stack, I am well aware of the risk of smashing foreign stacks, and magic checks can help raise a flag if that happens.
- the magic would be initialized with `INIT_OBJ()`, which would also initialize the list pointers for clarity. The code is still correct, IIUC, even with uninitialized list pointers, but I think it helps debugging a lot if they are zeroed.
- `cw_state` should be moved into `struct backend_cw`. This allows for additional assertions (e.g. in `vbe_connwait_signal_locked`, we can assert that the state is `CW_QUEUED`).
- when we dequeue, we should change the state to a new state, e.g. `CW_DEQUEUED`.
- We should add a `backend_cw_fini` function which asserts that the state is `!= CW_QUEUED` before destroying the condition variable.
- Regarding the inlining of the dequeue, I think I would actually prefer a `vbe_connwait_dequeue_locked` for clarity, because that would now also change the state.
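Pulling these suggestions together, here is a rough C sketch of how they could fit, assuming Varnish's usual miniobj/vqueue/vas conventions (`INIT_OBJ`, `CHECK_OBJ_NOTNULL`, `VTAILQ_*`, `AZ`); the names, the magic value, and the head parameter are illustrative, not the merged code:

/* Sketch only: approximates the review points above, not the actual patch. */

#include <pthread.h>

struct backend_cw;
VTAILQ_HEAD(cw_head, backend_cw);

enum cw_state {
    CW_DO_CONNECT = 0,
    CW_QUEUED,
    CW_DEQUEUED,
};

struct backend_cw {
    unsigned                 magic;
#define BACKEND_CW_MAGIC     0x53929b6c
    enum cw_state            cw_state;    /* state moved into the struct */
    pthread_cond_t           cond;
    VTAILQ_ENTRY(backend_cw) cw_list;
};

static void
backend_cw_init(struct backend_cw *cw)
{
    /* INIT_OBJ() zeroes the object, so the list pointers start out NULL */
    INIT_OBJ(cw, BACKEND_CW_MAGIC);
    AZ(pthread_cond_init(&cw->cond, NULL));
}

static void
vbe_connwait_dequeue_locked(struct cw_head *head, struct backend_cw *cw)
{
    CHECK_OBJ_NOTNULL(cw, BACKEND_CW_MAGIC);
    assert(cw->cw_state == CW_QUEUED);
    VTAILQ_REMOVE(head, cw, cw_list);
    cw->cw_state = CW_DEQUEUED;          /* dequeuing changes the state */
}

static void
backend_cw_fini(struct backend_cw *cw)
{
    CHECK_OBJ_NOTNULL(cw, BACKEND_CW_MAGIC);
    assert(cw->cw_state != CW_QUEUED);   /* must not still be queued */
    AZ(pthread_cond_destroy(&cw->cond));
}

With the state inside the struct, every code path that touches a queued object can assert its expectations while holding the lock, which is the main payoff of the suggestions.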
Rebased and addressed all review comments. Ready for a (hopefully) last review.
I have addressed most of the last review items, and mentioned the potential drawbacks of this feature in the docs, as requested during the last bugwash.
bugwash:
proposed (re)names:

global parameters:
- `backend_queue_limit`
- `backend_queue_timeout`
VCL:
backend foo {
    .queue_limit = 42;
    .queue_timeout = 1m;
}

sub vcl_backend_fetch {
    set bereq.queue_limit = 42;
    set bereq.queue_timeout = 1m;
}
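Under the proposed names, the global defaults would presumably be set the same way as the current parameters (hypothetical, since the rename is only a proposal at this point):

varnishadm param.set backend_queue_limit 42
varnishadm param.set backend_queue_timeout 60.0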
hi all! I'd really like this to get into the next release, and from what I'm reading, it's only a naming exercise from now on.
As a refresher, the current PR offers `backend_wait_timeout`/`backend_wait_limit` and the bugwash proposed `backend_queue_timeout`/`backend_queue_limit`. I feel like this isn't a big enough contention point to block a merge.
Any chance to get the original names in? I hate to bring it up, and I'll probably get slapped for it, but we have customers using the feature and consistency is pretty important; I don't want to have to translate parameter names depending on the platform people are running.
Bugwash approved. Please implement connection purge on health change to sick.
Bugwash: when a probe transitions effectively to sick, the queue should be cleared to let tasks fail immediately.
Effective transitions to sick:
- probe goes sick, not overridden by CLI (`backend.set_health healthy`)
- `backend.set_health sick` (where previously healthy from probe or CLI)
- `backend.set_health probe` (with a sick probe, where previously healthy from CLI)
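For reference, the CLI overrides referred to above take this form (the `boot.foo` backend name is illustrative):

varnishadm backend.set_health boot.foo sick
varnishadm backend.set_health boot.foo healthy
varnishadm backend.set_health boot.foo probe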
PR updated.
bugwash: merge all but the last commit; open a new PR for the flush.