Use Notify instead of InternalsGuard
Potential fix for https://github.com/djc/bb8/issues/154
## Problem
InternalsGuard is problematic because it can deadlock. Using a reentrant lock is not possible because both put and drop hold a mutable reference to the internals. RefCell won't work either: making this scheme work would require two or more simultaneous mutable borrows, which RefCell rejects at runtime because that would violate Rust's aliasing guarantees.
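A minimal illustration of the RefCell point (not bb8 code): the second overlapping mutable borrow is rejected at runtime.

```rust
use std::cell::RefCell;

fn main() {
    let internals = RefCell::new(Vec::<u32>::new());
    let _held_by_put = internals.borrow_mut();
    // A second simultaneous mutable borrow, as put and drop would
    // both need here, panics with a BorrowMutError at runtime.
    let _held_by_drop = internals.borrow_mut();
}
```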
## Possible Solution
Switch to Tokio's Notify, which provides a fair queue for waiters. The pool no longer internally guarantees that connections are given to the tasks that have waited the longest, but this may be fair enough in practice, provided the Tokio scheduler and the kernel scheduler are fair as well. Starvation is still possible if a caller retries get in an infinite loop.
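A rough sketch of the pattern, using hypothetical names rather than the actual bb8 internals: tasks that find no idle connection park on notified() and retry, while put returns the connection and wakes exactly one waiter with notify_one.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use tokio::sync::Notify;

// Hypothetical, heavily simplified pool core; not bb8's real code.
struct PoolInner<C> {
    idle: Mutex<VecDeque<C>>,
    notify: Notify,
}

impl<C> PoolInner<C> {
    async fn get(&self) -> C {
        loop {
            if let Some(conn) = self.idle.lock().unwrap().pop_front() {
                return conn;
            }
            // Park until a connection is returned. Notify keeps its
            // waiters in a FIFO queue and notify_one wakes only the
            // waiter at the front, so fairness comes from Notify and
            // the schedulers rather than from the pool itself.
            self.notify.notified().await;
        }
    }

    fn put(&self, conn: C) {
        self.idle.lock().unwrap().push_back(conn);
        self.notify.notify_one();
    }
}
```

If notify_one is called while no task is waiting, Notify stores a single permit, so the next notified().await returns immediately and the loop re-checks the idle queue; that covers the race between checking the queue and starting to wait.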
Additionally, the entire make_pooled function is now timed out using the connection_timeout setting. This ensures there is no internal starvation and that is_valid() is bounded as well; without that bound, a hung validity check could starve all tasks and block the caller indefinitely while waiting for a future to resolve.
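A sketch of how that bound could look, reusing the hypothetical PoolInner above: one tokio::time::timeout wraps the whole acquisition, so every step inside it, including any validity check, shares the same budget.

```rust
use std::time::Duration;
use tokio::time::timeout;

// Hypothetical helper, not bb8's actual make_pooled signature: the
// whole acquisition path runs under one connection_timeout budget,
// so no single step can block the caller forever.
async fn get_timed<C>(pool: &PoolInner<C>, connection_timeout: Duration) -> Option<C> {
    timeout(connection_timeout, pool.get()).await.ok()
}
```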
## Open Questions
The error forwarding is not clear to me. I removed it, but I'm not entirely sure what it does; I may need some help here to understand whether I broke something.
Codecov Report
Patch coverage: 96.96% and project coverage change: +0.84% :tada:
Comparison is base (ea5c162) 72.12% compared to head (0dc8eda) 72.96%.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main     #164      +/-   ##
==========================================
+ Coverage   72.12%   72.96%   +0.84%
==========================================
  Files           6        6
  Lines         599      603       +4
==========================================
+ Hits          432      440       +8
+ Misses        167      163       -4
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| bb8/src/inner.rs | 87.38% <95.91%> (+1.06%) | :arrow_up: |
| bb8/src/internals.rs | 95.45% <100.00%> (+2.69%) | :arrow_up: |
Has there been any progress with the tests? Is this PR blocked waiting on some discussion or support?
Not really, just waiting for djc to take a look.
Sorry for taking so long to look at this. So if I understand correctly, this fix will make the pool less fair? And also, more tasks have to do work if a number of them are waiting for a connection to become available, whereas previously they would just spend that time sleeping?
Because, while the previous design with waiters made it so that earlier requesters would be sent the freed-up connection, in this design all waiters are notified and then make another attempt at getting a connection?
I don't think so. Notify is fair according to the docs and notify_one notifies only one task, not all of them.
> So if I understand correctly, this fix will make the pool less fair?
I took a look at the implementation of Notify. It uses a FIFO queue [1] [2] for tasks, so the first task to .await will be the first one to be woken up. I think that's good enough to maintain fairness, since the pool will never wake up more tasks than there are available connections. The only way I can think of for a task to skip the queue is a bug in the Tokio scheduler.
[1] https://docs.rs/tokio/latest/src/tokio/sync/notify.rs.html#985 [2] https://docs.rs/tokio/latest/src/tokio/sync/notify.rs.html#732
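A small self-contained experiment (not part of this PR) that matches the FIFO behavior described in the linked sources; on the current-thread runtime the waiters register in spawn order and are woken in that same order:

```rust
use std::sync::Arc;
use tokio::sync::Notify;

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let notify = Arc::new(Notify::new());
    let mut handles = Vec::new();

    for i in 0..3 {
        let notify = Arc::clone(&notify);
        handles.push(tokio::spawn(async move {
            notify.notified().await;
            println!("task {i} woken");
        }));
        // Yield so each task polls notified() and registers as a
        // waiter before the next one is spawned.
        tokio::task::yield_now().await;
    }

    for _ in 0..3 {
        // Each notify_one wakes the waiter at the front of the queue,
        // so this prints "task 0 woken" through "task 2 woken" in order.
        notify.notify_one();
        tokio::task::yield_now().await;
    }

    for handle in handles {
        handle.await.unwrap();
    }
}
```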
I tried to simplify this a bit, please take a look at #186.