Exponential backoff and dial retrying
This is a revival of the previous PR https://github.com/bradfitz/gomemcache/pull/54
Previous PR was outdated and requires a rebase on the latest code so it can be cleanly merged
discussed with @QuChen88 elsewhere - they're going to fix up the change more and re-submit.
Re-submitted the commit with only the exponential backoff and dial retrying change, and without the connection pooling change.
@QuChen88 Thanks! Existing tests pass.
Any chance you could add specific tests to verify this feature?
Almost forgot: #115 - this looks like a similar issue with a different solution, a circuit breaker instead of a manual backoff. @QuChen88 did you see this other issue perhaps and have any thoughts on this?
The change they made pulls in a CB repo and seems more intrusive. I'm not sure offhand which approach is cleanest.
It is not easy to add a reliable test for this as this is mainly about retry logic on connection failure.
For #115, looks like it is related to the PR for custom dialer which was already merged back in 2023 (See #158). And it appears to be trying to solve a similar problem of not aggressively retry when there are issues with the server.
I personally prefer to retry from outside of dial() with backoff logic.
Can you test against a non-existent server and ensure it can't reconnect instantly? that shouldn't be too hard.
If it is we might have to fix some other things first (looks like it's eating errors in certain conditions)