gomemcache icon indicating copy to clipboard operation
gomemcache copied to clipboard

Exponential backoff and dial retrying

Open QuChen88 opened this issue 10 months ago • 6 comments

This is a revival of the previous PR https://github.com/bradfitz/gomemcache/pull/54

Previous PR was outdated and requires a rebase on the latest code so it can be cleanly merged

QuChen88 avatar Mar 27 '25 06:03 QuChen88

discussed with @QuChen88 elsewhere - they're going to fix up the change more and re-submit.

dormando avatar Apr 04 '25 00:04 dormando

Re-submitted the commit with only the exponential backoff and dial retrying change, and without the connection pooling change.

QuChen88 avatar Apr 04 '25 00:04 QuChen88

@QuChen88 Thanks! Existing tests pass.

Any chance you could add specific tests to verify this feature?

dormando avatar Apr 04 '25 01:04 dormando

Almost forgot: #115 - this looks like a similar issue with a different solution, a circuit breaker instead of a manual backoff. @QuChen88 did you see this other issue perhaps and have any thoughts on this?

The change they made pulls in a CB repo and seems more intrusive. I'm not sure offhand which approach is cleanest.

dormando avatar Apr 04 '25 01:04 dormando

It is not easy to add a reliable test for this as this is mainly about retry logic on connection failure.

For #115, looks like it is related to the PR for custom dialer which was already merged back in 2023 (See #158). And it appears to be trying to solve a similar problem of not aggressively retry when there are issues with the server.

I personally prefer to retry from outside of dial() with backoff logic.

QuChen88 avatar Apr 04 '25 05:04 QuChen88

Can you test against a non-existent server and ensure it can't reconnect instantly? that shouldn't be too hard.

If it is we might have to fix some other things first (looks like it's eating errors in certain conditions)

dormando avatar Apr 04 '25 05:04 dormando