Add retry
Hey! Occasionally, our login step fails due to rate-limiting or the system we're logging into being unavailable. It would be super useful to have a retry flag of some kind. Based on similar functionality for other actions in the space, imagining it might look like:
- name: Login
uses: docker/login-action@v3
with:
registry: ...
username: ...
password: ...
retries: 3
+1
This would be fantastic. I have actions occasionally fail with Client.Timeout exceeded while awaiting headers and just rerunning the step does the trick in those instances.
same! our Harbor registry times out ALL the time and I have to go manually re-trigger the workflow
our login step fails due to rate-limiting
Hammering a registry of requests because it's unavailable is not a short or long term solution. Registries often implement rate limiting to prevent abuse and ensure fair usage among all users. If a login attempt fails due to rate limiting, retrying the login multiple times in quick succession can exacerbate the problem. This can lead to further rate limiting, potentially locking out the user for a longer period or affecting other users who are trying to access the registry. If the registry is temporarily unavailable due to maintenance or an outage, retrying the login multiple times will not resolve the issue. This can result in unnecessary load on the registry once it becomes available again, as multiple clients may attempt to reconnect simultaneously.
While retry logic can be useful, it's important to implement it thoughtfully to avoid exacerbating issues like rate limiting and system unavailability. Using smth like exponential backoff, circuit breaker patterns could be a solution.
I would consider to add more status code for the retries as an optional property in the action. There are situations where a 401 is returned from a custom registry as a security mechanism to avoid giving hints and you may want to retry those as well (of course with a backoff retry policy in place...). Maybe not the default but overridable would be great
hi! here is my second approach to that issue with more dedicated time rather than 30 minutes in the train I had for the first attempt :)
@crazy-max can you take a look https://github.com/docker/login-action/pull/849 and let me know what you think about that approach?
@crazy-max, this would be very beneficial to have. Is there anything we can help do to the PR from @fedordikarev merged?
Please add the retry! It's so annoying when you having a con job, triggered at night, failing because of this!
@crazy-max Could we get some eyes on: https://github.com/docker/login-action/pull/849 please?