🔹 Jira Ticket(s)

SDK-199

✏️ Description

Fixes auth retry for expired JWT refresh: scheduled token refresh no longer bypasses pauseAuthRetries or authRetryPolicy.maxRetry, preventing runaway/infinite onAuthTokenRequested loops.

Added unit tests to lock pause + maxRetry behavior on scheduled refresh, and reset retry count after a successful refresh is queued.

Dec 15 '25 17:12 joaodordio

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 69.48%. Comparing base (f896dc7) to head (4a72ae5).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #987      +/-   ##
==========================================
+ Coverage   69.45%   69.48%   +0.03%     
==========================================
  Files         109      109              
  Lines        8944     8953       +9     
==========================================
+ Hits         6212     6221       +9     
  Misses       2732     2732

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:

:snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Dec 15 '25 17:12 codecov[bot]

Pull Request Review Comment

Please Consider Refactoring to Option 1 from the Ticket

The current fix works, but I strongly recommend simplifying to the Option 1 approach that was outlined in the ticket description. Here's why:

The Core Problem

This bug exists because shouldIgnoreRetryPolicy: true bypasses safety limits (pauseAuthRetry and maxRetry). The ticket correctly identifies these as critical controls that prevent infinite loops and battery drain.

The question is: should we ever allow these safety limits to be bypassed?

Current Implementation: Selective Bypass

Your fix introduces a flag isInScheduledRefreshCallback to detect when we're in a scheduled refresh, then enforces safety limits only in that specific case:

private func shouldPauseRetry(_ shouldIgnoreRetryPolicy: Bool) -> Bool {
    if pauseAuthRetry {
        return true
    }
    
    // Only enforces maxRetry for scheduled refreshes
    if isInScheduledRefreshCallback {
        return retryCount >= authRetryPolicy.maxRetry
    }
    
    // Still allows bypass in other cases
    return retryCount >= authRetryPolicy.maxRetry && !shouldIgnoreRetryPolicy
}

This means: Any code path that calls requestNewAuthToken(shouldIgnoreRetryPolicy: true) outside of scheduled refreshes can still bypass maxRetry. You've fixed the symptom (scheduled refresh infinite loop) but left the backdoor open.

Option 1: Always Enforce Safety Limits

private func shouldPauseRetry(_ shouldIgnoreRetryPolicy: Bool) -> Bool {
    if pauseAuthRetry {
        return true
    }
    
    if retryCount >= authRetryPolicy.maxRetry {
        return true
    }
    
    return false
}

This means: No code path can bypass safety limits, ever. Period.

Why This Is Better

1. Prevents ALL infinite loops, not just scheduled refresh loops

If tomorrow someone adds code that calls requestNewAuthToken(shouldIgnoreRetryPolicy: true) in a different context (error handler, manual retry, etc.), your current fix won't protect them. Option 1 would.

2. No state management complexity

Your approach requires:

A new instance variable
Setting the flag before the call
Clearing the flag after the call
Future developers understanding this flag exists and what it means

Option 1 requires: none of that.

3. Thread-safe by default

Flag management patterns can be racy if methods get called from multiple threads. Option 1 has no such risk.

4. Matches the stated intent

The ticket says: "pauseAuthRetry and maxRetry are safety controls that should ALWAYS be respected."

Your implementation respects them sometimes. Option 1 respects them always.

5. Fewer lines of code

Current: 12 lines in shouldPauseRetry() + flag declaration + flag management
Option 1: 8 lines total

Simple code is maintainable code.

What About `shouldIgnoreRetryPolicy`?

I think there's confusion about what this parameter is supposed to do. Based on the name and ticket context, it seems like it was meant to bypass retry backoff delays, not safety limits.

If the intent is "try again immediately without waiting," that's fine. But that's different from "ignore maxRetry and keep trying forever."

Proposal: Safety limits (pause + maxRetry) should be absolute. If we need to bypass backoff timing, that's a separate concern and should be named/handled differently.

The Root Cause Matters

The ticket identifies the root cause perfectly:

"The shouldIgnoreRetryPolicy parameter was likely introduced to allow immediate retries in certain scenarios (like manual user-triggered refreshes), but inadvertently became part of the automatic token expiration flow, bypassing critical safety controls."

The fix shouldn't be to add a special case that detects when we're in the problematic flow. The fix should be to remove the ability to bypass safety controls because that's fundamentally dangerous regardless of the flow.

What If We Need to Bypass?

If there's a legitimate use case where we need to retry beyond maxRetry, that should be:

Explicitly documented in code comments
A separate code path with a different method name
Something we consciously choose, not accidentally enable via a boolean parameter

Right now, shouldIgnoreRetryPolicy: true is a loaded gun sitting in the codebase. Your fix adds a safety on one trigger, but the gun is still loaded.

Bottom Line

Option 1 is:

Simpler (less code, no state management)
Safer (no infinite loop scenarios, thread-safe)
Clearer (safety limits are always enforced)
More maintainable (future developers can't bypass accidentally)

The only downside is it changes behavior for all callers of shouldIgnoreRetryPolicy: true. But that's actually a feature because bypassing safety limits was the bug in the first place.

Please consider refactoring to Option 1. If there's a specific scenario that requires bypassing maxRetry, let's discuss it explicitly so we can design a proper solution.

Dec 18 '25 16:12 sumeruchat

Agree with Claude that we should take the simpler approach @joaodordio

@sumeruchat We need to have a bigger discussion about this.

The solution you are recommending based on Claude, removes the shouldIgnoreRetryPolicy flag from the implementation.

We need to further discuss offline why we need that flag and what the purpose of it is.

Dec 19 '25 10:12 joaodordio

SDK-199 Auth Retry policy completely bypassed

🔹 Jira Ticket(s)

✏️ Description

Codecov Report

Pull Request Review Comment

Please Consider Refactoring to Option 1 from the Ticket

The Core Problem

Current Implementation: Selective Bypass

Option 1: Always Enforce Safety Limits

Why This Is Better

What About shouldIgnoreRetryPolicy?

The Root Cause Matters

What If We Need to Bypass?

Bottom Line

What About `shouldIgnoreRetryPolicy`?