[Feature] Exponential backoff on retries

Open 403-html opened this issue 2 years ago • 12 comments

Context:

Currently, Playwright implements a linear retry mechanism for tests, as outlined in its documentation. This approach is invaluable for handling flaky tests caused by transient issues. However, retries happen on a linear timeout basis, which can be inefficient and put unnecessary load on servers, especially when a transient issue persists across several attempts.

Feature Request:

I propose adding an Exponential Backoff Timeout mechanism for all retries within Playwright (expect/action/global?). Exponential backoff is a standard algorithm that multiplies the delay between retries by a fixed factor after each failed attempt, usually up to a cap, which can make retries significantly more efficient under fluctuating network conditions or server states.
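
To make the idea concrete, here is a minimal user-land sketch of exponential backoff, independent of Playwright. The helper name `retryWithBackoff` and the `BackoffOptions` fields are hypothetical and only for illustration; this is not an existing Playwright API, just one way the algorithm could look.

```typescript
// Hypothetical helper, NOT part of Playwright: retries `fn`, waiting longer after each failure.
interface BackoffOptions {
  retries: number;      // number of retries after the first attempt
  initialDelay: number; // delay before the first retry, in milliseconds
  growthFactor: number; // multiplier applied to the delay after each retry
  maxDelay: number;     // upper bound for a single delay, in milliseconds
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: BackoffOptions,
): Promise<T> {
  let delay = opts.initialDelay;
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < opts.retries) {
        // Wait, then grow the delay for the next round (capped at maxDelay).
        await sleep(Math.min(delay, opts.maxDelay));
        delay *= opts.growthFactor;
      }
    }
  }
  // All attempts failed: surface the last error.
  throw lastError;
}
```

Something like this can wrap a single flaky call inside a test today, but it does not cover the built-in test, expect, or action retries that this request is about.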

Key Benefits:

  • Reduced Server Load: By spacing out retry attempts more effectively, this approach can help in reducing the load on servers, especially when many tests are running simultaneously and failing due to common transient issues.

  • Improved Test Performance: Exponential backoff can minimize the time wasted in rapid, successive retries, leading to overall faster test completion times in scenarios with intermittent failures.

  • Better Resource Management: This approach can lead to more efficient use of CI/CD pipeline resources, as it avoids rapid, successive retries that are unlikely to succeed while the issue still persists.

Example usage

import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: 2,
  retryStrategy: 'exponentialBackoff',
  exponentialBackoffConfig: {
    initialDelay: 1000, // Initial delay in milliseconds
    growthFactor: 2,    // Factor by which the delay grows after each retry (here: 1000ms, 2000ms, 4000ms, 8000ms, ...)
    maxDelay: 30000     // Upper bound in milliseconds on the delay between retries; once the growth reaches it, the delay stays at this value so retries do not keep waiting longer forever
  }
});
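
As a rough sketch of how such a config could translate into actual waits, the delay before retry n (0-based) would be min(initialDelay * growthFactor^n, maxDelay). The `ExponentialBackoffConfig` interface and the `backoffDelays` function below are hypothetical names that mirror the proposed options; nothing here exists in Playwright today.

```typescript
// Hypothetical shape mirroring the proposed `exponentialBackoffConfig`; NOT an existing Playwright option.
interface ExponentialBackoffConfig {
  initialDelay: number; // milliseconds
  growthFactor: number;
  maxDelay: number;     // milliseconds
}

// Returns the delay before each retry: min(initialDelay * growthFactor^n, maxDelay).
function backoffDelays(config: ExponentialBackoffConfig, retries: number): number[] {
  return Array.from({ length: retries }, (_, n) =>
    Math.min(config.initialDelay * Math.pow(config.growthFactor, n), config.maxDelay),
  );
}

const base = { initialDelay: 1000, maxDelay: 30000 };

// Doubling each time:
console.log(backoffDelays({ ...base, growthFactor: 2 }, 4)); // [1000, 2000, 4000, 8000]

// Quadrupling each time; the fourth delay would be 64000 and is capped at maxDelay:
console.log(backoffDelays({ ...base, growthFactor: 4 }, 4)); // [1000, 4000, 16000, 30000]
```

With `retries: 2` only the first two delays of either schedule would ever be used.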

403-html, Jan 04 '24 11:01

Looks similar https://github.com/microsoft/playwright/issues/23354.

mxschmitt, Jan 04 '24 18:01

That issue is about tailing flaky tests, i.e. configuring when flaky tests should be run (in his example, at the end of the whole run). This request is about the timing between retries, since right now only linear retry timeouts are available.

403-html, Jan 04 '24 18:01

+1

zargham-k, Mar 05 '24 16:03

Definitely a supporter of adding this feature!

natesuri-sg, Mar 14 '24 15:03

+1

darlenew, Mar 22 '24 21:03