
test: Add --repeat-until-n-failures flag to test runner

Open FelixVaughan opened this issue 8 months ago • 3 comments

What is the problem this feature will solve?

Currently, debugging flaky tests requires either:

  1. Manually running tests repeatedly
  2. Using --repeat with a fixed number of iterations
  3. Writing custom test wrapper code

This makes it difficult to:

  • Reliably reproduce failures
  • Gather statistics about test flakiness
  • Automate flaky test detection in CI

While V8's test runner has some flaky test handling capabilities, Node.js's main test runner lacks a built-in way to repeatedly run tests until a specific number of failures occur. The main use case, of course, would be to run until the first failure rather than rerunning tests in the hope that they fail, which is tedious even with the --repeat flag. A debugging flow I use is to run a test until it fails, then set a breakpoint at the failure and inspect the variables and call stack.

What is the feature you are proposing to solve the problem?

Add a new --repeat-until-n-failures flag to tools/test.py that allows tests to be repeated until a specified number of failures occur or a maximum iteration count is reached.

Proposed syntax: --repeat-until-n-failures=<n_failures>[,max_iterations]

Examples:

Run until 1 failure occurs

python tools/test.py --repeat-until-n-failures=1 test/path

Run until 5 failures or 1000 iterations

python tools/test.py --repeat-until-n-failures=5,1000 test/path
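The proposed value format, <n_failures>[,max_iterations], could be parsed along the following lines. This is only a sketch: parse_repeat_until_n_failures is a hypothetical helper name, not an existing function in tools/test.py, and the validation rules (both values must be at least 1) are assumptions rather than part of the proposal.

```python
from typing import Optional, Tuple


def parse_repeat_until_n_failures(value: str) -> Tuple[int, Optional[int]]:
    """Parse '<n_failures>[,max_iterations]' into (n_failures, max_iterations).

    Hypothetical helper for illustration; max_iterations is None when omitted.
    """
    parts = value.split(",")
    if len(parts) > 2:
        raise ValueError(
            f"expected <n_failures>[,max_iterations], got {value!r}"
        )
    # int() raises ValueError on empty or non-numeric input such as "5,".
    n_failures = int(parts[0])
    max_iterations = int(parts[1]) if len(parts) == 2 else None
    # Assumed validation: zero or negative counts are rejected.
    if n_failures < 1:
        raise ValueError("n_failures must be at least 1")
    if max_iterations is not None and max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    return n_failures, max_iterations
```

With this sketch, "1" parses to (1, None) and "5,1000" parses to (5, 1000), matching the two examples above.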

The feature would:

  1. Run the specified test(s) repeatedly
  2. Track the number of failures
  3. Stop when either:
    • The specified number of failures is reached
    • The maximum iteration count (if specified) is reached
  4. Report statistics about the failures (iteration counts, failure rate, etc.)
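The four steps above could be sketched roughly as follows. This is not how tools/test.py is structured; repeat_until_n_failures and run_command are hypothetical names, and treating a non-zero exit status as a failure is an assumption made for the illustration.

```python
import subprocess
from typing import Callable, List, Optional


def repeat_until_n_failures(
    run_once: Callable[[], bool],
    n_failures: int,
    max_iterations: Optional[int] = None,
) -> dict:
    """Call run_once() until n_failures failures or max_iterations runs.

    run_once returns True on pass and False on failure.
    """
    failed_iterations: List[int] = []
    iterations = 0
    while len(failed_iterations) < n_failures:
        # Stop early if the optional iteration cap has been reached.
        if max_iterations is not None and iterations >= max_iterations:
            break
        iterations += 1
        if not run_once():
            # Record the 1-based iteration on which the failure occurred.
            failed_iterations.append(iterations)
    failures = len(failed_iterations)
    return {
        "iterations": iterations,
        "failures": failures,
        "failed_iterations": failed_iterations,
        "failure_rate": failures / iterations if iterations else 0.0,
    }


def run_command(cmd: List[str]) -> bool:
    """Run a command and return True if it exits with status 0 (assumed pass)."""
    return subprocess.run(cmd).returncode == 0
```

A wrapper could then call repeat_until_n_failures(lambda: run_command(["python", "tools/test.py", "test/path"]), 1) to stop at the first failing run and print the returned statistics.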

This would complement the existing --repeat and --exit-after-n-failures flags by providing a focused tool for debugging flaky tests and measuring test reliability.

I am happy to work on implementing this feature if there is interest.

What alternatives have you considered?

  1. Using the existing --repeat flag with a large number:
    • Less efficient, as it always runs all iterations
    • Doesn't stop automatically on failure
    • Harder to analyze results
  2. A personal custom test script:
    • More fragile and less extensible
    • Maintenance burden as the test runner evolves: the parsing logic needs updating with each change
    • Isolated solution that doesn't benefit the Node.js ecosystem

FelixVaughan avatar Apr 24 '25 04:04 FelixVaughan

@FelixVaughan are you planning to work on this? If not, I can take it on!

0hmX avatar Jun 07 '25 05:06 0hmX

@0hmX yes! I left this up to get some initial discussion, but happy to start on an implementation

FelixVaughan avatar Jun 08 '25 01:06 FelixVaughan

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the https://github.com/nodejs/node/labels/never-stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document.

github-actions[bot] avatar Dec 05 '25 01:12 github-actions[bot]