
Add enforce_max_duration setting

Open anhappdev opened this issue 3 months ago • 3 comments

This PR introduces a new enforce_max_duration setting to the LoadGen test configuration. This allows users to control whether exceeding max_duration should terminate query issuance early and how minimum query count validation is applied.

Key Changes

•	Exposes enforce_max_duration in Python bindings and test settings (default: true).
•	IssueQueryController only stops early and logs when enforcement is enabled.
•	Results logic updates: min_query_count is skipped when enforcement is disabled.
•	Effective settings logging updated to include the new flag.
•	The submission checker requires official submissions to have enforcement enabled.
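The behavior described in the list above can be modeled with a small, self-contained Python sketch. It is not the LoadGen implementation: `SketchSettings` and the three helper functions are hypothetical names invented for illustration. Treating `max_duration_ms == 0` as "no maximum" is also an assumption of this sketch. Only the flag name `enforce_max_duration`, its default of true, and the three rules come from the PR description.

```python
from dataclasses import dataclass


@dataclass
class SketchSettings:
    """Hypothetical stand-in for the LoadGen test settings (not the real class)."""
    max_duration_ms: int = 0            # assumption: 0 means "no maximum" here
    min_query_count: int = 100          # illustrative value only
    enforce_max_duration: bool = True   # default per the PR description


def should_stop_issuing(elapsed_ms: int, s: SketchSettings) -> bool:
    """Stop issuing early only if enforcement is on and max_duration is exceeded."""
    if not s.enforce_max_duration or s.max_duration_ms <= 0:
        return False
    return elapsed_ms > s.max_duration_ms


def min_query_count_met(queries_issued: int, s: SketchSettings) -> bool:
    """Per the PR description, the min_query_count check is skipped
    when enforcement is disabled."""
    if not s.enforce_max_duration:
        return True
    return queries_issued >= s.min_query_count


def valid_for_official_submission(s: SketchSettings) -> bool:
    """Models the submission-checker rule: enforcement must be enabled."""
    return s.enforce_max_duration


enforced = SketchSettings(max_duration_ms=1000)
relaxed = SketchSettings(max_duration_ms=1000, enforce_max_duration=False)
```

With these settings, `should_stop_issuing(1500, enforced)` is true while `should_stop_issuing(1500, relaxed)` is false, and only `enforced` would pass the modeled submission check.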

The changes are taken from the mobile_update branch (https://github.com/mlcommons/inference/commits/mobile_update/), which is now outdated and cannot be merged into master without resolving a conflict.

Motivation

Until now, we've maintained this change in a separate branch called mobile_update. Keeping it separate makes it difficult to update the LoadGen version, so we want to merge this change into the master branch.

Related issues:

https://github.com/mlcommons/mobile_app_open/pull/798
https://github.com/mlcommons/inference/pull/1621

anhappdev avatar Nov 21 '25 03:11 anhappdev

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

github-actions[bot] avatar Nov 21 '25 03:11 github-actions[bot]

@pgmpablo157321 @freedomtan Please review this PR.

anhappdev avatar Nov 21 '25 03:11 anhappdev

LGTM

arjunsuresh avatar Nov 24 '25 19:11 arjunsuresh