
chore: [TRTLLM-3694] Move functional args to llmargs

Open · hchings opened this pull request 10 months ago • 24 comments

Moving args that are shared across the TRT and PyTorch backends out of BuildConfig and into first-level LLMArgs, as discussed with Chunwei.
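For illustration, a minimal before/after sketch of the usage change, with max_batch_size as a representative shared knob; the import path and exact signatures are assumptions inferred from this thread, not a verified reference:

```python
# Sketch only: import path and signatures are assumed, not verified.
from tensorrt_llm import LLM, BuildConfig

# Before: shared knobs had to be nested inside BuildConfig.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder model
          build_config=BuildConfig(max_batch_size=8))

# After: knobs shared across the TRT and PyTorch backends are accepted as
# first-level LLM arguments (LLMArgs).
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
          max_batch_size=8)
```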

hchings · Mar 24 '25 18:03

/bot run

hchings · Mar 24 '25 18:03

PR_Github #325 [ run ] triggered by Bot

niukuo · Mar 24 '25 19:03

PR_Github #325 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #306 completed with status: 'FAILURE'

niukuo · Mar 24 '25 20:03

/bot run

hchings · Mar 24 '25 21:03

PR_Github #332 [ run ] triggered by Bot

niukuo · Mar 24 '25 21:03

PR_Github #332 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #312 completed with status: 'SUCCESS'

niukuo · Mar 25 '25 00:03

/bot run

hchings · Mar 26 '25 06:03

PR_Github #528 [ run ] triggered by Bot

niukuo · Mar 26 '25 06:03

PR_Github #528 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #451 completed with status: 'FAILURE'

niukuo · Mar 26 '25 08:03

/bot run

hchings · Mar 26 '25 22:03

PR_Github #619 [ run ] triggered by Bot

tensorrt-cicd · Mar 26 '25 22:03

PR_Github #619 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #521 completed with status: 'FAILURE'

tensorrt-cicd · Mar 27 '25 00:03

/bot run

hchings · Mar 27 '25 03:03

PR_Github #633 [ run ] triggered by Bot

tensorrt-cicd · Mar 27 '25 03:03

PR_Github #633 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #532 completed with status: 'FAILURE'

tensorrt-cicd · Mar 27 '25 04:03

/bot run

hchings · Mar 27 '25 05:03

PR_Github #642 [ run ] triggered by Bot

tensorrt-cicd · Mar 27 '25 05:03

PR_Github #642 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #542 completed with status: 'FAILURE'

tensorrt-cicd · Mar 27 '25 07:03

/bot run

hchings · Mar 28 '25 00:03

PR_Github #662 [ run ] triggered by Bot

tensorrt-cicd · Mar 28 '25 00:03

PR_Github #662 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #557 completed with status: 'FAILURE'

tensorrt-cicd · Mar 28 '25 01:03

/bot run

hchings · Mar 28 '25 04:03

PR_Github #671 [ run ] triggered by Bot

tensorrt-cicd · Mar 28 '25 04:03

PR_Github #671 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #564 completed with status: 'SUCCESS'

tensorrt-cicd · Mar 28 '25 11:03

/bot reuse-pipeline

hchings · Mar 28 '25 18:03

PR_Github #682 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd · Mar 28 '25 18:03

PR_Github #682 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #671 for commit e9b9826

tensorrt-cicd · Mar 28 '25 18:03

@hchings should LLM() instantiations in tensorrt_llm/bench be changed to match this new interface? For example, https://github.com/NVIDIA/TensorRT-LLM/blob/435cd2983db9327b9c4aac4f299e69d3c2954d34/tensorrt_llm/bench/benchmark/throughput.py#L230

amukkara · Apr 01 '25 01:04

> @hchings should LLM() instantiations in tensorrt_llm/bench be changed to match this new interface? For example, https://github.com/NVIDIA/TensorRT-LLM/blob/435cd2983db9327b9c4aac4f299e69d3c2954d34/tensorrt_llm/bench/benchmark/throughput.py#L230

@amukkara trtllm-bench shares its code path between the TRT and PyTorch backends, which may be tricky to unify. The functional knobs touched in this MR make the usage differ between the two backends, and that is expected. For instance, the PyTorch backend requires LLM(max_batch_size=...) and build_config is invalid there, while for the TRT backend it is fine to use LLM(build_config=BuildConfig(max_batch_size=...)). In the long run, we will converge on the PyTorch path. cc @hchings to correct me if I'm wrong. cc @FrankD412 for visibility.

Superjomn · Apr 01 '25 03:04

@Superjomn The distinction between the TRT and PyTorch backends makes sense. That said, there is one instance where BuildConfig is still used in the PyTorch path, and as a result max_seq_len cannot be correctly initialized. @FrankD412 I think the fix would be a one-line change here.
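A hypothetical sketch of such a fix, assuming the PyTorch path should pass the user-supplied value as a first-level LLM argument rather than routing it through BuildConfig; the variable names below are illustrative, not the actual throughput.py code:

```python
from tensorrt_llm import LLM  # assumed import, as elsewhere in this thread

model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder model
max_seq_len = 4096                            # placeholder value

# Assumed before (PyTorch path still routing the knob through BuildConfig,
# so max_seq_len is never initialized from the user-supplied value):
#   llm = LLM(model=model, build_config=BuildConfig(max_seq_len=max_seq_len))

# Assumed after: pass the promoted first-level LLM argument directly.
llm = LLM(model=model, max_seq_len=max_seq_len)
```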

amukkara · Apr 01 '25 05:04