chore: [TRTLLM-3694] Move functional args to llmargs
Moving args shared across the TRT & Torch backends out of BuildConfig and into first-level LLMArgs, as discussed with Chunwei.
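A minimal sketch of what the move looks like from the user side (the model path is a placeholder, and only `max_batch_size` / `max_seq_len` are named explicitly in this PR discussion; the full set of migrated knobs may differ):

```python
from tensorrt_llm import LLM, BuildConfig

# Before: shared limits were only reachable through BuildConfig.
llm_old_style = LLM(
    model="/path/to/model",  # placeholder path
    build_config=BuildConfig(max_batch_size=128, max_seq_len=4096),
)

# After: the same knobs are accepted as first-level LLM arguments, so both the
# TRT and Torch backends can consume them without a BuildConfig.
llm_new_style = LLM(
    model="/path/to/model",  # placeholder path
    max_batch_size=128,
    max_seq_len=4096,
)
```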
/bot run
PR_Github #325 [ run ] triggered by Bot
PR_Github #325 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #306 completed with status: 'FAILURE'
/bot run
PR_Github #332 [ run ] triggered by Bot
PR_Github #332 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #312 completed with status: 'SUCCESS'
/bot run
PR_Github #528 [ run ] triggered by Bot
PR_Github #528 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #451 completed with status: 'FAILURE'
/bot run
PR_Github #619 [ run ] triggered by Bot
PR_Github #619 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #521 completed with status: 'FAILURE'
/bot run
PR_Github #633 [ run ] triggered by Bot
PR_Github #633 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #532 completed with status: 'FAILURE'
/bot run
PR_Github #642 [ run ] triggered by Bot
PR_Github #642 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #542 completed with status: 'FAILURE'
/bot run
PR_Github #662 [ run ] triggered by Bot
PR_Github #662 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #557 completed with status: 'FAILURE'
/bot run
PR_Github #671 [ run ] triggered by Bot
PR_Github #671 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #564 completed with status: 'SUCCESS'
/bot reuse-pipeline
PR_Github #682 [ reuse-pipeline ] triggered by Bot
PR_Github #682 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #671 for commit e9b9826
@hchings should LLM() instantiations in tensorrt_llm/bench be changed to match this new interface?
for example, https://github.com/NVIDIA/TensorRT-LLM/blob/435cd2983db9327b9c4aac4f299e69d3c2954d34/tensorrt_llm/bench/benchmark/throughput.py#L230
@amukkara trtllm-bench shares its code path between the TRT and PyT backends, which may be tricky to unify. The functional knobs touched in this MR will make the usage differ between the two backends, and that is expected. For instance, the PyT backend requires LLM(max_batch_size=...) and build_config is invalid there, while the TRT backend accepts LLM(build_config=BuildConfig(max_batch_size=...)). In the long run we will converge on the PyT path. cc @hchings to correct me if I'm wrong. cc @FrankD412 for visibility.
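A sketch of the split described above. The PyTorch-backend entry point shown here (`tensorrt_llm._torch.LLM`) and the model paths are assumptions for illustration; the exact import may differ by release:

```python
from tensorrt_llm import LLM, BuildConfig
from tensorrt_llm._torch import LLM as PyTorchLLM  # assumed PyT entry point

# PyT backend: pass the limits directly; per the comment above, build_config
# is not a valid argument on this path.
pyt_llm = PyTorchLLM(
    model="/path/to/hf_model",  # placeholder
    max_batch_size=64,
)

# TRT backend: the BuildConfig route still works, in addition to the
# first-level arguments introduced by this MR.
trt_llm = LLM(
    model="/path/to/trt_model",  # placeholder
    build_config=BuildConfig(max_batch_size=64),
)
```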
@Superjomn the distinction between the TRT and PyT backends makes sense. In that case, there is one instance where BuildConfig is still used in the PyT path, and as a result max_seq_len cannot be correctly initialized. @FrankD412 I think the fix would be a one-line change here
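For illustration only, a hypothetical shape of that one-line change in tensorrt_llm/bench/benchmark/throughput.py (variable names and values are placeholders, not taken from the file):

```python
from tensorrt_llm import LLM, BuildConfig

model_path = "/path/to/model"  # placeholder
max_seq_len = 4096             # value trtllm-bench computes elsewhere

# Before: max_seq_len only reaches the LLM through BuildConfig, which the
# PyT path does not honor, so the limit is not correctly initialized.
llm = LLM(model=model_path, build_config=BuildConfig(max_seq_len=max_seq_len))

# After: pass it as a first-level argument so both backends pick it up.
llm = LLM(model=model_path, max_seq_len=max_seq_len)
```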