perf: Optimisations for PP + attention DP

Open amukkara opened this issue 10 months ago • 3 comments

  1. Remove the MPI world broadcast in fetch_adp_new_requests.
  2. Sync the request finish point between the last and intermediate PP ranks to avoid a deadlock in trtllm-bench runs.
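
To illustrate why item 2 matters, here is a toy model in plain Python; it is not the code from this PR. It assumes that each PP rank tracks a set of active requests and that blocking stage-to-stage communication only pairs up when neighbouring ranks agree on that set. The function name, `retire_lag`, and the step model are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def first_mismatch_step(
    num_ranks: int,
    finish_step: Dict[int, int],
    retire_lag: List[int],
    num_steps: int,
) -> Optional[int]:
    """Return the first step at which PP ranks disagree on the active set.

    finish_step maps a request id to the step at which it emits its last token.
    retire_lag[r] is how many steps after that point rank r drops the request
    (a hypothetical knob used only to model unsynchronised finish points).
    """
    for step in range(num_steps):
        active_per_rank: List[Set[int]] = []
        for rank in range(num_ranks):
            # A request stays active on this rank until its finish step plus the rank's lag.
            active = {
                req
                for req, done in finish_step.items()
                if step < done + retire_lag[rank]
            }
            active_per_rank.append(active)
        # In this toy model, any disagreement stands in for an unmatched send/recv.
        if any(active != active_per_rank[0] for active in active_per_rank[1:]):
            return step
    return None


requests = {0: 2, 1: 4}
# Last rank retires immediately, intermediate ranks one step later: ranks diverge.
unsynced = first_mismatch_step(3, requests, [1, 1, 0], 6)
# Every rank retires at the same point: the active sets never diverge.
synced = first_mismatch_step(3, requests, [1, 1, 1], 6)
```

In this model the ranks disagree as soon as one of them retires a request earlier than the others, and using a common finish point removes the disagreement.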

amukkara avatar Mar 28 '25 01:03 amukkara

/bot run

amukkara avatar Mar 28 '25 03:03 amukkara

PR_Github #667 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 28 '25 03:03 tensorrt-cicd

PR_Github #667 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #561 completed with status: 'SUCCESS'

tensorrt-cicd avatar Mar 28 '25 05:03 tensorrt-cicd

/bot run

amukkara avatar Mar 31 '25 20:03 amukkara

PR_Github #800 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 31 '25 20:03 tensorrt-cicd

PR_Github #800 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #646 completed with status: 'FAILURE'

tensorrt-cicd avatar Mar 31 '25 21:03 tensorrt-cicd

/bot run

amukkara avatar Mar 31 '25 21:03 amukkara

PR_Github #802 [ run ] triggered by Bot

tensorrt-cicd avatar Mar 31 '25 21:03 tensorrt-cicd

PR_Github #802 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #648 completed with status: 'SUCCESS'

tensorrt-cicd avatar Mar 31 '25 23:03 tensorrt-cicd

/bot reuse-pipeline

amukkara avatar Apr 01 '25 00:04 amukkara

PR_Github #813 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 00:04 tensorrt-cicd

PR_Github #813 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #802 for commit b901251

tensorrt-cicd avatar Apr 01 '25 00:04 tensorrt-cicd