
feat: allreduce and fusion kernel development

Open yilin-void opened this issue 10 months ago • 1 comments

This PR is still a draft. Completed so far:

  • Fixed the sync error in the twoshot sync kernel.
  • Removed the poorly performing oneshot sync kernel.
  • Added FP32 support to the existing kernel (FP4Quant fusion is not supported for FP32).
  • Added support for FP8Quant.
  • Added support for the non-fusion case.
  • Added support for pre-Hopper architectures.
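
For readers new to the terminology, here is a minimal CPU-side sketch of what "oneshot" and "twoshot" usually mean for allreduce. It is an illustration under assumptions, not the kernels in this PR: the function names `oneshot_allreduce` and `twoshot_allreduce` are hypothetical, ranks are simulated as plain Python lists, and the synchronization, quantization, and fusion steps are left out.

```python
from typing import List


def oneshot_allreduce(bufs: List[List[int]]) -> List[List[int]]:
    """One shot: every rank reads all peers' full buffers and sums them itself."""
    n_ranks = len(bufs)
    length = len(bufs[0])
    return [
        [sum(bufs[peer][i] for peer in range(n_ranks)) for i in range(length)]
        for _ in range(n_ranks)
    ]


def twoshot_allreduce(bufs: List[List[int]]) -> List[List[int]]:
    """Two shots: reduce-scatter (each rank reduces one chunk), then all-gather."""
    n_ranks = len(bufs)
    length = len(bufs[0])
    # Assumption for this sketch: the buffer splits evenly across ranks.
    assert length % n_ranks == 0
    chunk = length // n_ranks

    # Shot 1 (reduce-scatter): rank r reduces only elements [r*chunk, (r+1)*chunk).
    partial = []
    for rank in range(n_ranks):
        lo = rank * chunk
        partial.append(
            [sum(bufs[peer][i] for peer in range(n_ranks)) for i in range(lo, lo + chunk)]
        )

    # Shot 2 (all-gather): every rank collects the reduced chunks from all ranks.
    full = [x for rank in range(n_ranks) for x in partial[rank]]
    return [list(full) for _ in range(n_ranks)]


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [10, 20, 30, 40]]
    print(oneshot_allreduce(data))
    print(twoshot_allreduce(data))
```

Both variants give every rank the same reduced buffer. The difference is that in the oneshot form each rank reads every peer's full buffer, while in the twoshot form each rank reduces only its own chunk and then needs a second exchange, with a synchronization point between the two phases.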

Todo:

  • Add new test cases to the C++ unit tests and adapt the old test cases to the latest changes.
  • Adapt the corresponding Torch OP to the latest changes.

yilin-void avatar Mar 25 '25 10:03 yilin-void

  • @Kefeng-Duan @zongfeijing for visibility on this MR.

juney-nvidia avatar Mar 25 '25 11:03 juney-nvidia

/bot run --disable-fail-fast

yilin-void avatar Apr 01 '25 09:04 yilin-void

PR_Github #882 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 09:04 tensorrt-cicd

PR_Github #882 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #698 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 01 '25 11:04 tensorrt-cicd

/bot run --disable-fail-fast

yilin-void avatar Apr 01 '25 11:04 yilin-void

PR_Github #900 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 11:04 tensorrt-cicd

PR_Github #900 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #708 completed with status: 'SUCCESS'

tensorrt-cicd avatar Apr 01 '25 13:04 tensorrt-cicd

/bot run

yilin-void avatar Apr 02 '25 03:04 yilin-void

PR_Github #965 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 02 '25 03:04 tensorrt-cicd

PR_Github #965 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #752 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 02 '25 03:04 tensorrt-cicd

/bot run --disable-fail-fast

yilin-void avatar Apr 02 '25 07:04 yilin-void

PR_Github #988 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 02 '25 07:04 tensorrt-cicd

PR_Github #988 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #767 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 02 '25 11:04 tensorrt-cicd

/bot run

yilin-void avatar Apr 03 '25 01:04 yilin-void

PR_Github #1050 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 03 '25 01:04 tensorrt-cicd

PR_Github #1050 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #807 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 03 '25 12:04 tensorrt-cicd

/bot run --disable-fail-fast

yilin-void avatar Apr 08 '25 02:04 yilin-void

PR_Github #1384 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 02:04 tensorrt-cicd

PR_Github #1384 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1040 completed with status: 'FAILURE'

tensorrt-cicd avatar Apr 08 '25 02:04 tensorrt-cicd

/bot run --disable-fail-fast

yilin-void avatar Apr 08 '25 02:04 yilin-void

PR_Github #1390 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 03:04 tensorrt-cicd

PR_Github #1390 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1044 completed with status: 'SUCCESS'

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd

/bot reuse-pipeline

yilin-void avatar Apr 08 '25 11:04 yilin-void

/bot reuse-pipeline

hyukn avatar Apr 08 '25 11:04 hyukn

PR_Github #1451 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd

/bot reuse-pipeline

hyukn avatar Apr 08 '25 11:04 hyukn

PR_Github #1452 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd

PR_Github #1452 [ reuse-pipeline ] completed with state ABORTED
Can't reuse PR_Github #0 with status: UNKNOWN

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd

/bot reuse-pipeline

hyukn avatar Apr 08 '25 11:04 hyukn

PR_Github #1454 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 08 '25 11:04 tensorrt-cicd