
[FEA] FP8 GEMM implementation

Open jianyuh opened this issue 3 years ago • 5 comments

Is your feature request related to a problem? Please describe.
More details about Nvidia's latest H100 GPU were recently released in https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/ . Its Tensor Cores will support the FP8 E4M3 and E5M2 formats. I wonder whether CUTLASS is going to provide an FP8 GEMM implementation soon.

Describe the solution you'd like
Ideally, we want an FP8 GEMM implementation with FP16/FP32 accumulation for both the E4M3 and E5M2 formats.
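For readers unfamiliar with the two formats, here is a minimal reference sketch in plain Python of what such a GEMM computes. It is not CUTLASS code and does not reflect any CUTLASS API. The helper names (`decode_fp8`, `decode_e4m3`, `decode_e5m2`, `round_f32`, `gemm_fp8_reference`) are hypothetical. The bit layouts are assumptions taken from the commonly published FP8 format descriptions, not from this issue: E4M3 uses bias 7, has no infinities and a single NaN mantissa pattern; E5M2 uses bias 15 with IEEE-style infinities and NaNs. FP32 accumulation is only approximated by rounding the running sum to float32 after every add.

```python
import math
import struct


def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one FP8 bit pattern (0..255) into a Python float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    exp = (byte >> man_bits) & exp_max
    man = byte & man_max
    if exp == exp_max:
        if ieee_specials:
            # Assumed E5M2 behaviour: all-ones exponent encodes inf or NaN.
            return sign * math.inf if man == 0 else math.nan
        if man == man_max:
            # Assumed E4M3 behaviour: only S.1111.111 is NaN, no infinities.
            return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1.0 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)


def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)


def round_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def gemm_fp8_reference(a, b, decode):
    """Multiply an M x K matrix by a K x N matrix of FP8 bit patterns.

    Inputs are lists of lists of ints in 0..255. Each product is formed
    from the decoded values and added into a float32-rounded accumulator.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc = round_f32(acc + decode(a[i][p]) * decode(b[p][j]))
            c[i][j] = acc
    return c
```

For example, `gemm_fp8_reference([[0x38, 0x40]], [[0x38], [0x40]], decode_e4m3)` multiplies the row (1.0, 2.0) by the column (1.0, 2.0). The scalar loops are meant for checking results on tiny inputs, not for performance.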

Describe alternatives you've considered
This is a new feature on new Nvidia hardware.

Additional context
None.

jianyuh, Apr 03 '22 07:04

Yes, after the toolkit supports it.

hwu36, Apr 04 '22 01:04

@jianyuh Are you interested in dense, sparse, or both?

mnicely, Apr 28 '22 13:04

@mnicely Thanks for checking. We are mostly interested in dense as a first step.

jianyuh, Apr 30 '22 22:04

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot], May 30 '22 23:05

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot], Sep 14 '22 16:09