Why does the fp8_e4m3 min_scaling_factor divide by 512?

Open suxi1314 opened this issue 1 year ago • 1 comment

https://github.com/NVIDIA/TensorRT-LLM/blob/main/cpp/tensorrt_llm/common/cudaFp8Utils.cu#L219

constexpr float min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f);

Why is the extra factor 512?
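For readers following along, here is a minimal sketch of what the quoted constant evaluates to. It assumes FP8_E4M3_MAX is 448, the largest finite value of the E4M3 format. The helper clamped_scale is hypothetical and not taken from TensorRT-LLM; it only illustrates one plausible use of such a constant, as a lower bound on a scale derived from a tensor's absolute maximum (amax). The sketch does not explain why 512 was chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Largest finite FP8 E4M3 value (assumed here to match the FP8_E4M3_MAX macro).
constexpr float FP8_E4M3_MAX = 448.0f;

// The constant quoted in the issue: 1 / (448 * 512) = 1 / 229376, roughly 4.36e-6.
constexpr float min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f);

// Hypothetical helper, not part of TensorRT-LLM: derive a quantization scale
// from amax and keep it from dropping below min_scaling_factor.
inline float clamped_scale(float amax)
{
    assert(amax >= 0.0f); // an absolute maximum cannot be negative
    return std::max(amax / FP8_E4M3_MAX, min_scaling_factor);
}
```

Under that assumed usage, the floor takes effect whenever amax is below 1/512, since amax / 448 < 1 / (448 * 512) exactly when amax < 1/512. For example, clamped_scale(0.0f) returns min_scaling_factor rather than 0, which avoids a zero scale.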

Jul 18 '24 08:07 suxi1314

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.

Aug 19 '24 01:08 github-actions[bot]

This issue was closed because it has been stalled for 15 days with no activity.

Sep 24 '24 02:09 github-actions[bot]