
Why does the fp8_e4m3 min_scaling_factor divide by 512?

Open suxi1314 opened this issue 1 year ago • 1 comments

https://github.com/NVIDIA/TensorRT-LLM/blob/main/cpp/tensorrt_llm/common/cudaFp8Utils.cu#L219

constexpr float min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f);

Why is it 512?

suxi1314 avatar Jul 18 '24 08:07 suxi1314

This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 15 days.

github-actions[bot] avatar Aug 19 '24 01:08 github-actions[bot]

This issue was closed because it has been stalled for 15 days with no activity.

github-actions[bot] avatar Sep 24 '24 02:09 github-actions[bot]