ScaleLLM
Quantization: Supporting FP8 for both models and KV caches
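FP8 quantization stores model weights and KV-cache entries in an 8-bit floating-point format (typically E4M3), roughly halving memory relative to FP16 at the cost of reduced precision. The sketch below illustrates the general idea of per-tensor FP8 quantization and dequantization using PyTorch's `torch.float8_e4m3fn` dtype; it is only an assumption-laden illustration of the concept, not ScaleLLM's actual implementation.

```python
import torch

def quantize_fp8(x: torch.Tensor):
    # Illustrative per-tensor scaling so values fit the FP8 E4M3 range (max ~448).
    finfo = torch.finfo(torch.float8_e4m3fn)
    scale = x.abs().amax().float().clamp(min=1e-6) / finfo.max
    x_fp8 = (x.float() / scale).clamp(finfo.min, finfo.max).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize_fp8(x_fp8: torch.Tensor, scale: torch.Tensor, dtype=torch.float16):
    # Restore an approximate higher-precision tensor from FP8 storage.
    return (x_fp8.float() * scale).to(dtype)

# Example: quantize a simulated KV-cache block of shape (heads, seq_len, head_dim).
kv = torch.randn(8, 64, 128, dtype=torch.float16)
kv_fp8, scale = quantize_fp8(kv)
kv_restored = dequantize_fp8(kv_fp8, scale)
print(kv_fp8.dtype, (kv - kv_restored).abs().max().item())
```

In practice, engines may use finer-grained (per-channel or per-block) scales and fused kernels rather than a round trip through full precision; the example only shows why a scale factor must be stored alongside the FP8 data.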