sparseml
ONNX export for weights-only quantization
This PR adds a transformation to the ONNX export that quantizes model weights, supporting weights-only quantization. It was tested on a Llama2 model.
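For readers unfamiliar with the term, the following is a minimal sketch of what weights-only quantization means in general. It is not the code from this PR: the function names `quantize_row` and `dequantize_row` are hypothetical, and the scheme shown (symmetric, per-row, 8-bit) is an assumption chosen for illustration. Only the weights are mapped to integers; activations stay in floating point.

```python
from typing import List, Tuple


def quantize_row(row: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetrically quantize one row of weights to signed integers.

    Hypothetical helper for illustration; not part of sparseml.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(w) for w in row), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero row to avoid dividing by zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from integers and their scale."""
    return [v * scale for v in q]


# Toy weight matrix: two output channels, three inputs each.
weights = [[0.5, -1.0, 0.25], [0.0, 0.0, 0.0]]
quantized = [quantize_row(r) for r in weights]
restored = [dequantize_row(q, s) for q, s in quantized]
```

In an exported graph, the integer weights and their scales would typically be stored as constants and dequantized before the matrix multiply; how this PR arranges that in ONNX is not described here.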