ONNX export for weights-only quantization

Open anmarques opened this issue 2 years ago • 0 comments

This PR adds a transformation to the ONNX export that quantizes weights for weight-only quantization, where only the weights are quantized and activations stay in floating point. It was tested on a Llama 2 model.
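
For readers unfamiliar with the idea, here is a minimal sketch of what quantizing a weight matrix can look like. It is not the code in this PR: it assumes symmetric, per-output-channel int8 quantization, and the names `quantize_row`, `dequantize_row`, and `quantize_weights` are hypothetical helpers written in plain Python for illustration.

```python
from typing import List, Tuple


def quantize_row(row: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetrically quantize one output channel (hypothetical helper).

    Returns the integer values and the floating-point scale such that
    row[i] is approximately q[i] * scale.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(w) for w in row), default=0.0)
    # Guard against an all-zero channel, which would give a zero scale.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Map quantized integers back to approximate floating-point weights."""
    return [v * scale for v in q]


def quantize_weights(
    weight: List[List[float]], num_bits: int = 8
) -> Tuple[List[List[int]], List[float]]:
    """Quantize a 2-D weight matrix with one scale per row (assumed layout)."""
    pairs = [quantize_row(row, num_bits) for row in weight]
    return [q for q, _ in pairs], [s for _, s in pairs]


if __name__ == "__main__":
    q, scales = quantize_weights([[1.0, -0.25, 0.0], [0.0, 0.0, 0.0]])
    print(q, scales)
```

In an exported graph, the integer weights and scales would typically be stored as initializers and dequantized before the matrix multiplication. How this PR represents them is not described here.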

Dec 19 '23 14:12 anmarques