andreyanufr
### Changes Added scale estimation for compression that minimizes the L2 error between the original MatMul and the compressed one. ### Reason for changes Improves the accuracy of models compressed to 4 bits. ###...
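
To illustrate the idea behind scale estimation, here is a minimal sketch, not the code from this PR. It assumes a single weight vector `w`, a small batch of activations `X`, and symmetric 4-bit integers in `[-8, 7]`. With the integer codes `q` held fixed, the scale that minimizes the L2 error between `X @ w` and `s * (X @ q)` has a closed form. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(X, v):
    # X: list of activation rows, v: weight vector.
    return [dot(row, v) for row in X]

def quantize(w, scale, qmin=-8, qmax=7):
    # Round to the nearest integer and clamp to the 4-bit range (assumed symmetric int4).
    return [max(qmin, min(qmax, round(x / scale))) for x in w]

def estimate_scale(X, w, qmin=-8, qmax=7):
    # Baseline scale from the largest magnitude (assumes w is not all zeros).
    s0 = max(abs(x) for x in w) / qmax
    q = quantize(w, s0, qmin, qmax)
    y = matvec(X, w)    # original MatMul output
    yq = matvec(X, q)   # output of the integer codes before rescaling
    denom = dot(yq, yq)
    # Least-squares solution of min_s ||y - s * yq||^2 with q fixed.
    s = dot(y, yq) / denom if denom > 0 else s0
    return s0, s, q

def l2_error(X, w, q, s):
    y = matvec(X, w)
    yq = matvec(X, q)
    return sum((a - s * b) ** 2 for a, b in zip(y, yq))

X = [[1.0, 2.0, 0.5], [0.3, -1.0, 2.0], [2.0, 0.1, -0.7]]
w = [0.31, -0.52, 0.13]
s0, s, q = estimate_scale(X, w)
print(l2_error(X, w, q, s0), l2_error(X, w, q, s))
```

Because `s` is the exact minimizer for the fixed codes `q`, its error can never exceed that of the max-abs baseline `s0`. A real implementation would typically work per group and may alternate between updating `q` and `s`.
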
### Changes Updated the statistics computation for the AWQ algorithm. ### Reason for changes The new approach is more stable. ### Related tickets ### Tests
### Changes Extended the AWQ algorithm to the Act->MatMul and Act->Multiply->MatMul patterns by inserting extra scales after the activation. ### Reason for changes Supports AWQ for a wider family of LLMs. ### Related...
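
The following sketch shows why an extra scale can be inserted after an activation without changing the result. It is an illustration under assumptions, not the code from this PR: ReLU stands in for the unspecified `Act`, and the per-channel scale `s` is picked by hand. Dividing the activation output by `s` in an explicit Multiply step and multiplying the matching weight rows by `s` leaves the MatMul output unchanged. All names are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matmul_vec(v, W):
    # v: (d,), W: (d, out) stored as a list of rows.
    return [sum(v[i] * W[i][j] for i in range(len(v))) for j in range(len(W[0]))]

def scale_weight_rows(W, s):
    # Multiply each input channel (row) of W by its scale.
    return [[W[i][j] * s[i] for j in range(len(W[0]))] for i in range(len(W))]

def forward_original(x, W):
    return matmul_vec(relu(x), W)

def forward_with_scale(x, W_scaled, s):
    a = relu(x)
    # Extra Multiply inserted after the activation: a * (1 / s).
    a = [ai / si for ai, si in zip(a, s)]
    return matmul_vec(a, W_scaled)

x = [0.5, -1.0, 2.0]
W = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.6]]
s = [2.0, 0.5, 4.0]  # hand-picked per-channel scales for illustration

y_ref = forward_original(x, W)
y_awq = forward_with_scale(x, scale_weight_rows(W, s), s)
print(y_ref, y_awq)
```

The two outputs agree up to floating-point rounding. The benefit appears only once the scaled weights are quantized, since the scales change how the quantization error is distributed across channels.
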
### Changes Fixed a bug with clamp ranges in scale estimation. ### Reason for changes Bug fix. ### Related tickets ### Tests Unit test.
### Changes Implemented compression to fixed codebook (LUT) values. ### Reason for changes CVS-167084 ### Related tickets CVS-167084 ### Tests tests/openvino/native/quantization/test_weights_compression.py
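
As a rough illustration of compression to a fixed codebook, the sketch below normalizes the weights by a single scale, stores the index of the nearest codebook entry for each weight, and decompresses by lookup. The codebook values, the max-abs scale, and the function names are assumptions made for this example, not the API or values used in the PR.

```python
# Hypothetical fixed codebook (lookup table) of normalized values.
CODEBOOK = [-1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0]

def compress(weights, codebook):
    # One scale per group; fall back to 1.0 if all weights are zero.
    scale = max(abs(w) for w in weights) or 1.0
    indices = [
        min(range(len(codebook)), key=lambda k: abs(w / scale - codebook[k]))
        for w in weights
    ]
    return indices, scale

def decompress(indices, scale, codebook):
    # Look up each stored index and undo the normalization.
    return [codebook[i] * scale for i in indices]

weights = [0.8, -0.4, 0.15, 0.0]
indices, scale = compress(weights, CODEBOOK)
restored = decompress(indices, scale, CODEBOOK)
print(indices, restored)
```

Only the small integer indices and the scale need to be stored; the codebook itself is shared.
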
### Changes Implemented codebook computation based on the k-means algorithm. ### Reason for changes ### Related tickets CVS-169609 ### Tests
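
A codebook can be derived from the weight values themselves with k-means. The sketch below is a plain 1-D Lloyd's iteration with evenly spaced initial centers. The initialization, iteration count, and function name are assumptions for illustration and do not reflect the PR's implementation.

```python
def kmeans_codebook(values, k, iters=20):
    lo, hi = min(values), max(values)
    # Deterministic initialization: centers evenly spaced over the value range.
    if k > 1:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centers = [sum(values) / len(values)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        # Assignment step: each value goes to its nearest center.
        for v in values:
            j = min(range(k), key=lambda c: abs(v - centers[c]))
            buckets[j].append(v)
        # Update step: move each center to the mean of its bucket;
        # an empty bucket keeps its previous center.
        centers = [
            sum(b) / len(b) if b else centers[j] for j, b in enumerate(buckets)
        ]
    return sorted(centers)

values = [-1.0, -0.9, -1.1, 0.0, 0.1, -0.1, 1.0, 0.9, 1.1]
codebook = kmeans_codebook(values, k=3)
print(codebook)
```

For this clustered toy input the centers settle near -1, 0, and 1. On real weight tensors the result depends on the initialization and on how the values are normalized beforehand.
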