Use fp32 inference precision
### Changes

- Use fp32 inference precision

### Reason for changes

Ref: 140438

### Related tickets

Ref: 140438
@alexsu52 @KodiaqQ Are you okay with how this was implemented, or are any changes required? Please review, as I will start validation soon.
Should I use the fp32 hint here as well?
https://github.com/openvinotoolkit/nncf/blob/dfc5d78f76406403ae25c3bbbc7c5881cd8837f8/nncf/quantization/algorithms/weight_compression/openvino_backend.py#L348
https://github.com/openvinotoolkit/nncf/blob/dfc5d78f76406403ae25c3bbbc7c5881cd8837f8/nncf/quantization/algorithms/weight_compression/openvino_backend.py#L381
@alexsu52 @KodiaqQ Please review
Yes, you should use the fp32 hint there as well.
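For context, a minimal sketch of what passing the fp32 hint to OpenVINO looks like. `INFERENCE_PRECISION_HINT` is OpenVINO's standard device property; the helper name `make_compile_config` is hypothetical for illustration and is not the NNCF implementation:

```python
# Hypothetical helper: builds the device config that requests fp32 execution.
# "INFERENCE_PRECISION_HINT" is OpenVINO's device property controlling the
# inference precision; "f32" asks for full-precision math on devices that
# would otherwise default to a lower precision (e.g. bf16 on some CPUs).
FP32_CONFIG = {"INFERENCE_PRECISION_HINT": "f32"}


def make_compile_config(force_fp32: bool = True) -> dict:
    """Return a config dict suitable for ov.Core().compile_model(...)."""
    return dict(FP32_CONFIG) if force_fp32 else {}


# Usage (requires openvino installed; `model` is an ov.Model):
#   import openvino as ov
#   compiled = ov.Core().compile_model(model, "CPU", make_compile_config())
```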
CI: NNCF/manual/post_training_quantization (develop): Build #517, Build #516