Use fp32 inference precision
### Changes

- Use fp32 inference precision

### Reason for changes

Ref: 140438

### Related tickets

Ref: 140438
@alexsu52 @KodiaqQ Are you okay with how this was implemented, or are any changes required? Please review, as I will start validation soon.
Should I use the fp32 hint here as well?
https://github.com/openvinotoolkit/nncf/blob/dfc5d78f76406403ae25c3bbbc7c5881cd8837f8/nncf/quantization/algorithms/weight_compression/openvino_backend.py#L348
https://github.com/openvinotoolkit/nncf/blob/dfc5d78f76406403ae25c3bbbc7c5881cd8837f8/nncf/quantization/algorithms/weight_compression/openvino_backend.py#L381
@alexsu52 @KodiaqQ Please review
Yes, you should use the fp32 hint there as well.
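For context, a minimal sketch of what passing the fp32 hint to OpenVINO looks like. `INFERENCE_PRECISION_HINT` is OpenVINO's standard device property; the helper name `make_compile_config` is hypothetical for illustration and is not the NNCF implementation:

```python
# Hypothetical helper: builds the device config that requests fp32 execution.
# "INFERENCE_PRECISION_HINT" is OpenVINO's device property controlling the
# inference precision; "f32" asks for full-precision math on devices that
# would otherwise default to a lower precision (e.g. bf16 on some CPUs).
FP32_CONFIG = {"INFERENCE_PRECISION_HINT": "f32"}


def make_compile_config(force_fp32: bool = True) -> dict:
    """Return a config dict suitable for ov.Core().compile_model(...)."""
    return dict(FP32_CONFIG) if force_fp32 else {}


# Usage (requires openvino installed; `model` is an ov.Model):
#   import openvino as ov
#   compiled = ov.Core().compile_model(model, "CPU", make_compile_config())
```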
CI: NNCF/manual/post_training_quantization (develop): Build #517, Build #516