
Synchronize quantizer setups for DistributedDataParallel cases

Open vshampor opened this issue 5 years ago • 3 comments

Now that the quantizer setup is decided during `create_compressed_model`, and for precision init cases the resulting setup depends on the data loaders used for initialization, each process in a DistributedDataParallel (DDP) group may receive significantly different data values and therefore compute a different quantizer setup. Since the quantizer setup object is not a `torch.Tensor`, it cannot be broadcast to the other processes using the built-in PyTorch collectives. A special tensor-only synchronization mechanism is required so that precision init (which determines the quantizer setup) runs in only one process of the DDP group, and the resulting quantizer setup is then broadcast to the other processes in the group.

vshampor avatar Jan 22 '21 09:01 vshampor

Hi @vshampor, was this implemented in nncf?

fxmarty avatar Apr 18 '23 11:04 fxmarty