YingzhePeng
> You can find a DDP training example here:
>
> https://github.com/webdataset/webdataset-imagenet
>
> At this point, there are several ways of dealing with DDP training:
>
> (1) use...
> > Will resample=True lead to different devices getting the same data?
>
> It should. The RNGs are initialized differently on each host and worker.
>
>...
Hi, thanks for your interest in this work. The distillation model consists of a small ViT and a small transformer. These model architectures are similar to the original CLIP model (e.g. vit32-B)....
Sure, but it will take me some time to revisit the code. I will reorganize it and upload a new version to this repository.
You can now check out the new version of the code! If you have any further questions, please feel free to ask 👏.