Model
What model architectures have you used to encode text and images?
Hi, thanks for your interest in this work. The distilled model consists of a small ViT (image encoder) and a small Transformer (text encoder). Both architectures are similar to those of the original CLIP model (e.g. ViT-B/32), just smaller.
Btw, the distillation code is in the master branch, and the app code is in the main branch.
You can check out the master branch to see the implementation details.
Thanks!
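
For readers who want a rough feel for what "a small ViT and a small Transformer" means relative to CLIP ViT-B/32, here is a minimal sketch. The teacher sizes are the published CLIP ViT-B/32 tower sizes; the student sizes are hypothetical placeholders and are not taken from this repository, and `EncoderConfig`, `approx_block_params`, and `compression_ratio` are names made up for this illustration. The parameter count is only an approximation that ignores embeddings, biases, and LayerNorm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Size of one Transformer encoder tower (image or text)."""
    width: int   # hidden dimension d
    layers: int  # number of Transformer blocks
    heads: int   # attention heads per block

    def approx_block_params(self) -> int:
        # Per block: ~4*d^2 for the attention projections plus ~8*d^2
        # for an MLP with 4x expansion. Biases, LayerNorm, and
        # embeddings are ignored, so this is an estimate only.
        return 12 * self.layers * self.width ** 2


# Teacher: published CLIP ViT-B/32 tower sizes.
TEACHER_IMAGE = EncoderConfig(width=768, layers=12, heads=12)
TEACHER_TEXT = EncoderConfig(width=512, layers=12, heads=8)

# Student: HYPOTHETICAL sizes for illustration, not this repo's values.
STUDENT_IMAGE = EncoderConfig(width=384, layers=6, heads=6)
STUDENT_TEXT = EncoderConfig(width=256, layers=6, heads=4)


def compression_ratio(teacher: EncoderConfig, student: EncoderConfig) -> float:
    """How many times fewer block parameters the student has."""
    return teacher.approx_block_params() / student.approx_block_params()


if __name__ == "__main__":
    print("image tower ratio:", compression_ratio(TEACHER_IMAGE, STUDENT_IMAGE))
    print("text tower ratio:", compression_ratio(TEACHER_TEXT, STUDENT_TEXT))
```

Halving both width and depth, as assumed here, cuts the block parameters by about 8x; the actual sizes used for distillation are in the master branch.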
Hello, could you share the full version of the code needed to run the model distillation process on a computer (for experiments)? I would be very grateful if you could send it by email: [email protected]
Sure, but it will take me some time to go back through the code. I will reorganize it and publish a new version in this repository.
Thank you so much, I will be waiting!!!
You can now check out the new version of the code! If you have any further questions, please feel free to ask 👏.