ZOUHAN1
I would like to understand why masking is used in the text encoder. This doesn't seem necessary for CLIP since it does not perform an autoregressive task. Maybe my understanding...
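For readers unfamiliar with the term, here is a minimal sketch of what the masking in question might look like, assuming it is a causal (upper-triangular) additive attention mask. The helper names `causal_mask` and `masked_softmax` are hypothetical and written only for illustration; they are not taken from the CLIP codebase.

```python
import math

def causal_mask(n):
    """Additive mask: 0.0 where attention is allowed, -inf above the diagonal."""
    return [[0.0 if j <= i else -math.inf for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over (scores + mask)."""
    out = []
    for score_row, mask_row in zip(scores, mask):
        logits = [s + m for s, m in zip(score_row, mask_row)]
        peak = max(logits)  # always finite: the diagonal entry is never masked
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

n = 4
# Uniform raw scores, so any asymmetry in the weights comes from the mask alone.
uniform_scores = [[0.0] * n for _ in range(n)]
weights = masked_softmax(uniform_scores, causal_mask(n))
for row in weights:
    print([round(w, 2) for w in row])
```

With such a mask, each token position puts weight only on itself and earlier positions, which is the behaviour usually associated with autoregressive models and is what the question is asking about.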