alvanli
Also want to add these two here - (MOCOv3) https://arxiv.org/pdf/2104.02057.pdf - (How to train your ViT) https://arxiv.org/pdf/2106.10270.pdf
EfficientViT: https://arxiv.org/pdf/2205.14756.pdf - we can leave this one for later, there's no official code implementation yet
thank you for the consistently detailed explanations! I learned a lot reading these tickets 🤗
just tested: `unsloth/Qwen2.5-VL-3B-Instruct` works and doesn't need the data-collator casting