Rick Battle
Results: 3 issues of Rick Battle
While training, the amount of system RAM used scales linearly with the number of GPUs used. If training on 1 GPU takes 64GB of system RAM, then training on 3...
**Describe** Model: I use MiniLMv2 for a lot of tasks. DeBERTa can outperform both BERT and RoBERTa. Can you please distill MiniLMv2 from DeBERTa-Large? Thank you!