Jeremy-lf
When I run `python run.py train --path_prefix /home/ccf_disk/dataset/imagenet --train_info train.txt --num_epochs 50`, the problem described above occurs. What's wrong?
Batch Number: 182 of 196, Top-1 Hit: 353, Top-5 Hit: 880, Loss 27.7487, Top-1 Accuracy: 0.0075, Top-5 Accuracy: 0.0188
Batch Number: 183 of 196, Top-1 Hit: 353, Top-5 Hit: 881,...
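When training the VIMER-UFO large model on the Ascend 910B, this error is raised consistently after a few epochs of training. How can it be resolved? Contact via 如流 (Ruliu): lvfeng02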
### Question
When I run pretraining, I hit the following problem. There is no obvious hint about the cause. How should I solve it? Environment: 8x A800, CUDA 11.6. My training script looks like this:...