Joe Young
Hi guys, I can now fine-tune with 'visionbranch_stage2_finetune.yaml' on **four** A100 80GB GPUs using gradient accumulation. At what point is the loss considered to have converged? For...
Hi @brandonwagstaff, The code uses a transposed form of the WLS equation to solve for the normal vector. Why? My guess is that you want to avoid the cost of computing the transpose. Thanks!
Hi guys, Thank you for your excellent work. I would like to know which ADAS video datasets you used for training. Thanks!
Hi DeepSpeed team, Thank you for your great work! As the title suggests, the "01-ai/Yi-34B-Chat" model cannot run properly with DeepSpeed-MII version 0.2.3. The error message is as follows:...