Sorobot - Chengyu Zhang
I'm new to this field, and would greatly appreciate it if you could upload some follow-up content!
Was the **lightweight** model mentioned in the article trained with the training configuration described in the article? If not, are any additional operations required? Thank you so much for...
The forward_feature method produces exploding values (for depths=6, in the later layers).
I printed the output of each RSTB block and found that it wasn't scaled correctly, resulting in severe numerical explosion. At float16 precision, it even caused an overflow error;...
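
For anyone trying to reproduce this, here is a minimal sketch of how one might check which block first leaves the float16 range, given the maximum absolute activation recorded after each block. It uses only the Python standard library. The helper names `fits_float16` and `first_overflow` are hypothetical, and the per-block values are made-up numbers for illustration, not measurements from the model.

```python
import math
import struct


def fits_float16(value: float) -> bool:
    """Return True if `value` is finite and can be stored as a finite float16."""
    if not math.isfinite(value):
        return False
    try:
        # The "e" format is IEEE 754 half precision; packing a value that
        # rounds beyond the largest finite float16 (65504) raises OverflowError.
        struct.pack("<e", value)
        return True
    except OverflowError:
        return False


def first_overflow(block_maxima):
    """Return the index of the first block whose max |activation| does not
    fit in float16, or None if every block fits."""
    for index, value in enumerate(block_maxima):
        if not fits_float16(abs(value)):
            return index
    return None


# Hypothetical max |activation| after each of six blocks (illustrative only).
example_maxima = [3.2, 41.0, 780.0, 9.1e3, 7.4e4, 6.0e5]
print(first_overflow(example_maxima))  # -> 4
```

In practice the per-block maxima would come from the model itself, for example by recording the largest absolute value of each block's output during a float32 forward pass, so the growth can be observed before it overflows in float16.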