Bad loss spike when training nerfacto with a high number of iterations
Describe the bug I'm training nerfacto on my dataset with 300k iterations, since I want to keep the original data resolution (10 images at 6k×4k). I know the standard is training for 30k, but I don't understand the bad loss spike when training for longer. Is this an implementation or a configuration issue? Can you give me a pointer to solving this?
This seems like a numerical stability issue, which often happens when training for too long. It's hard to tell without digging into the network, activations, etc., but you could try tuning the learning rate or weight decay parameters (you can search the output of ns-train nerfacto --help for lr and weight-decay).
A small amount of weight decay (1e-4? 1e-3?) in particular might help with stability, though regularization might also cost some PSNR.
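To make the suggestion concrete, here's a minimal sketch of what adding weight decay looks like at the optimizer level. This is plain PyTorch with a toy stand-in model, not nerfstudio's actual config (in practice you'd set this through the nerfacto optimizer flags found via `--help`):

```python
import torch

# Toy network standing in for the NeRF field (assumption: illustration only;
# the real optimizer setup lives in nerfstudio's config system).
model = torch.nn.Linear(3, 1)

# weight_decay adds L2 regularization, which can keep weights from drifting
# to extreme values over very long runs (the value 1e-4 is a starting guess).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

x = torch.randn(8, 3)
target = torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()
```

The trade-off mentioned above applies here: larger `weight_decay` values stabilize long runs more aggressively but pull weights toward zero, which can reduce reconstruction quality.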