Results: 4 comments by R Zach

Is there anyone who could help me out with this?

I saw something similar, fwiw: exploding gradients in the gradient rescaling, starting from the very first forward pass. I've read in other threads online that this is somewhat common in...

Could this be a result of the [TalkPython 100 days of code course](https://training.talkpython.fm/courses/explore_100days_in_python/100-days-of-code-in-python)? The authors reference this repo (or at least, [this transcript page indicates they do](https://training.talkpython.fm/courses/transcript/100-days-of-code-in-python/lecture/161402), I don't have...