I broke it
how did u break it? @CoderUnkn0wn
You can see in the image all of the settings I used. Then I just let it run and suddenly everything was NaN.
coooll
This is not a bug; it is normal behavior.
Second Witness
"Breaking" the model by causing the weights to blow up to infinity isn't difficult to do. Setting the learning rate high at the start of a complex model causes the weights to explode to high values, evaluate to Infinity, and then the next epoch reports them all as "NaN."
Discussion
I think it is normal behavior; it's just the way the math works out. With the learning rate set that high, each update overshoots the minimum by more than the previous step gained, so the weights diverge instead of converging.
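As a one-line worked example (a single-weight quadratic loss with curvature λ, chosen purely for illustration, not the playground's actual objective), the divergence condition falls straight out of the gradient-descent update:

```latex
L(w) = \tfrac{\lambda}{2} w^2
\quad\Rightarrow\quad
w_{t+1} = w_t - \eta \nabla L(w_t) = (1 - \eta\lambda)\, w_t = (1 - \eta\lambda)^{t+1} w_0
```

So $|w_t|$ grows geometrically, and eventually overflows, exactly when $|1 - \eta\lambda| > 1$, i.e. when the learning rate $\eta$ exceeds $2/\lambda$.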
This kind of playground is the perfect place to experiment with settings like that and see firsthand how the models behave when given diverse kinds of inputs. A student researcher will quickly learn to be more intentional about using a high learning rate!
Minimal Reproduction
I was able to reproduce the previously reported behavior with the following steps (a rough code sketch follows the list):
- Set the learning rate to the maximum
- Max out the number of hidden layers and hidden nodes
- Set the problem type to "regression"
- Step through the training one epoch at a time
- Watch the values overflow within 14 training epochs
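Here is a rough NumPy stand-in for those steps (the layer sizes, learning rate, and data below are hypothetical placeholders, not the playground's actual values): a regression MLP with maxed-out hidden layers, stepped one epoch at a time until the weights go non-finite.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the playground: a regression MLP with
# several tanh hidden layers, trained by plain gradient descent on MSE.
X = rng.normal(size=(64, 2)).astype(np.float32)
y = (X[:, :1] * X[:, 1:]).astype(np.float32)  # arbitrary regression target

sizes = [2, 8, 8, 8, 8, 8, 8, 1]              # "maxed out" hidden layers/nodes
Ws = [rng.normal(scale=0.5, size=(m, n)).astype(np.float32)
      for m, n in zip(sizes, sizes[1:])]
lr = np.float32(1e3)                          # learning rate "at the maximum"

with np.errstate(all="ignore"):               # silence overflow warnings
    for epoch in range(1, 31):                # step one epoch at a time
        # Forward pass: tanh hidden layers, linear output.
        acts = [X]
        for W in Ws[:-1]:
            acts.append(np.tanh(acts[-1] @ W))
        out = acts[-1] @ Ws[-1]

        # Backward pass for mean-squared error.
        grad = 2 * (out - y) / len(y)
        grads = []
        for W, a in zip(reversed(Ws), reversed(acts)):
            grads.append(a.T @ grad)           # gradient w.r.t. this layer's W
            grad = (grad @ W.T) * (1 - a * a)  # tanh' (final iteration unused)
        for W, g in zip(Ws, reversed(grads)):
            W -= lr * g

        if not all(np.isfinite(W).all() for W in Ws):
            print(f"weights became non-finite at epoch {epoch}")
            break
```

The exact epoch at which it fails depends on the seed and the sizes, but the mechanism matches the report above: overshoot, overflow to Infinity, then NaN.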