MLP is equivalent to a single linear layer
One more issue: your MLP implementation in model.py is just a stack of linear layers with no non-linearities between them. That is mathematically equivalent to a single linear layer with in_dim = dims[0] and out_dim = dims[-1], because composing linear maps just multiplies the weight matrices together.
Why not apply an activation between the layers? Something like the sketch below.
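For reference, here is a minimal sketch of what I mean (I'm assuming a `dims` list and a PyTorch-style `nn.Module`, since I'm going from memory of model.py, not its exact signatures):

```python
import torch.nn as nn


class MLP(nn.Module):
    """MLP with a non-linearity inserted between consecutive linear layers."""

    def __init__(self, dims, act=nn.ReLU):
        super().__init__()
        layers = []
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            # Skip the activation after the last layer so the output is unconstrained.
            if i < len(dims) - 2:
                layers.append(act())
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```

With the activations in place, each hidden layer adds real representational capacity instead of collapsing into one matrix multiply.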