implicit-hyper-opt
Thank you for releasing the code! I'm trying to apply this method to train language models, where I partition the parameters and set up a bilevel problem in which the outer parameters...
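For context on this kind of bilevel setup, the implicit-function-theorem hypergradient can be sanity-checked on a toy problem where the inner optimum has a closed form. Below is a minimal NumPy sketch (my own illustration, not code from this repo; the quadratic losses and names like `inner_solution` are hypothetical), verified against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD curvature of the inner (training) loss
b = rng.standard_normal(n)
w_target = rng.standard_normal(n)

def inner_solution(lam):
    # w*(lam) = argmin_w 0.5*w'Aw - b'w + 0.5*lam*||w||^2  (closed form)
    return np.linalg.solve(A + lam * np.eye(n), b)

def val_loss(w):
    # outer (validation) objective, evaluated at the inner optimum
    return 0.5 * np.sum((w - w_target) ** 2)

lam = 0.3
w_star = inner_solution(lam)
H = A + lam * np.eye(n)              # d^2 L_train / dw^2 at w*
mixed = w_star                       # d^2 L_train / (dw dlam) = w
d_val_d_w = w_star - w_target

# IFT hypergradient: dL_val/dlam = -(dL_val/dw) @ H^{-1} @ (d^2 L_train / dw dlam)
hypergrad = -d_val_d_w @ np.linalg.solve(H, mixed)

# finite-difference check on the outer objective
eps = 1e-6
fd = (val_loss(inner_solution(lam + eps))
      - val_loss(inner_solution(lam - eps))) / (2 * eps)
```

The same structure carries over when the inner problem has no closed form: the Hessian solve is then approximated (e.g. by the Neumann series used in this repo) instead of computed exactly.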
Hi, thanks for sharing the code! I have a question about how to calculate d l_train / d w. Should we use all training samples, or at least a few...
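For what it's worth, d l_train / d w is just a gradient of the training loss, so averaging the gradients of disjoint, equal-size minibatches that cover the training set reproduces the full-batch gradient exactly, while a single minibatch gives an unbiased but noisier estimate. A small NumPy sketch (hypothetical least-squares data, not from this repo):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 10_000, 3
X = rng.standard_normal((N, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(N)
w = rng.standard_normal(d)

def grad_mse(idx):
    # gradient of the mean-squared-error training loss over rows `idx`
    r = X[idx] @ w - y[idx]
    return X[idx].T @ r / len(idx)

full = grad_mse(np.arange(N))

# averaging over disjoint minibatches that cover the data is exact
batches = [grad_mse(np.arange(i, i + 2000)) for i in range(0, N, 2000)]

# a single minibatch is an unbiased (noisy) estimate of `full`
mini = grad_mse(rng.choice(N, size=512, replace=False))
```

In practice the hypergradient pipeline typically uses a minibatch estimate here, trading variance for cost.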
I cannot reproduce the results of the "data augmentation" experiment from the paper. Could you upload a better version of the "data augmentation" experiment that contains the setting of...
Running `python train_augment_net2.py --use_augment_net` as suggested by the README results in:

```
Could not open finetuned_checkpoints/dataset:mnist_datasize:1600_hyperparam:weightDecayGlobal_seed:1.pkl
/sailhome/motiwari/anaconda3/envs/ift-env/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3420: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/sailhome/motiwari/anaconda3/envs/ift-env/lib/python3.7/site-packages/numpy/core/_methods.py:188: RuntimeWarning: invalid value encountered in...
```
When running `train.py` in `/rnn`, PyTorch raises: `RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.`
Hi, I'm confused by the function `neumann_hyperstep_preconditioner`, since I found two versions of it. The one in `rnn/train.py` computes the `hessian_term` as: `hessian_term = (counter.view(1, -1) @ d_train_loss_d_w.view(-1, 1)...
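For reference, both versions approximate an inverse-Hessian-vector product with a truncated Neumann series, H⁻¹v ≈ α Σₖ (I − αH)ᵏ v, which converges when the spectral radius of (I − αH) is below 1. A minimal NumPy sketch of the idea (my own reconstruction, not the repo's code; it uses an explicit SPD matrix in place of Hessian-vector products, and only the `counter` accumulator name is borrowed from the issue snippet):

```python
import numpy as np

def neumann_inverse_hvp(hvp, v, alpha, num_terms=200):
    # Truncated Neumann series: H^{-1} v ~= alpha * sum_{k=0}^{K} (I - alpha*H)^k v,
    # valid when the spectral radius of (I - alpha*H) is below 1.
    counter = v.copy()   # current term (I - alpha*H)^k v
    acc = v.copy()       # running sum of the series
    for _ in range(num_terms):
        counter = counter - alpha * hvp(counter)
        acc = acc + counter
    return alpha * acc

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)          # SPD stand-in for the training Hessian
v = rng.standard_normal(n)

alpha = 1.0 / np.linalg.norm(H, 2)   # scale by the spectral norm to ensure convergence
approx = neumann_inverse_hvp(lambda u: H @ u, v, alpha)
exact = np.linalg.solve(H, v)
```

In the real code, `hvp` would be a `torch.autograd.grad`-based Hessian-vector product rather than a matrix multiply.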
When executing the code as-is, PyTorch throws an error when trying to subsample the datasets according to the command-line parameter. The attribute needs to be changed from `train_data` to...
```
python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5 --restart=10 --model=mlp --dataset=mnist --num_layers=1 --hessian=kfac --jacobian=direct
```

should be

```
python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5...
```