implicit-hyper-opt
Thank you for releasing the code! I'm trying to apply this method to train language models, where I partition the parameters and set up a bilevel problem in which the outer parameters...
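For context on this kind of bilevel setup, the implicit-function-theorem hypergradient can be sanity-checked on a toy problem where the inner optimum has a closed form. Below is a minimal NumPy sketch (my own illustration, not code from this repo; the quadratic losses and names like `inner_solution` are hypothetical), verified against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD curvature of the inner (training) loss
b = rng.standard_normal(n)
w_target = rng.standard_normal(n)

def inner_solution(lam):
    # w*(lam) = argmin_w 0.5*w'Aw - b'w + 0.5*lam*||w||^2  (closed form)
    return np.linalg.solve(A + lam * np.eye(n), b)

def val_loss(w):
    # outer (validation) objective, evaluated at the inner optimum
    return 0.5 * np.sum((w - w_target) ** 2)

lam = 0.3
w_star = inner_solution(lam)
H = A + lam * np.eye(n)              # d^2 L_train / dw^2 at w*
mixed = w_star                       # d^2 L_train / (dw dlam) = w
d_val_d_w = w_star - w_target

# IFT hypergradient: dL_val/dlam = -(dL_val/dw) @ H^{-1} @ (d^2 L_train / dw dlam)
hypergrad = -d_val_d_w @ np.linalg.solve(H, mixed)

# finite-difference check on the outer objective
eps = 1e-6
fd = (val_loss(inner_solution(lam + eps))
      - val_loss(inner_solution(lam - eps))) / (2 * eps)
```

The same structure carries over when the inner problem has no closed form: the Hessian solve is then approximated (e.g. by the Neumann series used in this repo) instead of computed exactly.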
Hi, thanks for sharing the code! I have a question about how to calculate d l_train / d w. Should we use all training samples, or at least a few...
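For what it's worth, d l_train / d w is just a gradient of the training loss, so averaging the gradients of disjoint, equal-size minibatches that cover the training set reproduces the full-batch gradient exactly, while a single minibatch gives an unbiased but noisier estimate. A small NumPy sketch (hypothetical least-squares data, not from this repo):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 10_000, 3
X = rng.standard_normal((N, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(N)
w = rng.standard_normal(d)

def grad_mse(idx):
    # gradient of the mean-squared-error training loss over rows `idx`
    r = X[idx] @ w - y[idx]
    return X[idx].T @ r / len(idx)

full = grad_mse(np.arange(N))

# averaging over disjoint minibatches that cover the data is exact
batches = [grad_mse(np.arange(i, i + 2000)) for i in range(0, N, 2000)]

# a single minibatch is an unbiased (noisy) estimate of `full`
mini = grad_mse(rng.choice(N, size=512, replace=False))
```

In practice the hypergradient pipeline typically uses a minibatch estimate here, trading variance for cost.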
I cannot reproduce the results of the "data augmentation" experiment from the paper. Could you upload a better version of the "data augmentation" experiment that contains the setting of...
Running `python train_augment_net2.py --use_augment_net` as suggested by the README results in:

```
Could not open finetuned_checkpoints/dataset:mnist_datasize:1600_hyperparam:weightDecayGlobal_seed:1.pkl
/sailhome/motiwari/anaconda3/envs/ift-env/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3420: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/sailhome/motiwari/anaconda3/envs/ift-env/lib/python3.7/site-packages/numpy/core/_methods.py:188: RuntimeWarning: invalid value encountered in...
```
When running `train.py` in `/rnn`, PyTorch raises: `RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.`
Hi, I'm confused by the function `neumann_hyperstep_preconditioner`, since I found two versions of it. The one in `rnn/train.py` computes the `hessian_term` as: `hessian_term = (counter.view(1, -1) @ d_train_loss_d_w.view(-1, 1)...
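For reference, both versions approximate an inverse-Hessian-vector product with a truncated Neumann series, H⁻¹v ≈ α Σₖ (I − αH)ᵏ v, which converges when the spectral radius of (I − αH) is below 1. A minimal NumPy sketch of the idea (my own reconstruction, not the repo's code; it uses an explicit SPD matrix in place of Hessian-vector products, and only the `counter` accumulator name is borrowed from the issue snippet):

```python
import numpy as np

def neumann_inverse_hvp(hvp, v, alpha, num_terms=200):
    # Truncated Neumann series: H^{-1} v ~= alpha * sum_{k=0}^{K} (I - alpha*H)^k v,
    # valid when the spectral radius of (I - alpha*H) is below 1.
    counter = v.copy()   # current term (I - alpha*H)^k v
    acc = v.copy()       # running sum of the series
    for _ in range(num_terms):
        counter = counter - alpha * hvp(counter)
        acc = acc + counter
    return alpha * acc

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)          # SPD stand-in for the training Hessian
v = rng.standard_normal(n)

alpha = 1.0 / np.linalg.norm(H, 2)   # scale by the spectral norm to ensure convergence
approx = neumann_inverse_hvp(lambda u: H @ u, v, alpha)
exact = np.linalg.solve(H, v)
```

In the real code, `hvp` would be a `torch.autograd.grad`-based Hessian-vector product rather than a matrix multiply.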
When executing the code as-is, PyTorch throws an error when trying to subsample the datasets according to the command-line parameter. The attribute needs to be changed from `train_data` to...
```
python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5 --restart=10 --model=mlp --dataset=mnist --num_layers=1 --hessian=kfac --jacobian=direct
```

should be

```
python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5...
```