南雍山野猪骑士
南雍山野猪骑士
From the observation of training results, the hard mask's weights between the constrained layers are not exactly aligned. https://github.com/MingSun-Tse/ASSL/blob/a564556c8b578c2ee86d135044f088bfeaafc707/src/pruner/utils.py#L71
In PyTorch >= 1.8, this `register_backward_hook` can't be used directly, and `register_full_backward_hook` always return "using input grad before output grad bla...". Did you know how to replace `register_backward_hook` in higher...
Hi, @MingSun-Tse @yulunzhang ,we are all interesting in your this work, but I also meet reproduce problems. We using this code and training settings, but can't reproduce same results in...
### System Info ``` File "/home/xxx/anaconda3/envs/xxx/lib/python3.11/site-packages/accelerate/state.py", line 211, in __init__ torch.cuda.set_device(self.device) File "/home/xxx/anaconda3/envs/xxx/lib/python3.11/site-packages/torch/cuda/__init__.py", line 350, in set_device torch._C._cuda_setDevice(device) RuntimeError: CUDA error: invalid device ordinal CUDA kernel errors might be asynchronously...