Bug report
In /src/models/gain.py:
```python
86 gradient = logits * labels_ohe
87 grad_logits = (logits * labels_ohe).sum()
88 grad_logits.backward(gradient=gradient, retain_graph=True)
```
Line 88 raises an error:

```
File "C:\ProgramData\Anaconda2\envs\python3\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected shape [] but got [10, 2]
```
If I modify line 88 to `grad_logits.backward(gradient=grad_logits, retain_graph=True)`, then everything works. Basically, I just replaced `gradient` with `grad_logits`. Is this a typo in your original commit?
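For context, `Tensor.backward(gradient=...)` requires `gradient` to match the shape of the tensor it is called on; a scalar needs no `gradient` argument at all (it defaults to 1.0). A minimal sketch reproducing the error, using stand-in tensors (shapes taken from the error message, not the actual data in gain.py):

```python
import torch

# Hypothetical stand-ins for the tensors in gain.py
logits = torch.randn(10, 2, requires_grad=True)
labels_ohe = torch.zeros(10, 2)
labels_ohe[torch.arange(10), torch.randint(0, 2, (10,))] = 1.0

gradient = (logits * labels_ohe).detach()   # shape [10, 2]
grad_logits = (logits * labels_ohe).sum()   # scalar, shape []

# Fails: `gradient` must have the same shape as the tensor backward() is called on
try:
    grad_logits.backward(gradient=gradient, retain_graph=True)
except RuntimeError as e:
    print(e)  # shape mismatch: expected [] but got [10, 2]

# Works: a scalar output needs no `gradient` argument
grad_logits.backward(retain_graph=True)
```

Since the gradient of `(logits * labels_ohe).sum()` with respect to `logits` is just `labels_ohe`, the scalar backward above populates `logits.grad` with the one-hot labels.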
@ngxbac and @guopengf: changing line 87 from `grad_logits = (logits * labels_ohe).sum()` to `grad_logits = (logits * labels_ohe)`
also solves this issue. Any idea what the difference between these two approaches is?
The backward_features have shape BSx512x7x7 in both cases.
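Not an authoritative answer, but the mechanics can be sketched with made-up tensors (not the ones in gain.py): `t.sum().backward()` is equivalent to `t.backward(gradient=torch.ones_like(t))`, whereas passing some other tensor `g` as `gradient` computes a vector-Jacobian product weighted by `g`. So the two fixes produce gradients of the same shape but, in general, different values:

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y = x * x  # stand-in for logits * labels_ohe

# Fix 1: reduce to a scalar first -> implicit weights of all ones
g_sum = torch.autograd.grad(y.sum(), x, retain_graph=True)[0]  # 2 * x

# Fix 2: keep the tensor and pass weights explicitly -> vector-Jacobian product
g_vjp = torch.autograd.grad(y, x, grad_outputs=y.detach())[0]  # 2 * x * y

print(g_sum)  # each element's plain gradient
print(g_vjp)  # each element's gradient, scaled by its own value of y
```

In the original code the weights passed as `gradient` were `logits * labels_ohe` itself rather than ones, so the two variants would only agree where those weights happen to equal 1.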