Bjarke Dalhoff Christensen
Results: 3 comments of Bjarke Dalhoff Christensen
@Hamsss Did you have any success in adapting it to non-shared parameters? @tk-rusch Is this a requirement for the Gradient Gating framework to work? Sorry if this is a novice...
@tk-rusch Thank you for the swift reply and example.
@tk-rusch Just to be clear, I can have multiple layers that do not share weights, and I can apply a gradient gate to each of these layers. Is that correctly...