deepmind-research
Implementation typo in NFNet?
Hi, should this line (https://github.com/deepmind/deepmind-research/blob/master/nfnets/nf_resnet.py#L181) be
out = self.activation(x / self.beta)
instead of
out = self.activation(x) * self.beta
Following the paper, h_{i+1} = h_i + \alpha * f_i(h_i / \beta), so shouldn't x be divided by beta before the activation is applied, rather than multiplied by beta afterwards? Although the two implementations are the same when beta=1 and activation=relu, they differ in other cases.
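To make the difference concrete, here is a minimal numerical sketch. It uses plain Python scalars instead of tensors, and the helper names (`divide_first`, `multiply_after`, `max_gap`), the sample inputs, and the beta values are all hypothetical choices for illustration, not code from the repo.

```python
import math


def relu(x):
    return max(x, 0.0)


def gelu(x):
    # Exact GELU, written with the Gaussian CDF (via erf).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def divide_first(act, x, beta):
    # Form suggested in this issue: activation(x / beta).
    return act(x / beta)


def multiply_after(act, x, beta):
    # Form quoted from the linked line: activation(x) * beta.
    return act(x) * beta


def max_gap(act, beta, xs):
    # Largest absolute difference between the two forms over the sample inputs.
    return max(
        abs(divide_first(act, x, beta) - multiply_after(act, x, beta))
        for x in xs
    )


xs = [-2.0, -0.5, 0.0, 0.5, 2.0]  # arbitrary sample inputs
for name, act in [("relu", relu), ("gelu", gelu)]:
    for beta in (1.0, 2.0):  # arbitrary beta values
        print(f"{name} beta={beta}: max gap = {max_gap(act, beta, xs)}")
```

With beta=1 the gap is zero for both activations. One more thing I noticed: relu is positively homogeneous, so relu(x / beta) equals relu(x) / beta for beta > 0. That means the two forms would also agree for relu if `self.beta` actually stored 1/beta, which I have not checked.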