Results: 3 issues of Dary
According to what you wrote: _“That is, the output of each sub-layer is $\mathrm{LayerNorm}(x + \mathrm{Sublayer}(x))$, where $\mathrm{Sublayer}(x)$ is the function implemented by the sub-layer itself. We apply dropout [(cite)](http://jmlr.org/papers/v15/srivastava14a.html)...
Which version of tensorrtx was this modified from?
This just keeps popping up.