Dary

Results: 3 issues by Dary

According to what you wrote: _“That is, the output of each sub-layer is $\mathrm{LayerNorm}(x + \mathrm{Sublayer}(x))$, where $\mathrm{Sublayer}(x)$ is the function implemented by the sub-layer itself. We apply dropout [(cite)](http://jmlr.org/papers/v15/srivastava14a.html)...

Which version of tensorrtx was this modified from?