Dexter Ju

Results: 1 issue by Dexter Ju

A heads-up for anyone who might be using the Transformer encoder. The Transformer encoder layer's forward pass used to be: ``` tensor = tensor + self.dropout(self.attention(tensor, mask=mask)) tensor = _normalize(tensor, self.norm1)...

donotreap