annotated_deep_learning_paper_implementations

Bug in Transformer-XL shift method

Open: Bearnardd opened this issue 2 years ago • 1 comment

Hi! The original paper implementation keeps rows [1:] of the padded tensor: x = x_padded[1:].view_as(x) (their code). Your implementation keeps rows [:-1] instead: x = x_padded[:-1].view_as(x) (your code), which produces the wrong output matrix.
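To make the difference concrete, here is a minimal sketch of the pad-and-reshape shift on a small matrix. It uses plain Python lists in place of tensors so it runs without PyTorch, and it assumes the zero column is prepended before the reshape; the padding side in either repository is an assumption here, not something stated in this issue. The names reshape, rel_shift, and keep_tail are hypothetical helpers for illustration only.

```python
def reshape(flat, rows, cols):
    """Row-major reshape of a flat list, mimicking Tensor.view."""
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def rel_shift(x, keep_tail):
    """Pad-and-reshape shift on a q x k matrix given as a list of rows.

    keep_tail=True  mimics x_padded[1:].view_as(x)
    keep_tail=False mimics x_padded[:-1].view_as(x)
    Assumption: the zero column is prepended (not appended).
    """
    q, k = len(x), len(x[0])
    padded = [[0] + row for row in x]                 # q x (k + 1)
    flat = [v for row in padded for v in row]
    viewed = reshape(flat, k + 1, q)                  # (k + 1) x q
    kept = viewed[1:] if keep_tail else viewed[:-1]   # k x q
    flat_kept = [v for row in kept for v in row]
    return reshape(flat_kept, q, k)                   # back to q x k


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

shifted_tail = rel_shift(x, keep_tail=True)
shifted_head = rel_shift(x, keep_tail=False)

print(shifted_tail)  # [[3, 0, 4], [5, 6, 0], [7, 8, 9]]
print(shifted_head)  # [[0, 1, 2], [3, 0, 4], [5, 6, 0]]
```

Under this assumption, the [1:] slice shifts row i left by (q - 1 - i) positions, so the last row is unchanged, while the [:-1] slice yields a different matrix. If the zero column is appended instead, the comparison changes, so the padding side should be checked alongside the slice.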

Bearnardd, May 16 '23 13:05

The line from typing import Optional, List is wrong.

hacihasanzade, Jul 15 '23 13:07