heretic1993

1 issue by heretic1993

It seems that the transformer implemented here is just a binary classification model, like the other models. The implementation doesn't seem to follow the original paper, in which positional encoding is added to the input embeddings.
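For reference, here is a minimal sketch of the positional encoding in question, assuming "the original paper" means "Attention Is All You Need" and its fixed sinusoidal scheme: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). The function names `sinusoidal_positional_encoding` and `add_positional_encoding` are hypothetical and not taken from this repository; the code uses plain Python lists so it does not assume any particular framework.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Build a seq_len x d_model table of sinusoidal positional encodings.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        # even_idx plays the role of 2i in the formula above.
        for even_idx in range(0, d_model, 2):
            angle = pos / (10000 ** (even_idx / d_model))
            row[even_idx] = math.sin(angle)      # even dimensions use sine
            row[even_idx + 1] = math.cos(angle)  # odd dimensions use cosine
        table.append(row)
    return table


def add_positional_encoding(embeddings):
    """Add the encoding element-wise to a list of token embedding vectors."""
    seq_len = len(embeddings)
    d_model = len(embeddings[0])
    table = sinusoidal_positional_encoding(seq_len, d_model)
    return [
        [e + p for e, p in zip(emb_row, pe_row)]
        for emb_row, pe_row in zip(embeddings, table)
    ]
```

Without a step like this (or a learned position embedding), self-attention has no built-in way to tell token order apart, which is presumably why its absence stands out.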