Serkan Sulun

Results 9 comments of Serkan Sulun

I made some other changes before, so the code isn't very clean, but I'll clean it up if I can find some time after I have a final version for myself....

I have fixed it, but I also switched to PyTorch in the meantime; it is available in my repositories if anyone needs it.

Does this directly replace the `sinusoid` function?

Looks like the answer is yes, but it doesn't matter much since it's only run once, before training begins.

Bump. Can someone explain the usage of `self.E = torch.randn([self.max_seq, int(self.dh)], requires_grad=False)` while calculating relative attention? Also, this tensor isn't registered, so it prevents reproducibility when the model is reloaded.
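A minimal sketch of how the tensor could be registered so it survives a save/reload cycle (the module and attribute names here are hypothetical, chosen only for illustration): `register_buffer` stores a non-trainable tensor in the module's `state_dict`, so reloading restores the same values instead of drawing fresh random ones.

```python
import torch
import torch.nn as nn

class RelativeAttention(nn.Module):  # hypothetical module for illustration
    def __init__(self, max_seq=2048, dh=64):
        super().__init__()
        # register_buffer saves E in state_dict without making it trainable,
        # so the same random tensor is restored when the model is reloaded
        self.register_buffer('E', torch.randn(max_seq, int(dh)))

model = RelativeAttention()
assert 'E' in model.state_dict()   # saved alongside the parameters
assert not model.E.requires_grad   # still excluded from gradient updates
```

With a plain attribute assignment, `E` is absent from `state_dict`, so a reloaded model silently uses a different random tensor.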

Dear Luca, Can you release the PyTorch version? I'd be happy to work on it.

https://github.com/kimiyoung/transformer-xl/issues/8#issuecomment-455187360

> For position embedding, the two columns are equivalent, simply because they are consumed by the matrix multiplication, which is permutation-invariant.
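The quoted claim can be checked numerically: reordering the columns of the position embedding while applying the matching reorder to the rows of the consuming weight matrix leaves the matrix product unchanged. A small sketch (the shapes are arbitrary, not taken from the repository):

```python
import torch

torch.manual_seed(0)
emb = torch.randn(10, 8)   # position embedding (seq_len x d_model)
w = torch.randn(8, 16)     # weight matrix that consumes the embedding

perm = torch.randperm(8)   # any column permutation (e.g. interleaved vs split sin/cos)
out_orig = emb @ w
out_perm = emb[:, perm] @ w[perm, :]  # permute columns and matching weight rows

print(torch.allclose(out_orig, out_perm))  # prints True
```

So as long as the weights are learned, either column layout of the sinusoid embedding yields an equivalent model.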

```python
import h5py

path = 'Emotion6 video dataset/multi_train_data.mat'
with h5py.File(path, 'r') as file:
    data = file['train_data'][:]
```

The resulting array has a shape of `(4096, 30, 2400)`. It looks like processed...