Issues by Bryan Li (2 results)
I have two questions about the key and value calculation in `Attention` (and similarly for `KNNAttention`). The relevant line is: https://github.com/lucidrains/memorizing-transformers-pytorch/blob/83fa1479d6f7881dd977fbff55681e709e3b250e/memorizing_transformers_pytorch/memorizing_transformers_pytorch.py#L135 1. Why is there only one Linear layer `to_kv`,...
I have a somewhat silly use case. I'm running `retriever.retrieve` with a `queries` dict that has only one entry. However, this causes an IndexError in PyTorch due to how PyTorch indexes...