Thirvin
Results
I trained Infini-llama on arXiv papers. My results are similar to yours: it can't handle the attention compressed into memory, and its outputs bear little relation to the content I provided.