linear-attention topic
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, and more.
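To make the "RNN that trains like a GPT" claim concrete, here is a minimal, hedged sketch of a WKV-style recurrence in the spirit of RWKV's time mixing (not RWKV's actual operator: the real model uses per-channel decays, a bonus term for the current token, and token shifting). The same quantity can be computed step by step at inference (RNN mode) or unrolled over the whole sequence at training time (parallel mode).

```python
import numpy as np

def wkv_recurrent(w, k, v):
    """Simplified WKV-style recurrence (illustrative, not RWKV's exact code).

    w: scalar time decay > 0; k: keys of shape (T,); v: values of shape (T, d).
    Each output is a decayed, exp(k)-weighted average of past values:
        num_t = e^{-w} * num_{t-1} + e^{k_t} * v_t
        den_t = e^{-w} * den_{t-1} + e^{k_t}
        y_t   = num_t / den_t
    """
    T, d = v.shape
    num, den = np.zeros(d), 0.0
    ys = np.empty((T, d))
    for t in range(T):
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
        ys[t] = num / den  # constant state size, so O(1) memory per step
    return ys

rng = np.random.default_rng(0)
y = wkv_recurrent(w=0.5, k=rng.normal(size=8), v=rng.normal(size=(8, 4)))
print(y.shape)  # (8, 4)
```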
Multi-Attention-Network
Multi-Attention Network (MANet) for semantic segmentation of remote sensing images
MAResU-Net
Multi-stage Attention ResU-Net (MAResU-Net) for semantic segmentation of remote sensing images
autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
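The repo itself is a CUDA kernel; as a language-neutral illustration, below is a hedged NumPy sketch of the standard autoregressive (causal) linear-attention recurrence such kernels implement, in the style of Katharopoulos et al. (2020): a running outer-product state replaces the O(T²) attention matrix, so each token costs O(d_k · d_v). The feature map here is an illustrative shifted ReLU; elu(x)+1 is the common choice.

```python
import numpy as np

def causal_linear_attention(q, k, v, feature=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Autoregressive linear attention via a running state (illustrative sketch).

    q, k: (T, d_k); v: (T, d_v). `feature` is any positive feature map phi.
        S_t = S_{t-1} + phi(k_t) v_t^T      # (d_k, d_v) state
        z_t = z_{t-1} + phi(k_t)            # (d_k,) normalizer
        y_t = (phi(q_t) @ S_t) / (phi(q_t) . z_t)
    """
    T, d_v = v.shape
    d_k = q.shape[1]
    S = np.zeros((d_k, d_v))
    z = np.zeros(d_k)
    ys = np.empty((T, d_v))
    for t in range(T):
        fk, fq = feature(k[t]), feature(q[t])
        S += np.outer(fk, v[t])   # accumulate key-value outer products
        z += fk                   # accumulate keys for normalization
        ys[t] = (fq @ S) / (fq @ z)
    return ys

rng = np.random.default_rng(0)
y = causal_linear_attention(*rng.normal(size=(3, 10, 4)))
print(y.shape)  # (10, 4)
```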
taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
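A hedged sketch of the idea the name suggests: use the second-order Taylor expansion of exp(q·k) as a feature map, phi(x) = [1, x, vec(x xᵀ)/√2], so that phi(q)·phi(k) = 1 + q·k + (q·k)²/2 approximates the softmax kernel while staying linear in sequence length. Function names below are illustrative, not the repo's API.

```python
import numpy as np

def taylor_feature(x):
    """Second-order Taylor feature map: phi(x) = [1, x, vec(x x^T)/sqrt(2)].

    Then phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2, the 2nd-order Taylor
    approximation of exp(q.k), usable inside linear attention.
    """
    outer = np.outer(x, x).ravel() / np.sqrt(2.0)
    return np.concatenate(([1.0], x, outer))

rng = np.random.default_rng(0)
q, k = rng.normal(size=4), rng.normal(size=4)
qk = q @ k
approx = taylor_feature(q) @ taylor_feature(k)
print(np.isclose(approx, 1 + qk + qk**2 / 2))  # True: identity holds exactly
print(np.exp(qk), approx)  # closeness to exp(q.k) degrades as |q.k| grows
```

The feature dimension grows quadratically in d, so this trades head dimension for sequence-length scaling.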
agent-attention-pytorch
Implementation of Agent Attention in Pytorch
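A hedged sketch of the agent-attention idea (Han et al., 2023) as described in the paper's abstract: a small set of n agent tokens mediates attention in two softmax steps, softmax(Q Aᵀ) · softmax(A Kᵀ) · V, bringing cost from O(T²) down to O(T·n). In the paper the agents are typically pooled from the queries; here they are passed in directly as a simplification, and all names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(q, k, v, agents):
    """Agent attention (illustrative): two softmax attentions via agent tokens.

    q, k: (T, d); v: (T, d_v); agents: (n, d) with n << T. Cost is O(T * n).
    """
    d = q.shape[1]
    agg = softmax(agents @ k.T / np.sqrt(d)) @ v    # (n, d_v): agents attend to tokens
    return softmax(q @ agents.T / np.sqrt(d)) @ agg # (T, d_v): tokens attend to agents

rng = np.random.default_rng(0)
T, n, d = 16, 4, 8
out = agent_attention(rng.normal(size=(T, d)), rng.normal(size=(T, d)),
                      rng.normal(size=(T, d)), rng.normal(size=(n, d)))
print(out.shape)  # (16, 8)
```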
heinsen_attention
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
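A heavily hedged sketch of the flavor of constant-cost-per-token attention, not necessarily the paper's exact equations: with a similarity that factorizes over queries and keys (here, exp(q)·exp(k) with elementwise exp), the numerator and denominator become running sums that can be maintained in log space for numerical stability, so each new token costs the same regardless of context length. The function name is hypothetical, and v > 0 is assumed for simplicity (signed values would need separate positive/negative accumulators).

```python
import numpy as np
from scipy.special import logsumexp

def constant_cost_attention(q, k, v):
    """Streaming attention with constant cost per token (illustrative sketch).

    q, k: (T, d); v: (T, d_v), all entries of v assumed positive.
    Running log-space sums:
        log_num = log( sum_j exp(k_j) v_j^T )   # shape (d, d_v)
        log_den = log( sum_j exp(k_j) )         # shape (d,)
    Output: (exp(q_t)^T num) / (exp(q_t)^T den), computed via logsumexp.
    """
    T, d = q.shape
    d_v = v.shape[1]
    log_num = np.full((d, d_v), -np.inf)
    log_den = np.full(d, -np.inf)
    ys = np.empty((T, d_v))
    for t in range(T):
        log_num = np.logaddexp(log_num, k[t][:, None] + np.log(v[t])[None, :])
        log_den = np.logaddexp(log_den, k[t])
        ys[t] = np.exp(logsumexp(q[t][:, None] + log_num, axis=0)
                       - logsumexp(q[t] + log_den))
    return ys

rng = np.random.default_rng(0)
y = constant_cost_attention(rng.normal(size=(6, 4)), rng.normal(size=(6, 4)),
                            rng.uniform(0.1, 1.0, size=(6, 3)))
print(y.shape)  # (6, 3)
```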
CARE-Transformer
CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction