
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"

4 issues, sorted by recently updated

Suppose we have an LLM that was pretrained with quadratic attention, and we want to extend its context size or improve its performance. For this purpose, we only swap the attention...

The Based architecture seems to have been updated (https://arxiv.org/abs/2402.18668). Any insights into how it compares with ReBased?

Hello! The concept is awesome, and it would be nice to integrate it into the huggingface/transformers library. However, to ensure that everything works correctly and matches the paper's results, we...

Hi, I read your paper and found the following confusing. When you describe the ablations that culminate in ReBased, the sequence starts with > x^2 – substituting the original kernel function...
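For readers following the x^2 ablation mentioned above: the idea is to replace the exponential similarity of softmax attention with the squared dot product, sim(q, k) = (q . k)^2, which is non-negative and admits a linear-time feature-map form. Below is a minimal, hypothetical sketch of the quadratic-form (O(T^2)) causal version using NumPy, for intuition only — it is not the repository's implementation, and the function name and epsilon are illustrative assumptions.

```python
import numpy as np

def squared_kernel_attention(q, k, v, eps=1e-6):
    """Causal attention with the squared-dot-product kernel:
    sim(q_t, k_s) = (q_t . k_s)^2 for s <= t.

    Shapes: q, k -> (T, d); v -> (T, d_v). Returns (T, d_v).
    Illustrative sketch only; names and eps are assumptions.
    """
    T = q.shape[0]
    scores = (q @ k.T) ** 2                # (T, T) non-negative similarities
    scores *= np.tril(np.ones((T, T)))     # causal mask: only s <= t contribute
    # Normalize each row so the outputs are convex combinations of values.
    weights = scores / (scores.sum(axis=1, keepdims=True) + eps)
    return weights @ v
```

In practice the same kernel is evaluated in linear time by carrying a running sum of outer-product features of the keys, which is what makes this family attractive as an in-context model; the quadratic form above is just the easiest way to see what the kernel computes.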