Songlin Yang

Results: 36 issues by Songlin Yang

I found log-bmm very useful for linear-chain CRFs, since it saves memory and speeds things up, while in context-free grammars the A->BC rule requires large amounts of GPU memory, which is even more serious. So it...
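For context, the operation in question can be written as a naive log-semiring batched matmul in plain PyTorch (a minimal sketch; the function name and shapes are illustrative, not the library's API). The broadcasted intermediate it materializes is exactly the memory blow-up a fused log-bmm kernel avoids:

```python
import torch

def log_bmm(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """Batched matmul in the log semiring:
    out[b, i, j] = logsumexp_k(A[b, i, k] + B[b, k, j]).

    A: (batch, m, k) log-potentials, B: (batch, k, n) log-potentials.
    """
    # Broadcasting creates a (batch, m, k, n) tensor, which is the
    # memory bottleneck for CFG rules A->BC with large nonterminal sets.
    return torch.logsumexp(A.unsqueeze(-1) + B.unsqueeze(-3), dim=-2)
```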

Hi, thanks for your great work! I am trying to run the `hg38/hg38_hyena_seqlen_warmup_reload.yaml` experiment and got the following error message: I did some initial searching on this issue and found [this](https://github.com/HazyResearch/hyena-dna/issues/31). I...

### Checklist - [x] I have checked [FAQs](https://github.com/fla-org/flash-linear-attention/blob/main/FAQs.md) and existing issues for similar problems - [x] My GPU is an H100 and I have installed the `triton-nightly` built by the fla team, and...

bug

### Checklist - [x] I have checked [FAQs](https://github.com/fla-org/flash-linear-attention/blob/main/FAQs.md) and existing issues for similar problems - [x] My GPU is an H100 and I have installed the `triton-nightly` built by the fla team, and...

bug

### Feature Request Currently, FLA contains several suboptimal autotuning settings. We should avoid performing a grid search over the full Cartesian product space, as it is inefficient and often unnecessary. ###...

enhancement
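As a hedged illustration of the autotuning request above (the kernel, block sizes, and key names are made up for this sketch, not FLA's actual settings), one alternative to searching the full Cartesian product is to hand-curate a short list of known-good `triton.Config` entries:

```python
import triton
import triton.language as tl

# A small, hand-curated config list instead of itertools.product over
# every block size / warp count / stage count (values are illustrative).
_CONFIGS = [
    triton.Config({'BLOCK_M': 64,  'BLOCK_N': 64},  num_warps=4, num_stages=2),
    triton.Config({'BLOCK_M': 128, 'BLOCK_N': 64},  num_warps=8, num_stages=3),
    triton.Config({'BLOCK_M': 64,  'BLOCK_N': 128}, num_warps=8, num_stages=3),
]

@triton.autotune(configs=_CONFIGS, key=['M', 'N'])
@triton.jit
def copy_kernel(x_ptr, y_ptr, M, N,
                BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr):
    # Toy tile-copy body, only here to make the decorator example complete.
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    mask = (offs_m[:, None] < M) & (offs_n[None, :] < N)
    idx = offs_m[:, None] * N + offs_n[None, :]
    tl.store(y_ptr + idx, tl.load(x_ptr + idx, mask=mask), mask=mask)
```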

### Proposal Mamba2's GVA is useful ### Rationale _No response_

enhancement

### Proposal RWKV-6 and RWKV-7 currently do not support varlen training. We aim to develop varlen token shift kernels to enable this functionality. - [x] RWKV-6 and RWKV-7 varlen training...

enhancement
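For reference, a minimal non-kernel PyTorch sketch of what the varlen token shift proposed above has to do: shift every token right by one position within its own sequence and never let the last token of one packed sequence leak into the next. The `cu_seqlens` convention follows the usual flash-attention varlen layout; this is only a sketch, not FLA's actual kernel:

```python
import torch

def token_shift_varlen(x: torch.Tensor, cu_seqlens: torch.Tensor) -> torch.Tensor:
    """Right-shift tokens by one position within each packed sequence.

    x:          (total_tokens, hidden), all sequences packed along dim 0.
    cu_seqlens: (num_seqs + 1,) cumulative lengths, e.g. [0, 5, 12, 20].
    Returns a tensor where position i holds x[i - 1], and the first token of
    every sequence is zeros, so no state crosses a sequence boundary.
    """
    shifted = torch.zeros_like(x)
    shifted[1:] = x[:-1]            # naive shift over the packed dimension
    shifted[cu_seqlens[:-1]] = 0.0  # reset at each sequence start
    return shifted
```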

### Proposal The Hugging Face implementation lacks support for many features. It would be more convenient to integrate the Mamba models into the FLA ecosystem to enable functionality such as inference, varlen training,...

enhancement

### Proposal Use each model's official initialization instead of a unified initialization ### Rationale Related issues: https://github.com/fla-org/flash-linear-attention/issues/220 , https://github.com/fla-org/flash-linear-attention/issues/266

enhancement
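A hedged sketch of one way to realize the per-model initialization proposal above (the registry, family names, and numbers are hypothetical, not the project's actual code): register each family's published scheme and fall back to the current unified one:

```python
import torch.nn as nn

# Hypothetical registry mapping a model family to its official init scheme.
OFFICIAL_INITS = {}

def register_init(name):
    def wrap(fn):
        OFFICIAL_INITS[name] = fn
        return fn
    return wrap

@register_init("family_a")
def _init_family_a(module: nn.Module) -> None:
    # Placeholder for one model's published scheme (values illustrative).
    if isinstance(module, nn.Linear):
        nn.init.xavier_normal_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

@register_init("default")
def _init_default(module: nn.Module) -> None:
    # The current unified scheme, kept as the fallback.
    if isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

def init_weights(model: nn.Module, family: str) -> None:
    init_fn = OFFICIAL_INITS.get(family, OFFICIAL_INITS["default"])
    model.apply(init_fn)
```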

### Proposal as title ### Rationale _No response_

enhancement