Dan Fu comments

Results: 103 comments by Dan Fu

RuntimeError: Expected fft_size >= 16 && fft_size <= 16384

Yes, this code path does not support sequence lengths longer than 8192 yet. We will update this issue when the code is updated. For now, the slow version of fftconv...
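The 8192 limit in this reply and the `fft_size <= 16384` check in the error message line up if the FFT size is twice the padded sequence length. Below is a minimal sketch of that relationship. It assumes (this is not confirmed in the thread) that the FFT size is twice the sequence length rounded up to a power of two, with a floor of 16. The names `fft_size_for` and `fast_path_supported` are hypothetical and are not part of the library.

```python
# Bounds taken from the error message: "Expected fft_size >= 16 && fft_size <= 16384".
MIN_FFT_SIZE = 16
MAX_FFT_SIZE = 16384


def fft_size_for(seqlen: int) -> int:
    """Return the assumed FFT size for a sequence length.

    Assumption: the kernel zero-pads to twice the next power of two of
    `seqlen`, and never goes below MIN_FFT_SIZE.
    """
    if seqlen < 1:
        raise ValueError("seqlen must be positive")
    next_pow2 = 1 << (seqlen - 1).bit_length()
    return max(2 * next_pow2, MIN_FFT_SIZE)


def fast_path_supported(seqlen: int) -> bool:
    """Check whether the assumed FFT size falls inside the reported bounds."""
    return MIN_FFT_SIZE <= fft_size_for(seqlen) <= MAX_FFT_SIZE


if __name__ == "__main__":
    for n in (1024, 8192, 8193, 32768):
        print(n, fft_size_for(n), fast_path_supported(n))
```

Under this assumption, a check like `fast_path_supported(seqlen)` could be used to decide up front whether to fall back to the slow fftconv path instead of hitting the runtime error.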

RuntimeError: Expected fft_size >= 16 && fft_size <= 16384

Can you give more details on the workload you’re using to measure the speedup?

Long latency when loading fft_conv kernel for the first time

This is something we experience too - it makes it super annoying to debug... We're looking into it, but if anyone has suggestions we'd love to hear them!

I am not getting relevant results with m2-bert-80M-32k-retrieval

CC @jonsaadfalcon. Try the V1 models: https://huggingface.co/hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1. Those have seen some legal data during training, so hopefully they should work a bit better :) If they still don't work, would...

I am not getting relevant results with m2-bert-80M-32k-retrieval

Yes, both queries and documents use the same protocol and model, there's no extra prompt.

New binaries release needed for PyTorch 2.7.0 (torch2.7.0cu128 / torch2.6.0cu126 + flash_attn-2.7.4.post1 seem broken because PyTorch changed ABI)

Adding this line to the compiler flags may help:
```
-D_GLIBCXX_USE_CXX11_ABI=$(shell python3 -c "import torch; print(int(torch._C._GLIBCXX_USE_CXX11_ABI))")
```
In `setup.py`, these can be added as follows:
```python
cxx11_abi = subprocess.check_output(['python', '-c',...
```
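As an illustration of the idea in this comment, here is a minimal sketch of computing the ABI flag in Python so it could be appended to an extension's compiler arguments. It is not the truncated `setup.py` snippet from the comment. The helper names `abi_flag` and `detect_torch_abi` are hypothetical. It reads `torch._C._GLIBCXX_USE_CXX11_ABI` in-process (the attribute quoted in the comment) and converts the boolean to `0` or `1`, since the macro is expected to be numeric.

```python
def abi_flag(use_cxx11_abi: bool) -> str:
    """Build the compiler define matching the given ABI setting."""
    # int() turns True/False into 1/0, which is what the macro expects.
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(use_cxx11_abi)}"


def detect_torch_abi() -> bool:
    """Read the ABI setting of the installed PyTorch build."""
    import torch  # imported lazily so abi_flag() works without PyTorch

    return bool(torch._C._GLIBCXX_USE_CXX11_ABI)


if __name__ == "__main__":
    try:
        print(abi_flag(detect_torch_abi()))
    except ImportError:
        print("PyTorch is not installed; cannot detect the ABI setting")
```

In a `setup.py`, the resulting string could be added to the C++ and NVCC flag lists of the extension; where exactly those lists live depends on the project's build script.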

Training Hyena-based Models with FlashFFTConv + Safari

LoCo Benchmark - BM25 & Insights
