Dan Fu

103 comments by Dan Fu

Yes, this code path does not support sequence lengths longer than 8192 yet - will update this issue when the code is updated. For now, the slow version of fftconv...

Can you give more details on the workload you’re using to measure the speedup?

This is something we experience too - it makes it super annoying to debug... We're looking into it, but if anyone has suggestions we'd love to hear them!

CC @jonsaadfalcon Try the V1 models: https://huggingface.co/hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1 Those have seen some legal data during training so hopefully they should work a bit better :) If they still don't work, would...

Yes, both queries and documents use the same protocol and model, there's no extra prompt.

Adding this line to the compiler flags may help:

```
-D_GLIBCXX_USE_CXX11_ABI=$(shell python3 -c "import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI)")
```

In `setup.py`, these can be added as follows:

```python
cxx11_abi = subprocess.check_output(['python', '-c',...
```
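As a rough illustration of the idea above, the flag can also be built inside Python rather than through a shell call. This is only a sketch, not the snippet from the comment: the helper names `cxx11_abi_flag` and `detect_torch_abi_flag` are hypothetical, and converting the boolean to `0`/`1` is an assumption about the form the macro should take.

```python
import importlib


def cxx11_abi_flag(use_cxx11_abi: bool) -> str:
    """Format the preprocessor define from a boolean ABI setting (hypothetical helper)."""
    # Assumption: the macro is passed as 0 or 1, so the boolean is cast to int.
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(use_cxx11_abi)}"


def detect_torch_abi_flag():
    """Return the flag matching the installed PyTorch build, or None if PyTorch is missing."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    # torch._C._GLIBCXX_USE_CXX11_ABI is the same attribute queried in the comment above.
    return cxx11_abi_flag(bool(torch._C._GLIBCXX_USE_CXX11_ABI))


if __name__ == "__main__":
    # Prints the flag for the local PyTorch build, or None without PyTorch.
    print(detect_torch_abi_flag())
```

The returned string could then be appended to whatever list of extra compiler arguments the build script already passes to its extension.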

You should squeeze the kernel once before you pass it in, so the shape is (H, L). Then it should work! (All Hyena experiments in the paper were on a...
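A minimal sketch of the reshaping step, assuming the kernel starts with one leading singleton dimension, i.e. shape (1, H, L). NumPy stands in for the actual tensor library here, and `squeeze_kernel` plus the sizes used are hypothetical.

```python
import numpy as np


def squeeze_kernel(k: np.ndarray) -> np.ndarray:
    """Drop one leading singleton dimension so a (1, H, L) kernel becomes (H, L)."""
    if k.ndim != 3 or k.shape[0] != 1:
        raise ValueError(f"expected shape (1, H, L), got {k.shape}")
    return np.squeeze(k, axis=0)


if __name__ == "__main__":
    H, L = 4, 16  # hypothetical sizes, chosen only for illustration
    kernel = np.zeros((1, H, L))
    print(squeeze_kernel(kernel).shape)
```

Squeezing only axis 0 keeps H and L intact even if either happens to equal 1, which a bare squeeze over all axes would not guarantee.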

Interesting, this is a really great analysis! We also noticed this and have been working on an update to the benchmark (LoCoV1). We haven't put it out yet but will...

+1, would love to see the script @calpt! The scores are a good bit higher than when we ran BM25 internally so would love to see if we did something...

Great, we’ll take a look! CC @jonsaadfalcon