Results: 2 comments of Pankaj Mathur
Same boat here. I tried testing both versions of flash attention individually using the monkey-patching code from FastChat (https://github.com/lm-sys/FastChat/blob/main/fastchat/train/llama_flash_attn_monkey_patch.py), but got stuck on a similar error to the one @ehartford reported...
Yup, I tried that already; it throws the same error for Llama 2 70B. It's the same monkey-patch code from FastChat that I tried to integrate.
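For reference, a minimal sketch of how that FastChat monkey patch is typically wired in, assuming the module still exposes `replace_llama_attn_with_flash_attn()` and that `fastchat`, `flash-attn`, and `transformers` are installed; the model path below is just an illustration, and the key point is that the patch has to run before the model is instantiated:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Patch LlamaAttention.forward with the flash-attention implementation.
# This must happen *before* from_pretrained(), otherwise the unpatched
# attention forward is already bound to the loaded model.
from fastchat.train.llama_flash_attn_monkey_patch import (
    replace_llama_attn_with_flash_attn,
)

replace_llama_attn_with_flash_attn()

model_name = "meta-llama/Llama-2-70b-hf"  # illustrative model path

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # flash attention expects fp16/bf16 inputs
    device_map="auto",
)
```

Whether this works for the 70B model in particular will still depend on the versions of transformers, flash-attn, and the patch itself, so the sketch only shows the intended call order, not a fix for the reported error.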