Mohamed Mekkouri
Hi @Hongjie1Chu, I tried running your code with the current transformers and accelerate versions, but I ran into this error:
```
File "~/miniconda3/envs/dev/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 211, in forward
    freqs =...
```
Yes, it's Hopper, thank you!
Hey @AaronZLT! I think this would be complicated but definitely possible, because we'd need to combine `Bnb4BitHfQuantizer` and `Bnb8BitHfQuantizer`. Would you be open to submitting a PR?
cc @SunMarc for review! Thank you!
Thanks @stevhliu for this huge work! I only added a few nits.
Hi @romitjain, I'm working on a PR to make the kernel function mapping easier, so we don't have to use new functions like `lazy_load_mamba_ssm` in the modeling files....
Sure, here is the PR: https://github.com/huggingface/transformers/pull/41577
Yes, I did that for Falcon models here: https://github.com/huggingface/transformers/pull/41664. You can do the same for Bamba models using the same API; however, the mamba-ssm kernel needs to be fixed...
Hi @romitjain, nothing to do on your side for now. I need to refactor the kernel function calls to make them less verbose and less visually intrusive.
Thanks @romitjain! We are currently working on adding a decorator to use function kernels from the Hub. Once it's ready we can integrate it. Will let you know.
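As a rough illustration of the decorator idea above (this is not the actual `kernels` library API, which may look quite different): a class decorator can swap a module's `forward` for a Hub-provided kernel when one is available, and fall back to the pure-Python implementation otherwise. All names here (`use_hub_kernel`, `KERNEL_REGISTRY`, `register_kernel`) are hypothetical, made up for this sketch.

```python
# Hypothetical sketch only -- not the real `kernels` API.
# KERNEL_REGISTRY stands in for kernels fetched from the Hub.
KERNEL_REGISTRY = {}

def register_kernel(name, fn):
    """Pretend a kernel with this name was downloaded from the Hub."""
    KERNEL_REGISTRY[name] = fn

def use_hub_kernel(name):
    """Class decorator: route `forward` through a registered kernel
    if one exists, otherwise keep the original Python fallback."""
    def wrap(cls):
        fallback = cls.forward
        def forward(self, *args, **kwargs):
            kernel = KERNEL_REGISTRY.get(name)
            if kernel is not None:
                return kernel(*args, **kwargs)  # fast kernel path
            return fallback(self, *args, **kwargs)  # pure-Python path
        cls.forward = forward
        return cls
    return wrap

@use_hub_kernel("double")
class Doubler:
    def forward(self, x):
        return x * 2  # fallback implementation

d = Doubler()
print(d.forward(3))  # no kernel registered yet: fallback runs -> 6
register_kernel("double", lambda x: x + x)
print(d.forward(3))  # kernel path now runs -> 6
```

The appeal of this pattern is that modeling code stays free of per-kernel plumbing (like `lazy_load_mamba_ssm`): the decorator is the single place where the kernel lookup happens.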