
Run Mixtral-8x7B models in Colab or on consumer desktops

29 mixtral-offloading issues

Instead of manually iterating through each key in `del_keys` to delete them from the `meta` dictionary, use the `pop()` method to remove these keys if they exist. The `pop()` method...
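A minimal sketch of the suggested change (`meta` and `del_keys` are the names from the issue; the dictionary contents here are made up for illustration):

```python
meta = {"quant": "hqq", "scale": 0.1, "zero": 0.0}  # illustrative contents
del_keys = ["scale", "zero", "missing_key"]

# dict.pop() with a default removes the key if present and is a no-op
# otherwise, so no membership check or manual `del` is needed:
for key in del_keys:
    meta.pop(key, None)
```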

Hi, have you managed to make it work on a T4 Colab? P.S. It crashes multiple times even with `offload_per_layer = 5`, as mentioned in the comment.
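For context, `offload_per_layer` controls how many of each MoE layer's experts live in CPU RAM instead of VRAM; raising it frees GPU memory at the cost of slower decoding. A sketch in the style of the repo's demo notebook; the import path and `OffloadConfig` field names are assumed from that notebook and may not match the current code:

```python
from src.build_model import OffloadConfig  # path as in the repo's demo notebook (assumed)

offload_per_layer = 5        # value from this issue; the notebook default is lower
num_experts_per_layer = 8    # Mixtral-8x7B uses 8 experts per MoE layer
num_hidden_layers = 32       # Mixtral-8x7B has 32 transformer layers

offload_config = OffloadConfig(
    # experts kept resident on the GPU across all layers
    main_size=num_hidden_layers * (num_experts_per_layer - offload_per_layer),
    # experts parked in CPU RAM and fetched on demand
    offload_size=num_hidden_layers * offload_per_layer,
    buffer_size=4,
    offload_per_layer=offload_per_layer,
)
```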

Using exl2 at 2.4 bpw you can run Mixtral on Colab; did you give it a try?
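For anyone who wants to try that route, a rough sketch of loading a 2.4-bpw exl2 quant with ExLlamaV2, based on that library's own examples (check the exllamav2 repo for the current API; the model directory name is an assumption):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Local directory holding a downloaded 2.4-bpw quant,
# e.g. the 2.4bpw branch of turboderp/Mixtral-8x7B-instruct-exl2
config = ExLlamaV2Config()
config.model_dir = "Mixtral-8x7B-instruct-exl2-2.4bpw"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate cache as layers load
model.load_autosplit(cache)               # split weights across available VRAM

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
print(generator.generate_simple("Mixture-of-experts models are", settings, 64))
```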

I'm a bit lost among the different quantization approaches: GGUF, ExLlamaV2, and this project. Are they the same thing? Is one approach faster? GGUF: [TheBloke/Mixtral-8x7B-v0.1-GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF) ExLlamaV2: [turboderp/Mixtral-8x7B-instruct-exl2](https://huggingface.co/turboderp/Mixtral-8x7B-instruct-exl2)
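They are different quantization formats tied to different runtimes (llama.cpp for GGUF, ExLlamaV2 for exl2, and this repo's own HQQ-based loader), so speed depends on the backend and your hardware rather than the format alone. As a point of comparison, loading the GGUF quant with llama-cpp-python looks roughly like this (a sketch; the file name is one of TheBloke's published quants):

```python
from llama_cpp import Llama

# n_gpu_layers controls how many layers are offloaded to the GPU
llm = Llama(
    model_path="mixtral-8x7b-v0.1.Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=20,  # tune to your VRAM; -1 offloads everything
)
out = llm("Mixture-of-experts models are", max_tokens=64)
print(out["choices"][0]["text"])
```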

This PR adds a small CLI interface to the repository, which makes local usage easy.
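The PR's actual code isn't shown in this preview; a minimal argparse sketch of what such a CLI might look like (all flag names here are hypothetical, not the PR's interface):

```python
import argparse

def main() -> None:
    # Hypothetical flags; the PR's real interface may differ.
    parser = argparse.ArgumentParser(description="Run Mixtral-8x7B with expert offloading")
    parser.add_argument("prompt", help="prompt to complete")
    parser.add_argument("--offload-per-layer", type=int, default=4,
                        help="experts per MoE layer to keep on CPU")
    parser.add_argument("--max-new-tokens", type=int, default=256)
    args = parser.parse_args()
    # Placeholder: a real CLI would build the model and generate here.
    print(f"Would generate {args.max_new_tokens} tokens for: {args.prompt!r}")

if __name__ == "__main__":
    main()
```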

Dear Mixtral Offloading Contributors, I hope this message finds you well. I have been thoroughly engrossed in the intricacies of your project and commend the strides you have made in...

Hi there, just wondering: is it possible to fine-tune this model on a custom dataset? If so, are there any examples/code? Many thanks for any help, and for this...
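No fine-tuning example appears in this repo's preview; the generic route would be QLoRA via transformers + peft, which bypasses the offloading engine and assumes a GPU that can hold the 4-bit base weights (a sketch of that route, not this project's API):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train low-rank adapters on the attention projections; base weights stay frozen.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```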

Is it possible to use `mixtral-offloading` with `llamaindex` to build a RAG pipeline? If so, do you have an example?
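LlamaIndex can wrap any local generator through its `CustomLLM` base class, so one plausible integration is to route `complete()` into this repo's generation loop. A sketch following the LlamaIndex custom-LLM docs; `run_mixtral` is a hypothetical stand-in for mixtral-offloading's actual generation call:

```python
from typing import Any
from llama_index.core.llms import (
    CustomLLM, CompletionResponse, CompletionResponseGen, LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback

def run_mixtral(prompt: str) -> str:
    """Hypothetical bridge into mixtral-offloading's generation loop."""
    raise NotImplementedError

class OffloadedMixtral(CustomLLM):
    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(context_window=4096, num_output=256,
                           model_name="mixtral-offloading")

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        return CompletionResponse(text=run_mixtral(prompt))

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Non-streaming fallback: yield the full completion once.
        yield CompletionResponse(text=run_mixtral(prompt))
```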

Thanks for your contributions. I would like to know whether it can be deployed across multiple GPUs to make use of more VRAM.
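Whether this repo's offloading engine supports multiple GPUs is exactly the open question here; for comparison, the stock multi-GPU route in the ecosystem is accelerate's `device_map="auto"`, which shards layers across all visible GPUs (a sketch with plain transformers, not this repo's loader):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # accelerate shards layers across every visible GPU
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
```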