**DeepSeek-v3-0324-8bit**
The MLX community just dropped the **8-bit quantized version** of DeepSeek-V3-0324! https://huggingface.co/mlx-community/DeepSeek-v3-0324-8bit
Need urgent support from the pros!
Correct me if I am wrong, but I guess you can just edit exo/models.py and add it there.
add "deepseek-v3-0324-8bit": { "layers": 61, "repo": { "MLXDynamicShardInferenceEngine": "mlx-community/DeepSeek-v3-0324-8bit", }, }, in the model_cards
and add "deepseek-v3-0324-8bit": "Deepseek V3-0324 (8-bit)", in pretty_name
If the model supports MLXDynamicShardInferenceEngine, it should work directly.
Try it; if it works, open a pull request to add it.
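For the test, something like `exo run deepseek-v3-0324-8bit --prompt "hello"` should exercise the new entry (assuming the `exo run` subcommand from exo's README; the model key is the one added above, and exact flags may differ across versions).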
I used two Mac Studios with Apple M3 Ultra chips, each with 512GB of unified memory (1TB in total).
I also modified exo/models.py, just as you recommended.
Both Macs are running, and each one is using about 380GB of memory.
However, an error occurred during inference. I suspect that modifying only exo/models.py is not enough; there may be other parts of the code that also need to be changed, but I'm not able to handle that part.
There's already an issue for this error (#799), though it's still unsolved! It's related to tensor dtype bf16; even the model in models.py isn't working for him. Can you try running the model mentioned in the issue?
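If you want to confirm whether bf16 weights are what trips it, here is a small sketch that lists the tensor dtypes stored in a downloaded .safetensors shard (the path is a placeholder; the 8-byte little-endian header-length prefix and the JSON header are part of the safetensors file format):

```python
# Sketch: read a .safetensors header and report the tensor dtypes, to see
# whether a shard actually stores BF16 weights (the dtype implicated in #799).
import json
import struct

def safetensors_dtypes(path):
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]  # u64 LE header size
        header = json.loads(f.read(header_len))
    # Header maps tensor name -> {"dtype", "shape", "data_offsets"}; skip metadata.
    return {k: v["dtype"] for k, v in header.items() if k != "__metadata__"}

# Placeholder path -- point this at a shard inside your downloaded snapshot.
dtypes = safetensors_dtypes("path/to/model-shard.safetensors")
print(sorted(set(dtypes.values())))  # e.g. ['BF16', 'F32', 'U32'] if bf16 is present
```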
I tested the model from #799 as well, and it also gave an error, but the error was different.
When did you git pull the repo? If it's old, see this commit; they removed that statement.
I downloaded the repo a few days ago, but I just compared it with the latest version, and there haven't been any content changes recently.
I downloaded it and changed models.py on both machines. Currently it's stuck on "Checking download status..." and won't give me an error or load the model into memory.
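In case it helps narrow this down: exo's README documents a DEBUG environment variable (0-9), so starting the nodes with `DEBUG=9 exo` should log where the download-status check is hanging (assuming your checkout still supports it; logging behavior may differ across versions).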