🚀 DeepSeek-v3-0324-8bit

Open s0oo opened this issue 1 year ago • 7 comments

The MLX community just dropped the 8-bit quantized version of DeepSeek-V3-0324! 👉 https://huggingface.co/mlx-community/DeepSeek-v3-0324-8bit Need urgent support from the pros! 🔧

s0oo avatar Apr 03 '25 14:04 s0oo

Correct me if I'm wrong, but I think you can just edit exo/models.py. In model_cards, add "deepseek-v3-0324-8bit": { "layers": 61, "repo": { "MLXDynamicShardInferenceEngine": "mlx-community/DeepSeek-v3-0324-8bit", }, }, and in pretty_name, add "deepseek-v3-0324-8bit": "Deepseek V3-0324 (8-bit)",

If the model supports MLXDynamicShardInferenceEngine, it should work directly.

Try it, and if it works, open a pull request to add it.
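
Laid out as code, the suggested edit might look roughly like the sketch below. The two empty dicts are stand-ins for the existing model_cards and pretty_name in exo/models.py (an assumption: the real file already holds many entries, which are not reproduced here), and MODEL_ID is a helper name introduced only for illustration. The values come from the comment above and are not confirmed to work.

```python
# Stand-ins for the dicts that exo/models.py is assumed to define already.
# In the real file you would add the new entries next to the existing ones.
model_cards = {}
pretty_name = {}

# Hypothetical helper name, used here so both dicts share the same key.
MODEL_ID = "deepseek-v3-0324-8bit"

# Entry for model_cards: layer count and the repo used by the MLX engine.
model_cards[MODEL_ID] = {
    "layers": 61,
    "repo": {
        "MLXDynamicShardInferenceEngine": "mlx-community/DeepSeek-v3-0324-8bit",
    },
}

# Matching display name for pretty_name.
pretty_name[MODEL_ID] = "Deepseek V3-0324 (8-bit)"

# Sanity check: the model ID must appear in both dicts.
missing = [key for key in model_cards if key not in pretty_name]
print("missing pretty names:", missing)
```

Using one shared key for both dicts avoids a typo leaving the model registered in one place but not the other.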

TanishkBansode avatar Apr 04 '25 10:04 TanishkBansode

I used two Mac Studios with Apple M3 Ultra chips, each with 512GB of unified memory (1TB of unified memory in total). I also modified exo/models.py, just as you recommended. Both Macs are running, and each one is using about 380GB of memory. However, an error occurred during inference. I suspect that modifying only exo/models.py is not enough. There may be other parts of the code that also need to be changed, but I'm not able to handle that part.

Image

s0oo avatar Apr 04 '25 11:04 s0oo

There's already an issue for this error (#799), though it's not solved yet 😔. It's related to the bf16 tensor type; even the model already in models.py isn't working for him. Can you try running the model mentioned in that issue?

TanishkBansode avatar Apr 04 '25 12:04 TanishkBansode

I tested the model from #799 as well, and it also gave an error, but the error was different.

Image

s0oo avatar Apr 06 '25 01:04 s0oo

When did you last git pull the repo? If it's old, see this commit. They removed that statement.

TanishkBansode avatar Apr 06 '25 01:04 TanishkBansode

I downloaded the version a few days ago, but I just compared it with the latest version, and there haven't been any content changes recently.

s0oo avatar Apr 06 '25 07:04 s0oo

I downloaded it and changed models.py on both machines. Currently it's stuck on "Checking download status..." and won't give me an error or load the model into memory.

analaboratory avatar Apr 13 '25 08:04 analaboratory