rltakashige comments

Results: 30 comments of rltakashige

how to load model in exo

Probably quite a bit easier in EXO 1.0 - just select it from the dropdown.

Two M3 Ultra are running the model qwen3-coder-480b-a35b-8bit. An error is thrown when using VS Code Client.

I've also seen this issue when running DeepSeek V3.1 in pipeline parallel (it does not happen in tensor parallel). Have not encountered this in Qwen Coder, but do want to...

Network discovery over VPN

https://github.com/exo-explore/exo/issues/879#issuecomment-3670942858

exo labs seems to be suffering from a possible memory leak

Seems to be more of an issue when running transformers. This is noted. Please re-raise if this is still an issue.

[MEDIUM] Fix Local Network permissions in MacOS App

Thanks for the quick response. However, we've mainly been testing on macOS 26.2, which shouldn't have this issue. Unless it's a regression?

[HARD] Support arbitrary tensor parallel splits

Correct me if I misunderstood your reply - a model is composed of multiple transformer blocks (as well as some other modules we don't care too much about). In pipeline...
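To illustrate the general idea of pipeline parallelism touched on above (each device holds a contiguous run of transformer blocks), here is a minimal sketch. It is not exo's actual partitioning code; `split_layers` and the per-device weights are hypothetical names chosen for this example.

```python
def split_layers(num_layers: int, weights: list[float]) -> list[range]:
    """Assign contiguous block ranges to devices, proportional to `weights`.

    Hypothetical helper for illustration only: `weights` could be, for
    example, the memory available on each device.
    """
    total = sum(weights)
    bounds = [0]
    acc = 0.0
    for w in weights:
        acc += w
        # Cumulative boundary, rounded to a whole block index.
        bounds.append(round(num_layers * acc / total))
    return [range(bounds[i], bounds[i + 1]) for i in range(len(weights))]


# Two equal devices each take half of a 62-block model.
shards = split_layers(62, [1, 1])
print([(r.start, r.stop) for r in shards])
```

Each device would then run only its own range of blocks and pass activations on to the next device in the pipeline.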

Run custom MLX models

GPT OSS and GLM sharding support is around the corner, as well as a few more types of Qwen models. There is a transformers version incompatibility with Ministral3 models, which...

out of memory on long chat input

Noted! This is certainly a feature we will be looking to implement soon.

Add differentiation between available and total memory

Thanks for the contribution! Looks like there's a lot of effort put into it. Although I haven't gone through the PR in detail yet, it seems like a good start...

RuntimeError - "Item size 2 for PEP 3118 buffer format string B does not match the dtype B item size 1"

This should no longer be an issue in 1.0 or in planned future updates. It happens because NumPy does not support bfloat16 as a dtype.
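For context on the size mismatch in the error message: a bfloat16 value occupies 2 bytes and is bit-for-bit the upper half of an IEEE-754 float32. The standard-library sketch below shows that relationship by widening raw bfloat16 bit patterns to floats. The helper names are hypothetical, and this is only an illustration of the format, not the fix that exo applied.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Widen a 16-bit bfloat16 pattern to a Python float via float32."""
    # Place the 16 bits in the upper half of a 32-bit word, zero the rest.
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16_bits(value: float) -> int:
    """Truncate a float32 to bfloat16 bits (drops the low 16 mantissa bits)."""
    return struct.unpack("<I", struct.pack("<f", value))[0] >> 16


raw = [0x3F80, 0x4000, 0xC040]  # bfloat16 patterns for 1.0, 2.0, -3.0
values = [bf16_bits_to_float(b) for b in raw]
print(values)
```

In practice, a common workaround is to cast bfloat16 data to float32 (or float16) in the source framework before handing it to NumPy.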