Ahmad M. Osman
Same issue here
@njhill @tjohnson31415 can we try to get this merged in?
Any progress update on this one?
@dominicshanshan Is this available on the main branch or somewhere else? I can try building the Docker image and experimenting with it.
Tested with https://huggingface.co/cognitivecomputations/Qwen3-235B-A22B-AWQ and experienced the same issue; a quick Google search got me here. TP=8, with 8x RTX 3090s.
How is this effort going? Would love to experiment with this on my AI server w/ Tensor Parallelism enabled.