Roman Koshkin
@youkaichao Could you please share a minimal working example for offline inference with tensor-parallel-size > 1?
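In case it helps others landing on this thread, a minimal offline-inference sketch with tensor parallelism might look like the following. This is just a sketch under assumptions: it assumes vLLM is installed, 4 GPUs are visible, and the model name is only an example, not the one from the issue.

```python
from vllm import LLM, SamplingParams

# Shard the model across 4 GPUs with tensor parallelism.
# (Requires 4 visible GPUs; model name is an example.)
llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    tensor_parallel_size=4,
)

sampling = SamplingParams(temperature=0.8, max_tokens=64)

# Offline (batch) generation, no API server involved.
outputs = llm.generate(["What is tensor parallelism?"], sampling)
for out in outputs:
    print(out.outputs[0].text)
```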
@youkaichao let me try and see if it works for me. By the way, can you check whether llama3-8b works? And what hardware / CUDA version are you using?
Similar problem here:

```bash
singularity run \
  --nv \
  --env HF_HOME=/workspace/huggingface/hub \
  --writable-tmpfs \
  --bind $volume:/workspace/huggingface/hub \
  --env HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
  docker://vllm/vllm-openai:v0.4.1 \
  --model casperhansen/llama-3-70b-instruct-awq \
  --tensor-parallel-size 4
```

Everything just...
Have you fixed the issue? I can't run any model with TP > 1
@chrisbraddock Could you post minimal working code, please? Also, are you running in the official vLLM Docker container? If not, how did you install vLLM (from source, from PyPI)? Are...
@chrisbraddock I got it working in a very similar way (I described it [here](https://github.com/vllm-project/vllm/issues/4431#issuecomment-2095138681)). The trick was to run `ray` in a separate terminal session and specify `LD_LIBRARY_PATH` correctly.
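For reference, the shape of that workaround was roughly as below. The `LD_LIBRARY_PATH` value is environment-specific, so the path here is a placeholder, and the model flag is taken from the command earlier in this thread.

```bash
# Terminal 1: start a ray head node before launching vLLM.
export LD_LIBRARY_PATH=/path/to/your/cuda/libs:$LD_LIBRARY_PATH  # placeholder path
ray start --head

# Terminal 2: launch the vLLM server with tensor parallelism;
# it attaches to the already-running ray cluster.
export LD_LIBRARY_PATH=/path/to/your/cuda/libs:$LD_LIBRARY_PATH
python -m vllm.entrypoints.openai.api_server \
  --model casperhansen/llama-3-70b-instruct-awq \
  --tensor-parallel-size 4
```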
Has anyone solved this? I'm new to JAX/FLAX, so I have no idea why it's taking so much memory. Though I'm quite happy with the speed.
> I highly suggest your guys to use kuberay, launch a ray cluster and submit vLLM worker. That's the most easiest way I found and kuberay will reduce your chance...
@MikeBirdTech AFAIK, LMStudio is not designed to handle simultaneous requests from many clients. I have a 4xA100 box on which I run (sharded) models with tensor parallelism (for speed). That's...
Same problem here. Posted (almost the same) [errors](https://github.com/Chainlit/chainlit/issues/745#issuecomment-2009465549).