ZelinTan
Hi! I also ran into this weird problem. Have you solved it?
Hi, I have just solved the problem. Try using `pip install --upgrade pip` to upgrade pip (>= 21.3), then run `pip install -e .` again. I recognize that you are interested...
Thanks Henry, your reply is really helpful.
Besides, I sincerely recommend that you try the same operation I mentioned above on an FPGA compute node. When I did those operations on an A10, compilation also failed.
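In case it helps, here is a minimal sketch for checking whether pip already meets the 21.3 threshold before retrying the editable install. The helper names `pip_version_ok` and `installed_pip_version` are made up for illustration, and the parsing assumes the version string starts with the usual `major.minor` pair.

```python
import re
from importlib import metadata


def pip_version_ok(version_text, minimum=(21, 3)):
    """Return True if the first major.minor pair in version_text is >= minimum."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison orders by major first, then minor.
    return (major, minor) >= minimum


def installed_pip_version():
    """Return the version string of the pip installed in the current environment."""
    return metadata.version("pip")
```

Calling `pip_version_ok(installed_pip_version())` should tell you whether the upgrade step is still needed in the environment you are running in.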
@geraldstanje I tried the resnet example in https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/torch_compile_resnet_example.html with : `| NVIDIA-SMI 470.103.01 Driver Version: 470.103.01 CUDA Version: 11.8 |` The GPU is Nvidia-A100 80G and run nvcc --version: ```...
> We (Alibaba Cloud) are actively developing a disaggregated prefilling feature for vLLM to tackle latency issues and minimize interference during prefilling and decoding. Leveraging fully asynchronous I/O, it ensures ...
@richardodliu could you please give us an example so that we can locate the problem more efficiently?
Will take a look at it soon.
Meanwhile, I found that when I set ep (dp) = 4 with tp set to 1, only one GPU is running on my machine instead of 4... I don't know why.
Thanks for your reply! I also want to ask if it is possible to turn off DP while enabling EP? Because in my opinion, EP and TP working together can...