ZelinTan

14 comments by ZelinTan

Hi! I also ran into this weird problem. Have you solved it?

Hi, I have just solved the problem. Try running `pip install --upgrade pip` to upgrade pip (>= 21.3), then run `pip install -e .` again. I recognize that you are interested...
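A minimal sketch of checking that precondition before retrying the editable install, assuming the only requirement is the pip >= 21.3 threshold quoted above. The helper names `parse_major_minor` and `pip_is_new_enough` are hypothetical and exist only for illustration:

```python
import re
from typing import Tuple

# Threshold quoted in the comment above; treated here as an assumption.
MIN_PIP = (21, 3)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '21.3.1'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # a bare '24' counts as 24.0
    return (major, minor)


def pip_is_new_enough(version: str, minimum: Tuple[int, int] = MIN_PIP) -> bool:
    """Return True if the given pip version meets the minimum (major, minor)."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("pip")
    except PackageNotFoundError:
        print("pip is not installed in this environment")
    else:
        if pip_is_new_enough(installed):
            print(f"pip {installed} is new enough; retry `pip install -e .`")
        else:
            print(f"pip {installed} is too old; run `pip install --upgrade pip` first")
```

Run as a script, it only reports which of the two steps to take next; it does not upgrade pip or install anything itself.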

Thanks Henry, your reply is really helpful.

Besides, I sincerely recommend that you try the same operations I mentioned above on the FPGA compute node. When I did those operations on the A10, compilation also failed.

@geraldstanje I tried the resnet example in https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/torch_compile_resnet_example.html with: `| NVIDIA-SMI 470.103.01 Driver Version: 470.103.01 CUDA Version: 11.8 |` The GPU is an NVIDIA A100 80G, and running `nvcc --version` gives: ```...

> We (Alibaba Cloud) are actively developing a disaggregated prefilling feature for vLLM to tackle latency issues and minimize interference during prefilling and decoding. Leveraging fully asynchronous I/O, it ensures ...

@richardodliu could you please give us an example so that we can locate the problem more efficiently?

Meanwhile, I found that when I want ep(dp)=4 but set tp to 1, only one GPU is running on my machine instead of all 4... I don't know why.

Thanks for your reply! I also want to ask if it is possible to turn off DP while enabling EP? Because in my opinion, EP and TP working together can...