server.cc:251] failed to enable peer access for some device pairs
System Info
GPU: 8× RTX 4090; TensorRT-LLM: v0.9.0; tensorrtllm_backend: v0.9.0
Who can help?
@kaiyux @BY
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
None
Expected behavior
None
actual behavior
None
additional notes
When I deploy Llama3-8B on the Triton server, it raises the error below:
However, it also prints the flag indicating the server launched successfully:
Then, when I send requests to the server:
How can I fix this? Thank you~
Hi @Godlovecui, I see you're using TensorRT-LLM v0.9.0. Is it possible to try the latest main branch and see whether the issue still exists?
Have you tried `nvidia-smi topo -p2p r` to check whether your GPU drivers are installed and support peer-to-peer access?
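If you want to turn that matrix into a quick list of the pairs that are failing, a minimal sketch is below. The sample output is an assumption of the typical `nvidia-smi topo -p2p r` format (`X` = self, `OK` = supported, `NS`/`CNS` = not supported); adjust the parsing to your driver's exact output.

```python
# Sketch: parse the matrix printed by `nvidia-smi topo -p2p r` and report
# GPU pairs that lack peer-to-peer read access.
# SAMPLE_TOPO is a hypothetical 3-GPU output for illustration only.
SAMPLE_TOPO = """\
        GPU0    GPU1    GPU2
 GPU0   X       OK      NS
 GPU1   OK      X       NS
 GPU2   NS      NS      X
"""

def missing_p2p_pairs(topo_text):
    """Return (src, dst) GPU pairs whose P2P status is not OK."""
    lines = [line.split() for line in topo_text.strip().splitlines()]
    header = lines[0]  # column GPU names
    missing = []
    for row in lines[1:]:
        src, statuses = row[0], row[1:]
        for dst, status in zip(header, statuses):
            # Skip the diagonal (a GPU vs. itself); flag anything not "OK".
            if src != dst and status != "OK":
                missing.append((src, dst))
    return missing

if __name__ == "__main__":
    print(missing_p2p_pairs(SAMPLE_TOPO))
```

In the sample above, GPU2 cannot reach (or be reached by) the other two GPUs, which would match the "failed to enable peer access for some device pairs" warning.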
I have also encountered similar issues where my default GPU installation required me to rebuild with the `use_custom_all_reduce` flag disabled.
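For reference, a sketch of what that rebuild could look like with the v0.9.x `trtllm-build` CLI; the checkpoint and output paths are placeholders, and the flag name may differ in other releases, so check `trtllm-build --help` first.

```shell
# Rebuild the TensorRT-LLM engine with the custom all-reduce kernel disabled.
# Paths below are placeholders for your own checkpoint/engine directories.
trtllm-build \
    --checkpoint_dir ./llama3-8b-ckpt \
    --output_dir ./llama3-8b-engine \
    --use_custom_all_reduce disable
```

With the custom all-reduce disabled, multi-GPU reductions fall back to NCCL, which does not require direct peer access between every device pair.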
This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 15 days.