cuda-samples
Bandwidth Test not working properly when MIG is enabled
Hello,
I am trying to run the bandwidth test in cuda-samples/Samples/1_Utilities/bandwidthTest on a node with 8 A100 GPUs, two of which have MIG enabled.
I am running the test in a Docker container on a Kubernetes platform with the GPU Operator.
When I execute the test, it only sees GPU 0. If I set CUDA_VISIBLE_DEVICES manually, the test works only on the GPUs without MIG enabled; when I point it at a MIG-enabled device, it falls back to GPU 0.
On nodes where none of the GPUs have MIG enabled, the test runs just fine and sees all the devices, regardless of whether CUDA_VISIBLE_DEVICES is set.
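One thing that may be worth checking when reproducing this is the form of the values passed in CUDA_VISIBLE_DEVICES. The variable accepts integer indices and "GPU-" prefixed UUIDs, and NVIDIA's MIG documentation describes MIG devices as being selected by a "MIG-" prefixed identifier rather than by index. Below is a minimal Python sketch that labels each entry by its form. The helper name classify_visible_devices is hypothetical and not part of cuda-samples, and the prefix check is only a rough assumption about the identifier formats; it does not query the driver.

```python
import os


def classify_visible_devices(value):
    """Label each comma-separated entry of a CUDA_VISIBLE_DEVICES-style string.

    Hypothetical helper for illustration only: it inspects the text of each
    entry and never talks to the driver or the CUDA runtime.
    """
    result = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            # Skip empty fields such as those produced by "0,,1" or "".
            continue
        if entry.startswith("MIG-"):
            kind = "mig"        # looks like a MIG device identifier
        elif entry.startswith("GPU-"):
            kind = "gpu-uuid"   # looks like a full-GPU UUID
        elif entry.isdigit():
            kind = "index"      # plain integer device index
        else:
            kind = "unknown"
        result.append((entry, kind))
    return result


if __name__ == "__main__":
    # Report how the current environment's setting would be interpreted.
    value = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    for entry, kind in classify_visible_devices(value):
        print(f"{entry}: {kind}")
```

Run inside the container, this would show whether the MIG-enabled GPUs are being addressed by index or by a MIG identifier, which could help narrow down where the fallback to GPU 0 comes from.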
Has anyone tested the bandwidth test on MIG enabled devices?
Any help is highly appreciated!