gpu-manager

Gpumanager is unable to control GPU threshold and GPU memory.

Open yangcheng-dev opened this issue 2 years ago • 2 comments

After deploying the gpumanager project, GPU memory does not seem to be controlled. I have set the pod quotas as follows, but the limits on GPU memory and computing power do not take effect. What's even more peculiar is that if I leave the same pod running for over 60 minutes, the restrictions are likely to take effect. There are no obvious errors in the gpumanager logs.

    resources:
      limits:
        tencent.com/vcuda-core: "50"
        tencent.com/vcuda-memory: "32"
      requests:
        tencent.com/vcuda-core: "50"
        tencent.com/vcuda-memory: "32"
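For anyone trying to reproduce this, it may help to work out what the quota is supposed to translate to before checking whether it is enforced. The following is a minimal sketch, not part of gpu-manager. It assumes the units described in the gpu-manager README (100 `vcuda-core` units per physical GPU, 256 MiB per `vcuda-memory` unit); the function names are hypothetical and exist only for illustration.

```python
# Assumed units, taken from the gpu-manager README rather than from this issue:
# 100 vcuda-core units correspond to one full GPU, and one vcuda-memory unit
# corresponds to 256 MiB of device memory.
VCUDA_CORE_UNITS_PER_GPU = 100
VCUDA_MEMORY_UNIT_MIB = 256


def expected_limits(vcuda_core: int, vcuda_memory: int) -> dict:
    """Translate the pod's resource quota into the limits it should imply."""
    return {
        "gpu_fraction": vcuda_core / VCUDA_CORE_UNITS_PER_GPU,
        "memory_mib": vcuda_memory * VCUDA_MEMORY_UNIT_MIB,
    }


def memory_limit_respected(observed_used_mib: float, vcuda_memory: int) -> bool:
    """Return True if the observed memory use stays within the implied limit.

    observed_used_mib is whatever the reader measures for the container's
    process, for example by reading it off nvidia-smi on the node.
    """
    return observed_used_mib <= expected_limits(0, vcuda_memory)["memory_mib"]


if __name__ == "__main__":
    limits = expected_limits(vcuda_core=50, vcuda_memory=32)
    print(limits)
```

Under those assumed units, the quota in this issue should cap the pod at half of one GPU's compute and 8192 MiB of memory, so a process in the pod that allocates well beyond that figure would indicate the limit is not being applied.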

yangcheng-dev avatar Jan 18 '24 06:01 yangcheng-dev

Has this problem already been solved?

ferris-cx avatar Jul 01 '24 06:07 ferris-cx

Just remove the cuGetProcAddress implementation; it is what causes this problem.

ScaletKlazz avatar Jul 08 '24 09:07 ScaletKlazz