k8s-device-plugin

Is there any way in the meantime to request more than 1 replica from each GPU in my node?

Open arthas3014 opened this issue 1 year ago • 5 comments

I have started MPS with a division factor of 10. In our application scenario, however, we may need to allocate 2 whole GPUs directly, which is equivalent to specifying nvidia.com/gpu: 20. If I set nvidia.com/gpu to a value greater than 1, I get this error: ‘request for “nvidia.com/gpu”: invalid request: maximum request size for shared resources is 1; found 10, which is unexpected’.
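For context, the setup described above presumably corresponds to a device-plugin config along the following lines. This is only a sketch, assuming the `sharing.mps` layout documented in the k8s-device-plugin README; it is not the poster's actual file, and the resource name is taken from the error message.

```yaml
# Sketch of a device-plugin config enabling MPS sharing (assumed layout,
# not the original poster's file).
version: v1
sharing:
  mps:
    resources:
    - name: nvidia.com/gpu
      # "Division factor" of 10: each physical GPU is advertised as
      # 10 nvidia.com/gpu replicas, so 2 whole GPUs would equal 20 replicas.
      replicas: 10
```

With a config like this, the error quoted above indicates that the plugin accepts at most 1 replica of the shared resource per request.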

Is there any way in the meantime to request more than 1 replica from each GPU in my node?

arthas3014 Mar 18 '25 08:03

Does anyone have an idea?

arthas3014 Mar 18 '25 09:03

No, requesting multiple replicas does not give you more access to a shared GPU.

chipzoller Mar 25 '25 09:03

> No, requesting multiple replicas does not give you more access to a shared GPU.

Isn't that restriction specific to time-slicing? With MPS, we should be able to assign more than one replica to a pod. Applications such as triton-server should be able to use multiple GPUs.

Based on this example, you can bump up the limit to 2:

5. Update the manifest to request 2 nvidia.com/gpu:


  resources:
    limits:
      nvidia.com/gpu: 2

According to that doc, the pod still sees a single GPU, but with double the memory and so on, so it is twice as powerful as a single MPS share.
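To show where that fragment sits, here is a minimal sketch of a complete Pod manifest. The pod name, container name, and image are hypothetical placeholders, not taken from the linked doc. Note that, per the error in the original post, the NVIDIA device plugin rejected a request greater than 1 for an MPS-shared resource, so this may only work on platforms whose MPS integration allows it.

```yaml
# Sketch only: names and image below are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: mps-demo                              # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-app                            # hypothetical container name
    image: example.com/cuda-app:latest        # placeholder image
    resources:
      limits:
        # Request 2 shares of the MPS-shared GPU resource.
        nvidia.com/gpu: 2
```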

gfrankliu Jun 20 '25 21:06

Is there any progress on this issue?

ben-wangz Sep 10 '25 07:09

This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed. To skip these checks, apply the "lifecycle/frozen" label.

github-actions[bot] Dec 10 '25 04:12

This issue was automatically closed due to inactivity.

github-actions[bot] Jan 09 '26 04:01