open-gpu-kernel-modules

Unable to change power limit with nvidia-smi

Open machinedgod opened this issue 2 years ago • 179 comments

NVIDIA Open GPU Kernel Modules Version

530.41.03-1

Does this happen with the proprietary driver (of the same version) as well?

Yes

Operating System and Version

Arch Linux

Kernel Release

Linux 6.2.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 22 Mar 2023 22:52:35 +0000 x86_64 GNU/Linux

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 2060 (UUID: GPU-2f685ce6-33f4-db75-ce05-81d1723a6ddb)

Describe the bug

Before a recent update, I was able to execute: ~$ sudo nvidia-smi --power-limit 60

and have it work as expected.

After the update, this is the output:

Changing power management limit is not supported for GPU: 00000000:01:00.0.
Treating as warning and moving on.
All done.

Changing power limits caused observable changes both in temperature and in performance, so I am pretty sure my GPU supports it.
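A minimal sketch for anyone scripting this: because nvidia-smi treats the failure as a warning and still prints "All done.", checking the output text is a more direct signal than assuming a particular exit code. The helper names below are hypothetical, the marker string is copied from the output above, and the exit code in the failing case is not known from this report, so the code checks both.

```python
import subprocess

# Substring taken from the warning quoted in this issue.
NOT_SUPPORTED_MARKER = "is not supported for GPU"


def power_limit_rejected(output: str) -> bool:
    """Return True if the nvidia-smi output contains the 'not supported' warning."""
    return NOT_SUPPORTED_MARKER in output


def try_set_power_limit(watts: int) -> bool:
    """Run `nvidia-smi --power-limit <watts>` (needs root) and report whether it was accepted.

    Assumption: the exit status in the failing case is unknown, so both the
    return code and the printed text are checked.
    """
    result = subprocess.run(
        ["nvidia-smi", "--power-limit", str(watts)],
        capture_output=True,
        text=True,
        check=False,
    )
    combined = result.stdout + result.stderr
    return result.returncode == 0 and not power_limit_rejected(combined)
```

Calling `try_set_power_limit(60)` as root would return False on an affected driver if the warning above is printed.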

For context: I found that the default power limit of 80W tends to heat up the GPU enough that it starts throttling itself, which causes stuttering. 60W seemed to work perfectly and kept it under 63C during everything, without boosting my fan. The computer itself is a laptop (which explains the issues with heat dissipation), a Lenovo Legion Y740, and I have a pretty good cooling pad to help out.
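To compare numbers like the 80W default and the 63C target across driver versions, the limits and temperature the driver reports can be read with `nvidia-smi --query-gpu`. This is a sketch, not something taken from the report: the function names are hypothetical, and the handling of bracketed placeholders such as `[N/A]` assumes that is how an unsupported field is printed on affected systems.

```python
import subprocess
from typing import Optional

# Query fields understood by `nvidia-smi --query-gpu`.
FIELDS = [
    "power.limit",
    "power.default_limit",
    "power.min_limit",
    "power.max_limit",
    "temperature.gpu",
]


def parse_value(cell: str) -> Optional[float]:
    """Convert one CSV cell to a float, or None if it is not numeric (e.g. '[N/A]')."""
    try:
        return float(cell.strip())
    except ValueError:
        return None


def parse_query_line(line: str) -> dict:
    """Map one CSV line (noheader, nounits) onto the FIELDS names."""
    return {name: parse_value(cell) for name, cell in zip(FIELDS, line.split(","))}


def query_gpu(index: int = 0) -> dict:
    """Ask nvidia-smi for the power limits (W) and temperature (C) of one GPU."""
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={index}",
            f"--query-gpu={','.join(FIELDS)}",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_query_line(out.splitlines()[0])
```

A field that comes back as None is one the driver did not report as a number, which may be worth noting alongside the driver version when adding to this issue.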

To Reproduce

~# nvidia-smi --power-limit 60

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

More Info

Just a bit more context: you may notice that in my Xorg config I use an option to skip the EDID check for the HDMI-0 output. The reason is that either the driver doesn't recognize my (about 7 years old) monitor, or the checksum the monitor outputs is invalid, and without the option it wouldn't let me use the Full HD resolution on that monitor.

No idea if this is in any way related (I presume it's not), but this setup worked for over 5-6 months, so I doubt it.

machinedgod avatar Apr 01 '23 15:04 machinedgod

I also cannot change the power limit with nvidia-smi on this driver, while I can on the 525 series (I have an RTX 3050 laptop).

On 530 it says "Changing power limit is not supported for this GPU".

So which is the intended behavior? Make up your mind, NVIDIA.

I really hope it's not the latter, because of this fun issue which also happens to me: https://github.com/NVIDIA/open-gpu-kernel-modules/issues/435

ghost avatar Apr 02 '23 21:04 ghost

Relevant discussion here

dylif avatar Apr 05 '23 22:04 dylif

Oh, so it's a wide-reaching problem, not just for the 2060 and 3050. I suppose it has enough visibility now that it should probably be fixed soonish.

machinedgod avatar Apr 06 '23 10:04 machinedgod

With 530.30.02 (beta) the power limits were working fine for the first time on my laptop with an RTX 2060 Mobile 6GB. It broke after the 530.42.03 update.

msmafra avatar Apr 10 '23 12:04 msmafra

With the 525 version of the driver, my RTX 3060 was able to limit power for the first time, but since updating to the 530 version it has been impossible to limit power.

xiguayuyichao avatar Apr 13 '23 05:04 xiguayuyichao

If anybody here is using a Lenovo gaming laptop with the 530 drivers, can you confirm this other power limit issue? https://github.com/NVIDIA/open-gpu-kernel-modules/issues/492

ghost avatar Apr 15 '23 15:04 ghost

got same issue

youmukonpaku1337 avatar Apr 26 '23 19:04 youmukonpaku1337

on a 3060 AMD legion 5

youmukonpaku1337 avatar Apr 26 '23 19:04 youmukonpaku1337

Legion S7 15ACH6 / AMD Ryzen 7 5800H / NVIDIA GeForce RTX 3060

Same issue

kraskden avatar Jun 03 '23 10:06 kraskden

> Legion S7 15ACH6 / AMD Ryzen 7 5800H / NVIDIA GeForce RTX 3060
>
> Same issue

driver version?

youmukonpaku1337 avatar Jun 03 '23 10:06 youmukonpaku1337

> driver version?

530.41.03-15

kraskden avatar Jun 03 '23 13:06 kraskden

> > driver version?
>
> 530.41.03-15

try 535 beta

youmukonpaku1337 avatar Jun 03 '23 13:06 youmukonpaku1337

unless it doesn't exist for okm, in which case test proprietary

youmukonpaku1337 avatar Jun 03 '23 13:06 youmukonpaku1337

Still have the issue on the 535 BETA proprietary driver, which is the only one with the "fix", I think.

ghost avatar Jun 03 '23 15:06 ghost

hmmm

youmukonpaku1337 avatar Jun 03 '23 16:06 youmukonpaku1337

test with 525 branch (prop, or okm if exists)

youmukonpaku1337 avatar Jun 03 '23 16:06 youmukonpaku1337

Ryzen 5 5600H / NVIDIA GeForce RTX 3060 Laptop. Same issue. On version 525.78.01 everything works fine.

arutar avatar Jun 04 '23 17:06 arutar

> Ryzen 5 5600H / NVIDIA GeForce RTX 3060 Laptop. Same issue. On version 525.78.01 everything works fine.

did you test 535 (prop)?

youmukonpaku1337 avatar Jun 04 '23 17:06 youmukonpaku1337

@barokvanzieks Yesterday I tested the power limit on a number of driver versions (Ryzen 5 5600H / NVIDIA GeForce RTX 3060 Laptop / Ubuntu 20.04; command: nvidia-smi -pl 50):

  • 525.78.01 power limit works fine
  • 525.85.05 power limit works fine
  • 525.89.02 power limit NOT CHECKED
  • 525.105.17 power limit works fine
  • 530.41.03 power limit is NOT WORKING
  • 535.43.02 power limit is NOT WORKING

It turns out the power limit has not worked since the 53x.xx.xx versions.

The power limit works on all the 525.xx.xx drivers I checked.
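For anyone who wants to check their own driver against these results, here is a small sketch that records the list above as data and groups it by driver branch. The function names are hypothetical, and the table contains only the versions reported in this comment; anything else comes back as "unknown" rather than being guessed.

```python
# Results as reported in this comment (RTX 3060 Laptop, Ubuntu 20.04, `nvidia-smi -pl 50`).
REPORTED = {
    "525.78.01": "works",
    "525.85.05": "works",
    "525.89.02": "not checked",
    "525.105.17": "works",
    "530.41.03": "broken",
    "535.43.02": "broken",
}


def branch(version: str) -> int:
    """Return the driver branch, i.e. the first component of the version string."""
    return int(version.split(".")[0])


def reported_status(version: str) -> str:
    """Look up a version in the table above; untested versions are 'unknown'."""
    return REPORTED.get(version, "unknown")


def branches_by_status() -> dict:
    """Group the tested versions by branch, skipping the one that was not checked."""
    summary = {}
    for version, status in REPORTED.items():
        if status == "not checked":
            continue
        summary.setdefault(branch(version), set()).add(status)
    return summary
```

Grouping this way matches the conclusion above: every tested 525 release works, and the tested 530 and 535 releases do not.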

arutar avatar Jun 06 '23 08:06 arutar

hmmmmm ok, I haven't tested 535 myself, guess I'll stay on 525 for now

youmukonpaku1337 avatar Jun 06 '23 14:06 youmukonpaku1337

I checked the latest version of the NVIDIA driver, "535.54.03", with dkms and it is still NOT WORKING

(i5-12500H / NVIDIA GeForce RTX 3060 Laptop / EndeavourOS)

Diafwl avatar Jun 15 '23 10:06 Diafwl

I think this is intended at this point.

They also unlocked it on Windows around driver 528 and locked it again on laptops shortly after, so maybe this is only for desktop GPUs.

ghost avatar Jun 15 '23 12:06 ghost

@kleidiss This is a very important capability. Laptops get very hot and drain the battery quickly. This feature is required!

arutar avatar Jun 16 '23 08:06 arutar

I agree. It would be worthwhile checking if anyone diffed the sources and figured out how to make a patch to enable it back. If not, I will attempt that this weekend, because right now - my laptop goes to 80-81C (with a REALLY good cooling pad)... previously I kept the framerate stable at 55C

machinedgod avatar Jun 16 '23 09:06 machinedgod

I just noticed it did not work anymore on my system. I reverted back to 525.105.17 and everything's good. Thank you.

The laptop is a TongFang GM5MG0O (BIOS N.1.09A08) // i7 10875H // RTX 3070 (BIOS 94.04.3F.00.83). The OS is Linux Mint 21.1 // kernel 6.3.8-x64v3-xanmod1

notnotme avatar Jun 17 '23 17:06 notnotme

> I agree. It would be worthwhile checking if anyone diffed the sources and figured out how to make a patch to enable it back. If not, I will attempt that this weekend, because right now - my laptop goes to 80-81C (with a REALLY good cooling pad)... previously I kept the framerate stable at 55C

that'd be super cool of you

youmukonpaku1337 avatar Jun 18 '23 16:06 youmukonpaku1337

but yes, 535 isn't working for me either, and it's a pain in the arse since now I have to fuck with drivers to get OpenCL working

youmukonpaku1337 avatar Jun 18 '23 16:06 youmukonpaku1337

this feature is essential for me, since I can raise the wattage by about 50 watts from the stock 80

youmukonpaku1337 avatar Jun 18 '23 16:06 youmukonpaku1337

> I agree. It would be worthwhile checking if anyone diffed the sources and figured out how to make a patch to enable it back. If not, I will attempt that this weekend, because right now - my laptop goes to 80-81C (with a REALLY good cooling pad)... previously I kept the framerate stable at 55C

question is which file would it be in...

youmukonpaku1337 avatar Jun 20 '23 00:06 youmukonpaku1337

Any updates? When is it going to work again?

darxkies avatar Jun 26 '23 13:06 darxkies