open-gpu-kernel-modules

Dynamic Boost doesn't work on AMD CPUs

Open cyberphantom52 opened this issue 3 years ago • 15 comments

NVIDIA Open GPU Kernel Modules Version

7c345b838b8a63126ed425781400c1ae8a7a4a1d

Does this happen with the proprietary driver (of the same version) as well?

Yes

Operating System and Version

Fedora 37 Workstation

Kernel Release

6.0.2-300.fc37.x86_64

Hardware: GPU

NVIDIA GeForce RTX 3060 Laptop GPU

Describe the bug

nvidia-powerd was added to the Linux driver to enable Dynamic Boost, but currently it's completely broken on all AMD-NVIDIA based laptops because the powerd service is unable to read the CPU power data. I have attached the output of the service log below.

× nvidia-powerd.service - nvidia-powerd service
     Loaded: loaded (/usr/lib/systemd/system/nvidia-powerd.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-10-17 13:18:00 IST; 13s ago
    Process: 10710 ExecStart=/usr/bin/nvidia-powerd (code=exited, status=1/FAILURE)
   Main PID: 10710 (code=exited, status=1/FAILURE)
        CPU: 1.148s

Oct 17 13:17:58 rog systemd[1]: Starting nvidia-powerd.service - nvidia-powerd service...
Oct 17 13:17:58 rog /usr/bin/nvidia-powerd[10710]: nvidia-powerd version:1.0(build 1)
Oct 17 13:18:00 rog /usr/bin/nvidia-powerd[10710]: Failed to read the data for calculating CPU power
Oct 17 13:18:00 rog /usr/bin/nvidia-powerd[10710]: Failed to initialize GPU Boost controller
Oct 17 13:18:00 rog systemd[1]: nvidia-powerd.service: Main process exited, code=exited, status=1/FAILURE
Oct 17 13:18:00 rog systemd[1]: nvidia-powerd.service: Failed with result 'exit-code'.
Oct 17 13:18:00 rog systemd[1]: Failed to start nvidia-powerd.service - nvidia-powerd service.
Oct 17 13:18:00 rog systemd[1]: nvidia-powerd.service: Consumed 1.148s CPU time.
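
The two nvidia-powerd messages in the log above are what distinguish this failure from an ordinary service crash. As a minimal sketch (not part of the driver or of nvidia-powerd), the following Python helper scans journal lines for those two strings. The name `find_powerd_failures` is made up for illustration; the message strings are copied from the log above.

```python
# Message strings copied verbatim from the journal output in this report.
FAILURE_MARKERS = (
    "Failed to read the data for calculating CPU power",
    "Failed to initialize GPU Boost controller",
)


def find_powerd_failures(lines):
    """Return the known failure markers found in the log lines, in order of first appearance."""
    found = []
    for line in lines:
        for marker in FAILURE_MARKERS:
            if marker in line and marker not in found:
                found.append(marker)
    return found


# Lines taken from the service log in this report.
sample = [
    "Oct 17 13:17:58 rog /usr/bin/nvidia-powerd[10710]: nvidia-powerd version:1.0(build 1)",
    "Oct 17 13:18:00 rog /usr/bin/nvidia-powerd[10710]: Failed to read the data for calculating CPU power",
    "Oct 17 13:18:00 rog /usr/bin/nvidia-powerd[10710]: Failed to initialize GPU Boost controller",
]
print(find_powerd_failures(sample))
```

The input lines could come from `journalctl -u nvidia-powerd.service`; other driver versions may word the messages differently, so an empty result does not prove the service is healthy.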

To Reproduce

  1. Check power consumption in GPU heavy programs.
  2. Service logs indicate nvidia-powerd service cannot start.
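
Step 1 does not say how the power consumption was checked. One possible way, shown here as a sketch under assumptions, is to poll `nvidia-smi` with its `--query-gpu` option while a GPU-heavy program runs. The helper names `parse_power_line` and `read_gpu_power` are hypothetical, and some laptop GPUs may report `[N/A]` for a field, which the parser maps to `None`.

```python
import subprocess

# Query the current power draw and power limit (both in watts) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=power.draw,power.limit",
    "--format=csv,noheader,nounits",
]


def parse_power_line(line):
    """Parse one 'draw, limit' CSV line into floats (watts); unparsable fields become None."""
    values = []
    for field in line.split(","):
        try:
            values.append(float(field.strip()))
        except ValueError:
            values.append(None)  # e.g. "[N/A]" when the GPU does not report the field
    return tuple(values)


def read_gpu_power():
    """Run nvidia-smi once and return (draw_w, limit_w) for the first GPU listed."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_power_line(out.splitlines()[0])
```

Calling `read_gpu_power()` in a loop during a GPU-heavy workload shows whether the draw ever rises above the base power limit, which is what Dynamic Boost is expected to allow.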

Bug Incidence

Always

nvidia-bug-report.log.gz


More Info

CPU : AMD Ryzen 9 5900HS

https://github.com/NVIDIA/open-gpu-kernel-modules/issues/392#tasklist-block-9f6513b6-e8f2-493e-a77d-1f386a5593e3

cyberphantom52 avatar Oct 17 '22 08:10 cyberphantom52

Thanks, we're aware of the limitation in nvidia-powerd; that should be addressed in an upcoming release.

aritger avatar Oct 20 '22 21:10 aritger

@aritger Will the fix address all AMD CPUs or just the 5000 series? I ask because we experience this issue on 4000 series Ryzen CPUs as well.

DualityKyle avatar Oct 22 '22 14:10 DualityKyle

@aritger Hey, the recent 525.53 beta added support for Dynamic Boost on AMD, but it's limited to Rembrandt and newer AMD chips. Are there plans to support older 4000 and 5000 series chips? It doesn't make sense to leave them unsupported, as most people don't have the latest AMD chips.

cyberphantom52 avatar Nov 10 '22 21:11 cyberphantom52

Sorry, I don't know off hand why the support is limited to those chipsets, but I'll try to find out.

aritger avatar Nov 10 '22 22:11 aritger

> Sorry, I don't know off hand why the support is limited to those chipsets, but I'll try to find out.

We'd really appreciate that. AMD users have not been able to use their NVIDIA hardware at its full potential for so long; I hope it really gets resolved now.

cyberphantom52 avatar Nov 10 '22 23:11 cyberphantom52

Don't worry. It's not exactly like the newer chips with NVIDIA are working very well... or at all on laptops.

Grimish-ng avatar Nov 13 '22 02:11 Grimish-ng

> Sorry, I don't know off hand why the support is limited to those chipsets, but I'll try to find out.

Did the bug ever get fixed on 5000 series AMD CPUs? I am still having issues, as I reported here:

https://github.com/NVIDIA/open-gpu-kernel-modules/issues/435

ghost avatar Jan 02 '23 15:01 ghost

@aritger

Hi, any updates on AMD 5th Gen Cezanne support? Is there any technical limitation that's stopping you guys from supporting it?

sinanmohd avatar Feb 28 '23 02:02 sinanmohd

Could I trouble you to retest with 535.43.02? The release highlights call out:

    * Extended Dynamic Boost support on notebooks to include older Renoir
      and Cezanne chipsets, in addition to Rembrandt and newer AMD chipsets.

https://www.nvidia.com/Download/driverResults.aspx/205039/en-us/

aritger avatar May 30 '23 19:05 aritger

@aritger

I have a Cezanne APU (Ryzen 5 5600H) and Dynamic Boost is still wonky. The nvidia-powerd service has to be reloaded, or the system restarted, before the GPU clock speeds and power limit change.

As you can see in the screenshot below, the system is in performance mode but nvidia-settings still shows that the full TGP isn't used. If I do the same test after a restart, the TGP will spike up to 85W.

The same happens in reverse, from 85W to 64W, if I change from performance to balanced mode.

Screenshot_20230602_094156

ghost avatar Jun 02 '23 13:06 ghost
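
The workaround described in the comment above (restart nvidia-powerd after changing the power mode) could be scripted roughly as follows. This is a sketch under assumptions: the comment does not say which tool switches the profile, so `powerprofilesctl` from power-profiles-daemon is assumed here, and the function name `switch_profile_and_restart` is hypothetical. Restarting a system service normally requires root privileges.

```python
import subprocess


def switch_profile_and_restart(profile, run=subprocess.run):
    """Set the platform power profile, then restart nvidia-powerd so it re-reads the state."""
    commands = [
        # Assumption: the profile is managed by power-profiles-daemon.
        ["powerprofilesctl", "set", profile],
        # The workaround from the comment above: restart the service after the switch.
        ["systemctl", "restart", "nvidia-powerd.service"],
    ]
    for cmd in commands:
        run(cmd, check=True)  # raises CalledProcessError if a command fails
    return commands
```

For example, `switch_profile_and_restart("performance")` would apply the performance profile and then restart the service. The `run` parameter exists only so the command sequence can be inspected without executing anything.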

I forgot to mention, but I am indeed using the 535 driver.

ghost avatar Jun 02 '23 14:06 ghost

Dear all, we have already filed bug 4142071 for this issue, and it has also been root-caused. We will integrate the fix in a future driver release and will update accordingly.

amrit1711 avatar Jun 05 '23 09:06 amrit1711

Hey, is this bug fixed in the new production 535 driver?

I currently cannot test it.

ghost avatar Jun 14 '23 14:06 ghost

I keep getting the bug even with version 550 of both the open driver and the proprietary driver. My GPU is a 3050 Laptop and my CPU is a 5800H.

ghost avatar Feb 03 '24 08:02 ghost

Seems like the issue is still not resolved.

HP Omen 15 with Ryzen 7 5800H and RTX 3060
EndeavourOS with KDE
Linux Kernel 6.7.4-arch1-1
nvidia-dkms 545.29.06

GPU is stuck at 80W, Dynamic Boost should add 20W.

nvidia-smi -pl xxx gives an error:

Changing power management limit is not supported in current scope for GPU: 00000000:01:00.0.

checkinindza avatar Feb 11 '24 11:02 checkinindza