Gpu device never parsed?

Open Ciflire opened this issue 1 year ago • 9 comments

Describe the bug: gamemoded -t does not look for the device set in the config; it always looks for the default value, which is 0.

To Reproduce: steps used to reproduce the behavior:

  1. Configure gpu_device to a value other than 0.
  2. Run gamemoded -t.
  3. Get the following error: ERROR: Couldn't open vendor file at /sys/class/drm/card0/device/vendor, will not apply gpu optimisations!
  4. See that it is looking for card0 even though gpu_device is not 0.

Expected behavior: I do not expect it to pass all the tests, but it should at least not look for card0 when gpu_device is set to 1.

System Info (please complete the following information):

  • OS and version: Nixos 24.11
  • GameMode Version 1.8.1

Additional context: this is my config, generated by Nix at /etc/gamemode.ini. The process does find it.

[custom]
end=/nix/store/qjc7b5l019p260qv9r7w39hsap656jbl-libnotify-0.8.3/bin/notify-send 'GameMode ended'
start=/nix/store/qjc7b5l019p260qv9r7w39hsap656jbl-libnotify-0.8.3/bin/notify-send 'GameMode started'

[general]
disable_splitlock=0
igpu_desiredgov=performance
inhibit_screensaver=0
renice=10

[gpu]
apply_gpu_optimisations=accept-responsibility
gpu_device=1
nv_powermizer_mode=2

I believe, though I could be wrong, that this part of the config is never parsed and stays at its default. I believe the device should be set here: https://github.com/FeralInteractive/gamemode/blob/master/daemon/gamemode-gpu.c#L74-L75 but that symbol leads to a reference in https://github.com/FeralInteractive/gamemode/blob/master/daemon/gamemode-config.h#L116-L117 which I think is never implemented.

Ciflire avatar Jul 10 '24 23:07 Ciflire

I had the same issue. One thing I noticed is that when it is set to 2, it does use card2. I guess for some reason it just doesn't like card1, lmao.

OS: Nixos 24.05 GameMode Version 1.8.1

MrSn0wy avatar Jul 23 '24 19:07 MrSn0wy

Same issue, it doesn't seem to successfully read the config when gpu_device=1 but does when it's gpu_device=2.

This seems to be the relevant code for reading the config: https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/daemon/gamemode-config.c#L803

https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/daemon/gamemode-config.c#L60-L66

https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/daemon/gamemode-config.c#L452-L462

And this is the place that calls that code: https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/daemon/gamemode-gpu.c#L74-L86

And this is what ends up being 0 instead of 1: https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/common/common-gpu.c#L35-L49

ERROR: Couldn't open vendor file at /sys/class/drm/card0/device/vendor, will not apply gpu optimisations!

I can't immediately see an error here just from looking at the code, as it seems to just boil down to a memcpy followed by a snprintf. I guess someone would need to get a local build running to debug it further.

@afayaz-feral Sorry for the direct ping, would you have any guess why this function fails to work when the value is 1?

djahandarie avatar Sep 08 '24 01:09 djahandarie

Here's what I'm seeing:

::: Verifying GPU Optimisations
ERROR: Couldn't open vendor file at /sys/class/drm/card0/device/vendor, will not apply gpu optimisations!
ERROR: Failed to find Nvidia GPU with expected index!

The error message is from here:

https://github.com/FeralInteractive/gamemode/blob/c54d6d4243b0dd0afcb49f2c9836d432da171a2b/util/gpuclockctl.c#L118

This loop doesn't use gpu_device, instead it tries every card until hitting Vendor_Invalid, which is supposed to mean "beyond the last card", but in my case card0 doesn't have a vendor file so the loop terminates early:

$ ls -l /sys/class/drm/card0/device/
total 0
lrwxrwxrwx 1 root root    0 Oct 15 16:49 driver -> ../../../bus/platform/drivers/simple-framebuffer
-rw-r--r-- 1 root root 4096 Oct 15 18:05 driver_override
drwxr-xr-x 3 root root    0 Oct 15 18:05 drm
drwxr-xr-x 3 root root    0 Oct 15 18:05 graphics
-r--r--r-- 1 root root 4096 Oct 15 18:05 modalias
drwxr-xr-x 2 root root    0 Oct 15 18:05 power
lrwxrwxrwx 1 root root    0 Oct 15 16:49 subsystem -> ../../../bus/platform
-rw-r--r-- 1 root root 4096 Oct 15 18:05 uevent

So the bug is that every card is assumed to have a /vendor file, which isn't the case for simple-framebuffer.
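
A minimal sketch of one way around this, assuming a bounded scan is acceptable: tell apart a card directory that is absent from one that exists but has no vendor file, and skip either case instead of ending the loop. The names probe_card and count_cards_with_vendor and the max_cards bound are hypothetical and not taken from GameMode; drm_root is a parameter only so the logic can be tried against a fake tree rather than /sys/class/drm.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch, not GameMode's enum. */
enum card_status { CARD_ABSENT, CARD_NO_VENDOR, CARD_OK };

/* Probe <drm_root>/card<index>; on CARD_OK, *vendor holds the parsed ID. */
enum card_status probe_card(const char *drm_root, int index, long *vendor)
{
	char path[512];
	struct stat st;

	snprintf(path, sizeof(path), "%s/card%d", drm_root, index);
	if (stat(path, &st) != 0)
		return CARD_ABSENT;

	snprintf(path, sizeof(path), "%s/card%d/device/vendor", drm_root, index);
	FILE *file = fopen(path, "r");
	if (!file)
		return CARD_NO_VENDOR; /* e.g. a simple-framebuffer device */

	char buf[64] = { 0 };
	char *line = fgets(buf, sizeof(buf), file);
	fclose(file);
	if (!line)
		return CARD_NO_VENDOR;

	*vendor = strtol(buf, NULL, 0); /* base 0 accepts "0x10de" */
	return CARD_OK;
}

/* Count cards with the wanted vendor in [0, max_cards), skipping gaps
 * and vendor-less cards instead of stopping at the first failure. */
int count_cards_with_vendor(const char *drm_root, long wanted, int max_cards)
{
	int found = 0;
	for (int i = 0; i < max_cards; i++) {
		long vendor = 0;
		if (probe_card(drm_root, i, &vendor) == CARD_OK && vendor == wanted)
			found++;
	}
	return found;
}
```

With a tree like the one above (card0 without a vendor file, a later card with one), the scan keeps going past card0 rather than reporting that no GPU exists.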

pmarks-net avatar Oct 15 '24 22:10 pmarks-net

Implementing #364 would've been nice.

elsandosgrande avatar Jan 17 '25 21:01 elsandosgrande

So this bug is difficult to fix? It's been 7 months.

GreatBigWhiteWorld avatar Feb 01 '25 16:02 GreatBigWhiteWorld

It is parsed correctly. However, the device vendor file fails to open even in read-only mode, so the error message is misleading.

A service can run with restricted permissions due to security mechanisms such as SELinux, AppArmor, or restricted capabilities, which may block access even in read-only mode.

	LOG_MSG("Checking vendor file at %s\n", path);
	FILE *file = fopen(path, "r"); /* <<<< fails here */
	if (!file) {
		LOG_ERROR("Couldn't open vendor file at %s, will not apply gpu optimisations!\n", path);
		return Vendor_Invalid;
	}

Running the test with sudo makes it succeed, but running as root brings its own unpredictable behavior.
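
To make the log less misleading, the failure reason could be reported alongside the path. This is only a sketch: open_vendor_file and vendor_failure_hint are hypothetical helpers, not GameMode functions, and the hint strings are made up for illustration. It relies on fopen setting errno on failure, which POSIX specifies.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a vendor file read-only and hand back the
 * errno value from a failed fopen (0 on success). */
FILE *open_vendor_file(const char *path, int *err_out)
{
	errno = 0;
	FILE *file = fopen(path, "r");
	*err_out = file ? 0 : errno;
	return file;
}

/* Hypothetical helper: map the errno value to a hint for the log. */
const char *vendor_failure_hint(int err)
{
	switch (err) {
	case 0:
		return "opened";
	case ENOENT:
		return "no vendor file: card missing or not a PCI device";
	case EACCES:
	case EPERM:
		return "permission denied: check sandboxing or security policy";
	default:
		return strerror(err);
	}
}
```

A caller could then log the hint next to the path, so a missing file and a blocked read no longer produce the same message.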

house-intellect avatar Feb 09 '25 23:02 house-intellect

Just to say that I'm also affected by this but in a slightly different case, I have no /sys/class/drm/card0. On my laptop, I disabled the integrated GPU and only the Nvidia RTX 3070 is enabled. On Fedora, I see only one card and it is numbered as 1:

ls /sys/class/drm
card1       card1-eDP-1     renderD128
card1-DP-1  card1-HDMI-A-1  version
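
For a layout like this, the card indices could be discovered from the directory listing instead of assumed to start at 0. A rough sketch, with hypothetical names (parse_card_index, lowest_card_index) that are not GameMode's: entries such as card1-eDP-1 are connectors, so only names of the exact form card<N> count.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Return N for an entry named exactly "card<N>", or -1 for anything else
 * (connectors like "card1-eDP-1", "renderD128", "version"). */
int parse_card_index(const char *name)
{
	if (strncmp(name, "card", 4) != 0 || name[4] < '0' || name[4] > '9')
		return -1;

	char *end = NULL;
	long index = strtol(name + 4, &end, 10);
	if (*end != '\0' || index > 1024) /* arbitrary sanity bound */
		return -1;
	return (int)index;
}

/* Return the lowest card index found under drm_root, or -1 if none. */
int lowest_card_index(const char *drm_root)
{
	DIR *dir = opendir(drm_root);
	if (!dir)
		return -1;

	int lowest = -1;
	struct dirent *entry;
	while ((entry = readdir(dir)) != NULL) {
		int index = parse_card_index(entry->d_name);
		if (index >= 0 && (lowest < 0 || index < lowest))
			lowest = index;
	}
	closedir(dir);
	return lowest;
}
```

Run against the listing above, only the card1 entry would be accepted as a card.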

TheWall89 avatar Feb 16 '25 22:02 TheWall89

Why hasn't this been fixed yet?

rocketguedes avatar Apr 18 '25 19:04 rocketguedes

Just encountered this problem; it seems very odd. Does anyone know if it's possible to work around it with a symlink? ln -s does not seem to work.

love2kick avatar Jun 20 '25 21:06 love2kick