Worker can't find iHD_drv_video.so
Describe the bug
When trying to play a transcoded video via a worker, the video fails to play. Worker logs indicate it cannot find iHD_drv_video.so. When I disable ClusterPlex and just use my "normal" PMS pod, HW transcoding works fine.
Intel GPU drivers are installed via Intel device plugins Helm chart: https://intel.github.io/helm-charts/
The same issue happens whether I use the standard Plex image with DOCKER_MODS or the ClusterPlex image.
Relevant log file for worker:
[AVHWDeviceContext @ 0x7fa6496df6c0] libva: VA-API version 1.18.0
[AVHWDeviceContext @ 0x7fa6496df6c0] libva: Trying to open /config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64/iHD_drv_video.so
[AVHWDeviceContext @ 0x7fa6496df6c0] libva: va_openDriver() returns -1
[AVHWDeviceContext @ 0x7fa6496df6c0] libva: Trying to open /config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64/i965_drv_video.so
[AVHWDeviceContext @ 0x7fa6496df6c0] libva: va_openDriver() returns -1
[AVHWDeviceContext @ 0x7fa6496df6c0] Failed to initialise VAAPI connection: -1 (unknown libva error).
Device creation failed: -5.
Failed to set value 'vaapi=vaapi:/dev/dri/renderD128' for option 'init_hw_device': I/O error
Error parsing global options: I/O error
Completed transcode
Removing process from taskMap
The /config/Library/Application Support/ folder is empty, which explains why it can't find the driver. I tried placing the driver I pulled off the Plex server into the codecs PV, but it made no difference.
Environment
K3s v1.26.5+k3s1. Nodes are Beelink U59s with an Intel N5105 processor.
Is that with the worker having the FFMPEG_HWACCEL environment variable set to "vaapi"?
Yes, it is. Here's the relevant ConfigMap:
```yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: clusterplex-worker-config
  namespace: media-tools
  labels:
    app.kubernetes.io/name: clusterplex-worker-config
    app.kubernetes.io/part-of: plex
data:
  TZ: America/Toronto
  PGID: '1000'
  PUID: '1000'
  VERSION: docker
  DOCKER_MODS: 'ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest'
  ORCHESTRATOR_URL: 'http://clusterplex-orchestrator:3500'
  LISTENING_PORT: '3501'
  STAT_CPU_INTERVAL: '10000'
  EAE_SUPPORT: '1'
  FFMPEG_HWACCEL: 'vaapi'
```
This issue is stale because it has been open for 30 days with no activity.
I'm having the same issue.
Logging into the container, it looks like Plex isn't "fully installed": there should be a cache with the extensions in those folders. See this Reddit discussion, which describes the same error: https://www.reddit.com/r/PleX/comments/12ikwup/plex_docker_hardware_transcoding_issue/
What's odd to me is that local transcoding works; it's only on the remote workers that it fails.
@kenlasko @pabloromeo OK, I got it working. The clue was the fact that Plex didn't have its config directory set up on the worker nodes. Plex needs its configuration; otherwise it's going to fail because Plex basically isn't set up. Here's how I fixed it:
- Change the `clusterplex-config-pvc` PVC to `ReadWriteMany`
- Add the `config` mount to the `clusterplex-worker` StatefulSet, just as is already done with the PMS deployment.
Here's what my two files look like, though yours will look different depending on storage.
Clusterplex-worker
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: clusterplex-worker
  labels:
    app.kubernetes.io/name: clusterplex-worker
    app.kubernetes.io/part-of: clusterplex
spec:
  serviceName: clusterplex-worker-service
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: clusterplex-worker
      app.kubernetes.io/part-of: clusterplex
  template:
    metadata:
      labels:
        app.kubernetes.io/name: clusterplex-worker
        app.kubernetes.io/part-of: clusterplex
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    name: clusterplex-worker
                topologyKey: kubernetes.io/hostname
              weight: 100
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    name: clusterplex-pms
                topologyKey: kubernetes.io/hostname
              weight: 50
      containers:
        - name: plex-worker
          image: lscr.io/linuxserver/plex:latest
          startupProbe:
            httpGet:
              path: /health
              port: 3501
            failureThreshold: 40
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3501
            initialDelaySeconds: 60
            timeoutSeconds: 5
          livenessProbe:
            httpGet:
              path: /health
              port: 3501
            initialDelaySeconds: 10
            timeoutSeconds: 10
          ports:
            - name: worker
              containerPort: 3501
          envFrom:
            - configMapRef:
                name: clusterplex-worker-config
          volumeMounts:
            - name: data
              mountPath: /data
            - name: codecs
              mountPath: /codecs
            - name: data
              mountPath: /transcode
            - name: config
              mountPath: /config
          resources: # adapt requests and limits to your needs
            requests:
              cpu: 500m
              memory: 200Mi
            limits:
              gpu.intel.com/i915: 1
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: "plex-media"
        - name: config
          persistentVolumeClaim:
            claimName: "clusterplex-config-pvc"
        # - name: transcode
        #   persistentVolumeClaim:
        #     claimName: "plex-media"
  volumeClaimTemplates:
    - metadata:
        name: codecs
        labels:
          app.kubernetes.io/name: clusterplex-codecs-pvc
          app.kubernetes.io/part-of: clusterplex
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
        # specify your storage class
        storageClassName: longhorn
```
clusterplex-config-pvc
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clusterplex-config-pvc
  labels:
    app.kubernetes.io/name: clusterplex-config-pvc
    app.kubernetes.io/part-of: clusterplex
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: "10Gi"
  # specify your storage class
  storageClassName: longhorn
```
I see! Yeah, the fact that Plex is not set up on the workers is actually intentional. It shouldn't really be necessary, since the intention is to only use the Plex transcoder (their fork of FFmpeg) without interacting with the local Plex files. We use their base image to avoid redistributing their transcoder ourselves, but Plex doesn't actually run on the worker. It's odd that it wants to use drivers from Plex's cache instead of the ones you installed on the node.
The reason we don't recommend sharing Plex's config that way, via network shares, is that Plex uses SQLite as its database, which does not play well with network shares, and Longhorn's RWX is implemented with NFS behind the scenes. You might end up corrupting the database or seeing odd issues.
Maybe you can mount JUST the cache location to avoid any DB corruption, meaning just sharing /config/Library/Application Support/Plex Media Server/Cache/ or /config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64/.
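As a sketch of that idea (the volume and claim names here are hypothetical; the claim would need to be ReadWriteMany and also mounted by PMS), the worker spec could mount only the Cache directory:

```yaml
# Hypothetical worker volume setup: share only Plex's Cache directory,
# leaving the rest of /config (including the SQLite databases) local to PMS.
volumeMounts:
  - name: plex-cache
    mountPath: /config/Library/Application Support/Plex Media Server/Cache
volumes:
  - name: plex-cache
    persistentVolumeClaim:
      claimName: plex-cache-pvc   # hypothetical RWX claim, also mounted by PMS
```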
I'll see if I can set up a physical environment similar to yours to find a way around that. Maybe driver paths must be rewritten or something like that. I know others are running it with Intel drivers on k8s, but I'm not aware whether they needed this same workaround.
@pabloromeo excellent, I've been thinking about potential issues with my setup and what you've said makes sense. I'll try to see if I can do just the cache.
I mounted Plex config in a different directory, then exec'd into the container and copied just the cache. No go, it throws errors.
[AVHWDeviceContext @ 0x7fdfdb7b2980] libva: Trying to open /config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64/iHD_drv_video.so
[AVHWDeviceContext @ 0x7fdfdb7b2980] libva: va_openDriver() returns -1
[AVHWDeviceContext @ 0x7fdfdb7b2980] libva: Trying to open /config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64/i965_drv_video.so
[AVHWDeviceContext @ 0x7fdfdb7b2980] libva: va_openDriver() returns -1
[AVHWDeviceContext @ 0x7fdfdb7b2980] Failed to initialise VAAPI connection: -1 (unknown libva error).
Device creation failed: -5.
Failed to set value 'vaapi=vaapi:/dev/dri/renderD128' for option 'init_hw_device': I/O error
Error parsing global options: I/O error
Completed transcode
Removing process from taskMap
After that, I copied everything from the temp folder and hardware transcoding works fine.
We might actually be running into Plex requiring a premium subscription and a claim token to run HW transcoding.
Another variant I tried was adding the Plex config as read-only; unfortunately, the workers then can't start because they can't run the fix-permissions scripts that run on startup.
I'm doing a Helm chart deployment and ran into this issue. I had already customized the charts to pass the HW transcoding variable to the workers via env in the config, so I also customized them to include the config mount, and it no longer errors. I'm not too knowledgeable about editing Helm charts or Plex, but what if we made the directory or files containing the SQLite DBs mount read-only?
Hello, I just started using this and came across this issue while verifying settings for HW Transcode on my NUC cluster.
Thanks for finding this issue before I experienced it :)
@todaywasawesome, I noticed the iHD_drv_video.so you referenced wasn't actually in Plex Media Server/Cache, but symlinked there from Plex Media Server/Drivers/imd-74-linux-x86_64/dri/iHD_drv_video.so.
To share the Cache and Drivers folders with the workers as read-only, while excluding the rest of the config so as not to disturb the DB, I have:
- Left the existing config PVC as ReadWriteOnce and NOT mounted it to the workers
- Created additional tiny PVCs for Cache and Drivers, mounted on the PMS and worker containers in the appropriate locations, read-only on the worker nodes. 1Gi would be overkill, but I did 5Gi just in case.
Additional Cache and Driver PVC
```yaml
---
# cluster-plex_cache-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clusterplex-cache-pvc
  namespace: plex-ns
  labels:
    app.kubernetes.io/name: clusterplex-cache-pvc
    app.kubernetes.io/part-of: clusterplex
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: longhorn
---
# cluster-plex_drivers-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clusterplex-drivers-pvc
  namespace: plex-ns
  labels:
    app.kubernetes.io/name: clusterplex-drivers-pvc
    app.kubernetes.io/part-of: clusterplex
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: longhorn
```
Worker (PMS is the same, minus the `readOnly: true` on the spec.volumes):
```yaml
containers:
  - name: plex-worker
    image: lscr.io/linuxserver/plex:latest
    startupProbe:
      httpGet:
        path: /health
        port: 3501
      failureThreshold: 40
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health
        port: 3501
      initialDelaySeconds: 60
      timeoutSeconds: 5
    livenessProbe:
      httpGet:
        path: /health
        port: 3501
      initialDelaySeconds: 10
      timeoutSeconds: 10
    ports:
      - name: worker
        containerPort: 3501
    envFrom:
      - configMapRef:
          name: clusterplex-worker-config
    volumeMounts:
      - name: media
        mountPath: /mnt/media
      - name: codecs
        mountPath: /codecs
      - name: transcode
        mountPath: /transcode
      - name: cache
        mountPath: /config/Library/Application Support/Plex Media Server/Cache
      - name: drivers
        mountPath: /config/Library/Application Support/Plex Media Server/Drivers
    resources: # adapt requests and limits to your needs
      requests:
        cpu: 500m
        memory: 200Mi
        gpu.intel.com/i915: "1"
      limits:
        cpu: 2000m
        memory: 2Gi
        gpu.intel.com/i915: "1"
volumes:
  - name: media
    nfs:
      path: /mediastuff
      server: myserver.example.local
  - name: transcode
    persistentVolumeClaim:
      claimName: "clusterplex-transcode-pvc"
  - name: codecs
    persistentVolumeClaim:
      claimName: "clusterplex-codec-pvc"
  - name: cache
    persistentVolumeClaim:
      claimName: "clusterplex-cache-pvc"
      readOnly: true
  - name: drivers
    persistentVolumeClaim:
      claimName: "clusterplex-drivers-pvc"
      readOnly: true
```
Folders mounted inside the worker, with a touch test to verify read-only:
root@clusterplex-worker-0:/# ls -al /config/Library/Application\ Support/Plex\ Media\ Server/
total 10
drwxr-xr-x 4 abc abc 4096 Sep 11 13:43 .
drwxr-xr-x 3 abc abc 4096 Sep 11 13:43 ..
drwxrwxrwx 8 abc abc 1024 Sep 11 13:54 Cache
drwxrwxrwx 3 abc abc 1024 Sep 11 13:43 Driver
root@clusterplex-worker-0:/# touch /config/Library/Application\ Support/Plex\ Media\ Server/Cache/test
touch: cannot touch '/config/Library/Application Support/Plex Media Server/Cache/test': Read-only file system
Remote VAAPI Transcode Success:
JobPoster connected, announcing
Orchestrator requesting pending work
Sending request to orchestrator on: http://clusterplex-orchestrator:3500
Remote Transcoding was successful
Calling external transcoder: /app/transcoder.js
ON_DEATH: debug mode enabled for pid [1977]
Local Relay enabled, traffic proxied through PMS local port 32499
Setting VERBOSE to ON
Sending request to orchestrator on: http://clusterplex-orchestrator:3500
cwd => "/transcode/Transcode/Sessions/plex-transcode-ba2f8489-11e0-4fab-b08d-31f4b42686ae-6c51bcab-01cf-4780-b61e-b99f21fb343a"
args =>
....BLAHBLAHBLAHBLAH...
"LIBVA_DRIVERS_PATH":"/config/Library/Application Support/Plex Media Server/Cache/va-dri-linux-x86_64"
...BLAHBLAHBLAHBLAH...
FFMPEG_HWACCEL":"vaapi"
...BLAHBLAHBLAH...
"FFMPEG_EXTERNAL_LIBS":"/config/Library/Application\\ Support/Plex\\ Media\\ Server/Codecs/8217c1c-4578-linux-x86_64/","TRANSCODER_VERBOSE":"1"}
Hope this helps
@audiophonicz that's an extremely clever approach, love it! :)
Now, I've finally set up a similar environment to test this, have been seeing the same issue, and have been trying to identify a few workarounds. I believe there may be a problem with this approach of depending on data from the main PMS: it would only work if the machine running PMS also has the same hardware, meaning an Intel iGPU as well. It seems that Plex creates the contents of its Drivers directory during initialization, based on the hardware available.
If that's the case, there may be one other alternative approach that doesn't depend on sharing Drivers and the Cache between PMS and the workers: initialize PMS on the workers at startup and then kill it once the local config has been created (I believe the linuxserver image does something along those lines too), so that the drivers for the worker's hardware are downloaded.
I've tried it manually and it appears to work; however, we have to be careful, as we can only do this if the config is NOT being shared with the main PMS. I'm guessing it could destroy or corrupt the real config, so this only applies to a standalone worker that is not sharing configs as shown above.
If this works out, I may add an optional parameter to force a PMS initialization on the workers, but the default will be to not do it, to avoid breaking working installations like the ones mentioned above.
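A rough sketch of what that init-then-kill step could look like as an init container (this is my own assumption, not the eventual ClusterPlex parameter; the PMS binary path follows the linuxserver.io image layout, and the readiness loop timing is a guess):

```yaml
# Hypothetical sketch only: warm up PMS once on a *standalone* worker whose
# config is NOT shared with the main PMS, so it downloads Drivers for this
# node's hardware, then exits.
initContainers:
  - name: pms-warmup
    image: lscr.io/linuxserver/plex:latest
    command:
      - /bin/sh
      - -c
      - |
        "/usr/lib/plexmediaserver/Plex Media Server" &
        pms_pid=$!
        # Wait (up to ~2 minutes, an assumed timeout) for the Drivers dir.
        for i in $(seq 1 60); do
          [ -d "/config/Library/Application Support/Plex Media Server/Drivers" ] && break
          sleep 2
        done
        kill "$pms_pid" 2>/dev/null || true
    volumeMounts:
      - name: config   # the worker-local config volume, not the shared PMS one
        mountPath: /config
```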
Now, a question for you @audiophonicz and @todaywasawesome: when HW transcoding on the worker with your working setups, does Plex show that it's being transcoded in HW, or is it oblivious to it? In my initial test it just says "Transcode", not "Transcode (hw)".
@pabloromeo It's been "Transcode (hw)" for me, making sure to mount the needed hardware of course.
I do have a concern that it might be limited by license. HW transcoding is a premium feature, so if Plex doesn't initialize with a premium account, it wouldn't enable HW transcoding. Might be able to use a claim token.
So, weird update: my method works, but ONLY if the worker container is on the same physical node as the PMS container. There's no difference in the logs until it actually connects and starts to stream; then the remote workers simply kill the child process. I can even see the tile flash up in the PMS dashboard for half a second before it disappears and tries another worker. When it finally gets to the worker on the same physical node, the logs pick up from "segment:chunk-00000" and it starts playing.
[tcp @ 0x7ff2039fd440] Successfully connected to 10.10.2.20 port 32499
[AVIOContext @ 0x7ff203887cc0] Statistics: 57 bytes written, 0 seeks, 1 writeouts
[segment @ 0x7ff20d4356c0] segment:'chunk-00000' starts with packet stream:0 pts:274024 pts_time:274.024 frame:0
Killing child processes for task 35326182-1edb-49d9-86a4-9079d2e90e3d
Removing process from taskMap
@todaywasawesome can you confirm you can transcode on a worker container on a different physical node than PMS when sharing the entire config? I'm thinking you're right about the Plex Pass thing, and mine is matching the IP or something and only allowing it on the same node.
@pabloromeo
Yes, I have 6 identical nodes, so I was counting on PMS downloading the driver for my workers. Your quick-init approach might be a better direction, but if server config and the existence of Plex Pass are indeed interfering with HW transcoding on the remote workers, then a driver download alone might not work.
Also, while transcoding on Worker-1
Can you check the logs on the workers? That might shed some light on what's going on.
Regarding Plex Pass, it's hard to say how they validate it. The X_PLEX_TOKEN should be reaching the worker, and I believe it gets validated by a callback to PMS (through the relay), unless something within that flow is broken. But without errors in the logs it's quite difficult to identify. Maybe enable debug logging in Plex itself and watch the messages in its UI console.
I'll share my logs soon. My cluster is down due to ISP issues at the moment.
TL;DR: I got remote HW transcoding working pretty reliably by flipping my original workaround: give the workers the entire /config PVC without readOnly (so far), but sub-mount the /Plugin Support/ dir (with the databases and whatnot) as a separate ReadWriteOnce PVC on only the PMS container. One thing I still have to work out is the pid file overwrite.
Long: OK, so some weird stuff happened after my last post. Sixty seconds after I commented, one of the workers (on another node) got stuck and was the only worker being used, but HW transcodes not only worked, they were damn near instant. Unfortunately, after restarting that pod all that went away, but it led me to the other issue I opened about transcode processes not stopping.
Anyway, I made some progress on my remote HW transcodes. Providing just the drivers for HW transcode doesn't seem to be enough, as it would only work on the same machine as my PMS pod. Seeing that it seemed to work for todaywasawesome by sharing the whole config dir, which happens to contain a token file and the Preferences.xml with the machine ID UUID, I tried his method, and was riddled with "SQLite db slow; waiting" or similar logs. So I flipped my original method and created a single additional PVC just for the databases in the /Plugin Support/ folder, essentially carving them out of the main /config folder, and it seems to have worked.
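A sketch of that carve-out (claim names here are hypothetical, and the overlay path uses Plex's "Plug-in Support" directory name): the whole config sits on the shared RWX claim, while a separate RWO claim holding the databases is overlaid on the PMS container only:

```yaml
# Hypothetical PMS-side mounts: the shared RWX config, with a dedicated RWO
# claim overlaid on the database folder so SQLite never lives on the network
# share the workers read from.
volumeMounts:
  - name: config
    mountPath: /config
  - name: plugin-support
    mountPath: /config/Library/Application Support/Plex Media Server/Plug-in Support
volumes:
  - name: config
    persistentVolumeClaim:
      claimName: clusterplex-config-pvc   # ReadWriteMany, also mounted by workers
  - name: plugin-support
    persistentVolumeClaim:
      claimName: clusterplex-db-pvc       # hypothetical ReadWriteOnce claim, PMS only
```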
I am currently running 7 plays simultaneously across 3 workers: 3x direct play HEVC10, 3x HEVC10 SW decode > H264 HW encode, 1x HEVC8 HW decode > H264 HW encode.
I apparently have a bunch of devices that can't HW decode HEVC10, and it really pushes my little i3-6100U nodes, so they take a good 30-45 seconds to start playing, but it does work. Every now and then one play will freeze or fail and need to retry (pretty sure it's HEVC10 wreaking havoc), but for the most part auto-play-next and seeks are working as well. 99% of my stuff is H264; I only found one title each with HEVC8 and HEVC10, so I should be good with this setup.
I do still want to try to separate out the pid file so the workers aren't constantly deleting and overwriting each other's pid file. It doesn't seem to hurt right now, but it's not optimal.
The workaround I tried is copying the folder over manually from a temp config directory to the config directory. That way the worker can do whatever it wants with the local DB; it's trashed anyway.
Still not great.
OK guys, I need some insight here. I still, for the life of me, can't get a worker to play if it's not on the same node as PMS. It's driving me mad.
The weird thing is, if both PMS and Worker-0 are on NODE1, direct plays will direct play and transcodes will transcode, HW or SW; life is good.
If I simply move PMS to NODE2 while Worker-0 is on NODE1, all plays break. Direct plays try to transcode, and all transcodes fail. It's not the /config dir. It's not the /transcode or /codecs RWX speeds. It's purely whether they're on the same host or not, and I can't figure out what it's using.
My only idea left is that the transcode job is using http://127.0.0.1 for the video transcode sessions and it's not translating across pods/nodes:
[Req#745a/Transcode/JobRunner] Job running: FFMPEG_EXTERNAL_LIBS='/config/Library/Application\ Support/Plex\ Media\ Server/Codecs/8217c1c-4578-linux-x86_64/' X_PLEX_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx "/usr/lib/plexmediaserver/Plex Transcoder" -codec:0 mp3 -analyzeduration 20000000 -probesize 20000000 -i "/config/Library/Application Support/Plex Media Server/Metadata/TV Shows/3/d5dad9b0d635ffd439712c5dfd135b86a523101.bundle/Contents/_combined/themes/tv.plex.agents.series_e6fccc112eb130590ea2d245d869fedce8d276e9" -filter_complex "[0:0] aresample=async=1:ochl='stereo':rematrix_maxval=0.000000dB:osr=48000:rematrix_volume=-25.000000dB[0]" -map "[0]" -codec:0 libmp3lame -q:0 0 -f segment -segment_format mp3 -segment_time 1 -segment_header_filename header -segment_start_number 0 -segment_list "http://127.0.0.1:32400/video/:/transcode/session/3718081b-027b-4a0f-b1a1-fb99766945bf-64/cce15ce6-6af7-46fd-abbf-dd42e5e4609b/manifest?X-Plex-Http-Pipeline=infinite" -segment_list_type csv -segment_list_unfinished 1 -segment_list_size 5 -segment_list_separate_stream_times 1 -map_metadata -1 -map_chapters -1 "chunk-%05d" -y -nostats -loglevel quiet -loglevel_plex error -progressurl http://127.0.0.1:32400/video/:/transcode/session/3718081b-027b-4a0f-b1a1-fb99766945bf-64/cce15ce6-6af7-46fd-abbf-dd42e5e4609b/progress
> My only idea left is that the transcode job is using http://127.0.0.1 for the video transcode sessions and it's not translating across pods/nodes:
Plex definitely uses a loopback network for transcodes. On my FreeBSD Plex jail, if I don't give it a loopback address, direct plays are fine but transcodes fail (regardless of whether it needs to transcode audio or video). The address I give it is not 127.0.0.1, but it finds it okay.
If direct plays aren't working for you, I'm not sure this is the same problem, but it very well might be. Also, maybe the direct play you tested was transcoding audio?
> OK guys, I need some insight here. I still, for the life of me, can't get a worker to play if it's not on the same node as PMS. It's driving me mad.
Honestly, I think this is probably a different issue, perhaps Plex network configuration. This issue is just about hardware transcoding failing; if you're not getting workers to transcode at all, that's a more fundamental problem.
> Plex definitely uses a loopback network for transcodes. On my FreeBSD Plex jail, if I don't give it a loopback address, direct plays are fine but transcodes fail (regardless of whether it needs to transcode audio or video). The address I give it is not 127.0.0.1, but it finds it okay.
Thank you for your reply, but my question is specifically about HW transcoding across physically separate Kubernetes nodes, and I'm not sure how a FreeBSD jail pertains. I don't see anywhere in this chart for transcode network settings, so I'm not sure what this "it" is that you're giving a loopback address.
I'm still looking for someone who has HW transcoding working across two physically separate nodes, and what their Plex network settings are for subnets and URL.
Sorry for the confusion; the long and short of it is yes, that's where Plex communicates with the transcoder. The transcoder stub here remaps that to a different container, and the nginx proxy passes it back in.
If direct play and SW transcoding are also failing, your issue isn't really about HW transcoding; something else is broken in the orchestration of the transcoder requests.
Same issue here (Dockermod on unprivileged LXC on Proxmox).
Mounting /config/Library/Application Support/Plex Media Server/Cache and /config/Library/Application Support/Plex Media Server/Drivers inside the workers did the trick.
Thanks!
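For a plain Docker setup, the same trick might look like this compose sketch (host paths are assumptions, and the worker here is the linuxserver image with the ClusterPlex worker docker mod applied, as in the ConfigMap above):

```yaml
# Hypothetical docker-compose excerpt: expose PMS's Cache and Drivers folders
# to the worker read-only, without sharing the rest of /config.
services:
  plex:
    image: lscr.io/linuxserver/plex:latest
    volumes:
      - ./plex-config:/config
  plex-worker:
    image: lscr.io/linuxserver/plex:latest
    environment:
      - DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest
    volumes:
      - "./plex-config/Library/Application Support/Plex Media Server/Cache:/config/Library/Application Support/Plex Media Server/Cache:ro"
      - "./plex-config/Library/Application Support/Plex Media Server/Drivers:/config/Library/Application Support/Plex Media Server/Drivers:ro"
```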
Remapping just Drivers and Cache as RWX across PMS and the workers fixed this issue for me.
Here to report a different setup that suffers from the same issue:
- NAS host with transcode and media shares exposed over NFS
- Separate host running a docker-compose stack of one PMS instance, one worker, and one orchestrator (no Swarm)
- Transcode and media directories mounted over NFS as instructed (read and write)
- Worker HW transcode fails (Intel iGPU), while "local" HW transcode succeeds (same physical Intel iGPU)
This issue was closed because it has been inactive for 14 days since being marked as stale.