Distributed single-file transcoding
**Is your feature request related to a problem? Please describe.**
It's possible that the individual nodes in the cluster may not be powerful enough to transcode 4K video in real time. Maybe a single node can only transcode 5 seconds of "video time" in 10 seconds of "real time", not enough to keep up with continuous playback.
**Describe the solution you'd like**
Would be great if a single transcode job could be "chunked" and distributed amongst the cluster. For example, the video could be split into 5-second chunks, each sent to a node to transcode and then recombined by the orchestrator. In the example above, three nodes working together would be able to transcode 15 seconds of "video time" in 10 seconds of "real time", which is sufficient to enable continuous playback.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Would it be worthwhile to configure the exempt-issue-labels here:
- https://github.com/pabloromeo/clusterplex/blob/master/.github/workflows/issues.yml#L14-L22
The label could then be applied to this issue to prevent it auto-closing.
(I would also not be offended if you allow the issue to close as wont-do. I am not in terrible need of this feature, so please don't feel the need to keep it open for my sake 😊)
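For concreteness, a sketch of what that change could look like, assuming the workflow uses the standard actions/stale action. The 30/14-day values match the bot messages in this thread; the `keep-open` label name is only a placeholder, not something defined in the repo:

```yaml
# .github/workflows/issues.yml (sketch, not the repo's exact file)
name: Close stale issues
on:
  schedule:
    - cron: "30 1 * * *"

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-issue-stale: 30
          days-before-issue-close: 14
          stale-issue-message: "This issue is stale because it has been open for 30 days with no activity."
          close-issue-message: "This issue was closed because it has been inactive for 14 days since being marked as stale."
          # Issues carrying this label are never marked stale or auto-closed.
          exempt-issue-labels: "keep-open"
```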
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
If I'm not mistaken, the env setting STREAM_SPLITTING still exists? Does this setting not work, or is it very unstable?
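For reference, a minimal compose-style sketch of turning that flag on. Only the STREAM_SPLITTING variable name and port 3500 come from this thread; the image reference and the "ON" value convention are assumptions, so check the clusterplex docs for the authoritative names:

```yaml
# Sketch of an orchestrator service with stream splitting enabled.
services:
  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest  # assumed image name
    environment:
      STREAM_SPLITTING: "ON"   # assumed value; produces the "Stream-Splitting: ENABLED" log line below
    ports:
      - "3500:3500"            # matches "Server listening on port 3500" in the log below
```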
So I tried to implement it and it somewhat works, but maybe @pabloromeo could point me in the right direction. From what I can tell it is able to send tasks to different servers, but the last sent tasks just stay on server3? My fork is here: https://github.com/FelixClements/clusterplex
plex_orchestrator.1.j2cqglzqo0i6@server02 | ON_DEATH: debug mode enabled for pid [1]
plex_orchestrator.1.j2cqglzqo0i6@server02 | Initializing orchestrator
plex_orchestrator.1.j2cqglzqo0i6@server02 | Using Worker Selection Strategy: LOAD_TASKS
plex_orchestrator.1.j2cqglzqo0i6@server02 | Stream-Splitting: ENABLED
plex_orchestrator.1.j2cqglzqo0i6@server02 | Setting up websockets
plex_orchestrator.1.j2cqglzqo0i6@server02 | Ready
plex_orchestrator.1.j2cqglzqo0i6@server02 | Server listening on port 3500
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: cjB34ZUtXt7VccYwAAAB
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registering worker 3914952c-8cf3-42ce-b006-1a4fd63f492a|plex-worker-server04
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new worker: 3914952c-8cf3-42ce-b006-1a4fd63f492a|plex-worker-server04
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: QiHgJdawYKOyoNuyAAAD
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registering worker a9c36c4d-ffee-4cd3-8496-c4995836c664|plex-worker-server02
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new worker: a9c36c4d-ffee-4cd3-8496-c4995836c664|plex-worker-server02
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: 9yndKkzeOWSxHerZAAAF
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registering worker 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new worker: 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: svwp5Ag1FhX3iYbMAAAH
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registering worker a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new worker: a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: UAacu1rC39oBo0JRAAAJ
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new job poster: 9f11430a-852d-4710-b6d3-090d7eb64546|3d4396666091
plex_orchestrator.1.j2cqglzqo0i6@server02 | Creating multiple tasks for the job
plex_orchestrator.1.j2cqglzqo0i6@server02 | All Args => -codec:0,h264,-codec:1,ac3,-analyzeduration,20000000,-probesize,20000000,-i,/data/path/to/file,-analyzeduration,20000000,-probesize,20000000,-i,/transcode/Transcode/Sessions/plex-transcode-f34792abc0a049d0-com-plexapp-android-889da67e-f296-4229-9e3d-3a898993bc9a/temp-0.srt,-filter_complex,[0:0]scale=w=480:h=240:force_divisible_by=4[0];[0]format=pix_fmts=yuv420p|nv12[1],-map,[1],-metadata:s:0,language=eng,-codec:0,libx264,-crf:0,22,-maxrate:0,541k,-bufsize:0,1082k,-r:0,23.975999999999999,-preset:0,veryfast,-x264opts:0,subme=2:me_range=4:rc_lookahead=10:me=dia:no_chroma_me:8x8dct=0:partitions=none,-force_key_frames:0,expr:gte(t,n_forced*8),-filter_complex,[0:1] aresample=async=1:ochl='stereo':rematrix_maxval=0.000000dB:osr=48000[2],-map,[2],-metadata:s:1,language=eng,-codec:1,libopus,-b:1,135k,-map,1:s:0,-metadata:s:2,language=eng,-codec:2,ass,-strict_ts:2,0,-map,0:t?,-codec:t,copy,-segment_format,matroska,-f,ssegment,-individual_header_trailer,0,-flags,+global_header,-segment_header_filename,header,-segment_time,8,-segment_start_number,0,-segment_copyts,1,-segment_time_delta,0.0625,-segment_list,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/889da67e-f296-4229-9e3d-3a898993bc9a/manifest?X-Plex-Http-Pipeline=infinite,-segment_list_type,csv,-segment_list_size,5,-segment_list_separate_stream_times,1,-segment_list_unfinished,1,-segment_format_options,output_ts_offset=10,-max_delay,5000000,-avoid_negative_ts,disabled,-map_metadata:g,-1,-map_metadata:c,-1,-map_chapters,-1,media-%05d.ts,-start_at_zero,-copyts,-vsync,cfr,-y,-nostats,-loglevel,verbose,-loglevel_plex,verbose,-progressurl,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/889da67e-f296-4229-9e3d-3a898993bc9a/progress
plex_orchestrator.1.j2cqglzqo0i6@server02 | Args => segment_time: 8, ss: NaN, min_seg_duration: 10, skip_to_segment: NaN, segment_start_number: 0
plex_orchestrator.1.j2cqglzqo0i6@server02 | Queueing job b0d28a73-dfe1-4bce-8b78-d53c0f3be321
plex_orchestrator.1.j2cqglzqo0i6@server02 | Queueing task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02 | Running task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02 | Forwarding work request to a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: received
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: inprogress
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client connected: cDnq_-Qfd_HFFhiiAAAL
plex_orchestrator.1.j2cqglzqo0i6@server02 | Registered new job poster: 04e21e59-6039-422d-af98-0c388acec035|3d4396666091
plex_orchestrator.1.j2cqglzqo0i6@server02 | Creating multiple tasks for the job
plex_orchestrator.1.j2cqglzqo0i6@server02 | All Args => -codec:0,h264,-codec:1,ac3,-ss,224,-analyzeduration,20000000,-probesize,20000000,-i,/data/path/to/file,-ss,224,-analyzeduration,20000000,-probesize,20000000,-i,/transcode/Transcode/Sessions/plex-transcode-f34792abc0a049d0-com-plexapp-android-e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/temp-0.srt,-filter_complex,[0:0]scale=w=480:h=240:force_divisible_by=4[0];[0]format=pix_fmts=yuv420p|nv12[1],-map,[1],-metadata:s:0,language=eng,-codec:0,libx264,-crf:0,22,-maxrate:0,541k,-bufsize:0,1082k,-r:0,23.975999999999999,-preset:0,veryfast,-x264opts:0,subme=2:me_range=4:rc_lookahead=10:me=dia:no_chroma_me:8x8dct=0:partitions=none,-force_key_frames:0,expr:gte(t,n_forced*8),-filter_complex,[0:1] aresample=async=1:ochl='stereo':rematrix_maxval=0.000000dB:osr=48000[2],-map,[2],-metadata:s:1,language=eng,-codec:1,libopus,-b:1,135k,-map,1:s:0,-metadata:s:2,language=eng,-codec:2,ass,-strict_ts:2,0,-map,0:t?,-codec:t,copy,-segment_format,matroska,-f,ssegment,-individual_header_trailer,0,-flags,+global_header,-segment_header_filename,header,-segment_time,8,-segment_start_number,28,-segment_copyts,1,-segment_time_delta,0.0625,-segment_list,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/manifest?X-Plex-Http-Pipeline=infinite,-segment_list_type,csv,-segment_list_size,5,-segment_list_separate_stream_times,1,-segment_list_unfinished,1,-segment_format_options,output_ts_offset=10,-max_delay,5000000,-avoid_negative_ts,disabled,-map_metadata:g,-1,-map_metadata:c,-1,-map_chapters,-1,media-%05d.ts,-start_at_zero,-copyts,-y,-nostats,-loglevel,verbose,-loglevel_plex,verbose,-progressurl,http://server:32499/video/:/transcode/session/f34792abc0a049d0-com-plexapp-android/e71a4d30-c14b-49b4-8d5e-3e3eee1a39e5/progress
plex_orchestrator.1.j2cqglzqo0i6@server02 | Args => segment_time: 8, ss: 224, min_seg_duration: 10, skip_to_segment: NaN, segment_start_number: 28
plex_orchestrator.1.j2cqglzqo0i6@server02 | Creating segment 1
plex_orchestrator.1.j2cqglzqo0i6@server02 | Queueing job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb
plex_orchestrator.1.j2cqglzqo0i6@server02 | Queueing task 3e33a397-ab9e-4728-9260-f9df28eba93b
plex_orchestrator.1.j2cqglzqo0i6@server02 | Running task 3e33a397-ab9e-4728-9260-f9df28eba93b
plex_orchestrator.1.j2cqglzqo0i6@server02 | Forwarding work request to 6d099096-c2c4-4456-90da-9ec77316d8ab|plex-worker-server03
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: received
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: inprogress
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client disconnected: UAacu1rC39oBo0JRAAAJ
plex_orchestrator.1.j2cqglzqo0i6@server02 | Removing job-poster 9f11430a-852d-4710-b6d3-090d7eb64546|3d4396666091 from pool
plex_orchestrator.1.j2cqglzqo0i6@server02 | Killing job b0d28a73-dfe1-4bce-8b78-d53c0f3be321
plex_orchestrator.1.j2cqglzqo0i6@server02 | Telling worker a106d06b-f6c5-44c5-8e95-6ecd30c4aa7e|plex-worker-server01 to kill task a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02 | Job b0d28a73-dfe1-4bce-8b78-d53c0f3be321 killed
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task a23b4633-34ad-4007-b2a0-ef04802adc67, status: done
plex_orchestrator.1.j2cqglzqo0i6@server02 | Discarding task update for a23b4633-34ad-4007-b2a0-ef04802adc67
plex_orchestrator.1.j2cqglzqo0i6@server02 | Received update for task 3e33a397-ab9e-4728-9260-f9df28eba93b, status: done
plex_orchestrator.1.j2cqglzqo0i6@server02 | Task 3e33a397-ab9e-4728-9260-f9df28eba93b complete, result: false
plex_orchestrator.1.j2cqglzqo0i6@server02 | Task 3e33a397-ab9e-4728-9260-f9df28eba93b complete
plex_orchestrator.1.j2cqglzqo0i6@server02 | Job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb complete, tasks: 1, result: false
plex_orchestrator.1.j2cqglzqo0i6@server02 | JobPoster notified
plex_orchestrator.1.j2cqglzqo0i6@server02 | Removing job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb
plex_orchestrator.1.j2cqglzqo0i6@server02 | Job 496dcab9-60e0-473a-aed3-1d8b26a3fcdb complete
plex_orchestrator.1.j2cqglzqo0i6@server02 | Client disconnected: cDnq_-Qfd_HFFhiiAAAL
plex_orchestrator.1.j2cqglzqo0i6@server02 | Removing job-poster 04e21e59-6039-422d-af98-0c388acec035|3d4396666091 from pool
Hi! Awesome that you picked up that code and gave it a shot. I haven't looked at that stream-splitting part in years lol. Back then the motivation was that I was trying some crazy ideas, like distributing transcoding across an army of Raspberry Pi machines lol. Now, with things like Quick Sync readily available even on low-power Celeron machines, there isn't as much need for splitting a single transcode; the need is more for scaling horizontally to handle more simultaneous transcodes, while always handling each streaming task on a single worker at a time.
If I remember correctly the idea had a bit more merit in processes like Optimizing a movie, which could be done in parallel in segments and then stitched back together. Not sure I ever got working behavior for live streaming though, since you really need the segments to be ready in order, and they'd all be writing to the same shared manifest which would probably overwrite each other. I'd really need to know waaay more about how ffmpeg works to pull something like that off. But I'll happily try to help however I can if you want to take a stab at it.
Also, it's been years, so maybe somebody has already created something open source that does part of this, which we could leverage or use as a dependency. I haven't had time to look into it.
One thing I never got around to exploring was splitting the audio transcoding from the video transcoding, so the two could be done on separate nodes.
Hey!
Thank you so much for your detailed response! Your insights actually pointed me in the direction of hardware transcoding instead of stream splitting, which was immensely helpful.
I went ahead and looked into hardware transcoding solutions and finally got it working, though it was quite challenging to implement within Docker Swarm. I ended up spending quite some time configuring the right setup and managing resource constraints specific to hardware acceleration.
I'm really curious about your setup. How did you manage to get hardware transcoding working in your environment? Any tips or advice you could share would be greatly appreciated!
To get it to work I had to deploy this stack https://github.com/allfro/device-mapping-manager/tree/master and add these volumes to the server and workers (a fuller sketch follows after the list):
- /mnt/glusterfs/plex_server/Drivers:/config/Library/Application Support/Plex Media Server/Drivers
- /mnt/glusterfs/plex_server/Cache:/config/Library/Application Support/Plex Media Server/Cache
- /dev/dri/:/dev/dri/
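For context, a hedged sketch of how those mounts might look on a worker service in a swarm stack file. Only the three volume mappings come from the comment above; the service name, image reference, and deploy mode are placeholders, and the device-mapping-manager stack linked above still has to be deployed so /dev/dri is actually usable inside swarm tasks:

```yaml
# Sketch of a clusterplex worker service with the mounts listed above.
services:
  plex-worker:
    image: ghcr.io/pabloromeo/clusterplex_worker:latest   # assumed image name
    volumes:
      - /mnt/glusterfs/plex_server/Drivers:/config/Library/Application Support/Plex Media Server/Drivers
      - /mnt/glusterfs/plex_server/Cache:/config/Library/Application Support/Plex Media Server/Cache
      - /dev/dri/:/dev/dri/
    deploy:
      mode: global   # placeholder; e.g. run one worker per GPU-capable node
```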
Yeah, hardware transcoding on Docker Swarm is quite a bit more challenging. In my case I wasn't doing hardware transcoding when running on Swarm. However, I later migrated to Kubernetes, where hardware transcoding was a bit easier. Since I'm using Quick Sync and not NVIDIA GPUs, it required installing the Intel drivers on the cluster and managing the resource requests so that workers run on nodes with i915 devices, IIRC.
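To make that concrete, a hedged sketch of the kind of resource request this involves, assuming the Intel GPU device plugin for Kubernetes is installed (it advertises the `gpu.intel.com/i915` resource for i915-backed devices). The Deployment name, labels, and image are placeholders, not taken from an actual clusterplex manifest:

```yaml
# Sketch: a worker Deployment requesting an Intel iGPU via the Intel device plugin.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clusterplex-worker          # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clusterplex-worker
  template:
    metadata:
      labels:
        app: clusterplex-worker
    spec:
      containers:
        - name: worker
          image: ghcr.io/pabloromeo/clusterplex_worker:latest   # assumed image name
          resources:
            limits:
              gpu.intel.com/i915: 1   # schedules the pod onto a node exposing an i915 device
```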
Thanks for bringing up the topic of hardware transcoding with Docker Swarm and your experience with Kubernetes.
I actually faced some similar challenges with hardware transcoding as well. In my specific use case, I made some file changes in my fork of the project to prioritise Intel support.
If you're interested, feel free to check out my fork, where I’ve implemented these updates. You could maybe reference these changes or adapt them for your setup to possibly make the process easier for others wanting to get Intel hardware transcoding up and running in their environment.
Let me know if you need further details or clarification, I’d be happy to help!
Cheers
Hi Felix.
Can you give a bit more information as to which files to drop in the Drivers and Cache folders that you're mapping? I'm specifically looking into getting an Intel Arc 310 working with clusterplex in kubernetes, but no luck up until now.
Hi @Varashi, I did not need to drop any files in those folders. As far as I can see, the drivers needed for an iGPU and for your card are different, at least that is what Intel says here -> https://dgpu-docs.intel.com/driver/client/overview.html#installing-client-gpus-on-ubuntu-desktop-22-04-lts
If I remember correctly, the Plex server added the files on its own when it found the driver / device.
You could also try this -> https://dgpu-docs.intel.com/driver/client/overview.html#verifying-installation. I'm not sure if it was needed, but I also added my user to the render and video groups.
This issue is stale because it has been open for 30 days with no activity.
I've tried starting clusterplex with these folders all shared via an NFS export, but unfortunately did not manage to get hardware transcoding working. For the moment I'm using standalone plex, but I'll try an extra deployment and see if I can reproduce the errors.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.