How to enable MULTI_DEVICE_SAFE_MODE?
In vs-trt, how do I enable multi-device safe mode?
torch_tensorrt.runtime.multi_device_safe_mode
It was enabled by default in earlier versions, but since 10.2(?) it has been disabled by default, which makes it impossible to use my Volta and Turing cards together.
vs-trt does not rely on PyTorch at all, so I don't know what you mean by enabling it. You might be confusing it with HolyWu's plugins?
Speaking of multi-device inference, you should be able to do that in vs-trt using static scheduling.
When using multiple GPUs from different generations, it errors out with
ICudaEngine::createExecutionContext: Error Code 1: Myelin ([version.cpp:operator():80] Compiled assuming that device 0 was SM 70, but device 0 is SM 75
SM 70 is Volta (V100); SM 75 is Turing (2080 Ti).
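For reference, a tiny illustrative lookup (pure Python, names my own) from CUDA compute capability ("SM" level) to the architectures in play here:

```python
# Illustrative only: compute-capability-to-architecture lookup for the
# generations discussed in this thread. A TensorRT engine is built for
# one specific SM level, which is why a Volta engine refuses to run on Turing.
SM_ARCH = {
    70: "Volta (e.g. V100)",
    75: "Turing (e.g. RTX 2080 Ti)",
    80: "Ampere (e.g. A100)",
    86: "Ampere (e.g. RTX 3090)",
    89: "Ada Lovelace (e.g. RTX 4090)",
    90: "Hopper (e.g. H100)",
}

def arch_name(sm: int) -> str:
    """Map an SM level to a human-readable architecture name."""
    return SM_ARCH.get(sm, f"unknown SM {sm}")
```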
I use code like the following:
stream0 = core.std.SelectEvery(
    core.trt.Model(clip, engine_path="/root/realesr-general-wdn-x4v3_opset16_2080ti.engine",
                   num_streams=3, device_id=0),
    cycle=2, offsets=0)
stream1 = core.std.SelectEvery(
    core.trt.Model(clip, engine_path="/root/realesr-general-wdn-x4v3_opset16_V100.engine",
                   num_streams=3, device_id=1),
    cycle=2, offsets=1)
clip = core.std.Interleave([stream0, stream1])
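The pattern above splits frames round-robin across the two engines and then re-merges them in the original order; a pure-Python sketch of the same scheduling (no VapourSynth needed, helper names are my own):

```python
# Sketch of what SelectEvery(cycle=2, offsets=k) + Interleave do to the
# frame order: each GPU handles every second frame, and Interleave
# restores the original sequence.
def select_every(frames, cycle, offset):
    """Keep every `cycle`-th frame starting at `offset` (like std.SelectEvery)."""
    return frames[offset::cycle]

def interleave(streams):
    """Merge equal-length streams back in round-robin order (like std.Interleave)."""
    return [frame for group in zip(*streams) for frame in group]

frames = list(range(8))
stream0 = select_every(frames, 2, 0)  # GPU 0 gets frames 0, 2, 4, 6
stream1 = select_every(frames, 2, 1)  # GPU 1 gets frames 1, 3, 5, 7
assert interleave([stream0, stream1]) == frames  # original order restored
```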
MULTI_DEVICE_SAFE_MODE is a variable in TensorRT; if true, CUDA compatibility is checked every time it is called. Its default is false nowadays, but it was true in the past (or it didn't exist and the check ran every time), when it was no problem to call trt.Model with different SM architectures.
Have a look here: https://github.com/pytorch/TensorRT/blob/main/core/runtime/runtime.cpp
bool MULTI_DEVICE_SAFE_MODE = false;
So my question is: where do I set this for VapourSynth? I guess when building libvstrt.so, but how?
That's a very interesting error. Could you try setting the environment variable CUDA_VISIBLE_DEVICES to 1 before building the engine for the V100 using trtexec (in this case you should not need to set the device id on trtexec's command line), and then try the current code again?
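A sketch of that suggestion: expose only device 1 (the V100) to the build process, so the engine is compiled for SM 70 regardless of enumeration order. The trtexec line mirrors the command quoted later in the thread (options elided) and is left commented since it needs the GPU toolchain:

```shell
# Hide every GPU except device 1 (the V100) from this process; inside
# trtexec the V100 then appears as device 0 and the engine targets SM 70.
export CUDA_VISIBLE_DEVICES=1
# trtexec --fp16 --onnx=./realesr-general-wdn-x4v3_opset16.onnx \
#   --saveEngine=./realesr-general-wdn-x4v3_opset16_V100.engine
echo "visible devices: $CUDA_VISIBLE_DEVICES"
```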
I already tried that. Same error
Even when I swap them (V100 as stream0, 2080 Ti as stream1) it says
Compiled assuming that device 0 was SM 75, but device 0 is SM 70
I believe the flag you mentioned is internal to pytorch-tensorrt only, but let me check it carefully.
EDIT: yep I believe it is only a pytorch-tensorrt warning that has nothing to do with the trt library itself.
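For completeness: in Torch-TensorRT itself the flag is, to the best of my knowledge, exposed through the Python runtime API (`set_multi_device_safe_mode` and a `multi_device_safe_mode` context manager). A guarded sketch, import-wrapped since it is irrelevant to vs-trt and the package may not be installed:

```python
# Guarded sketch: applies to Torch-TensorRT only, not vs-trt. The import
# is wrapped so the snippet degrades gracefully where the package is absent.
safe_mode_enabled = False
try:
    import torch_tensorrt
    # Global switch: check CUDA device compatibility on every call.
    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
    safe_mode_enabled = True
    # Or scoped to a block:
    # with torch_tensorrt.runtime.multi_device_safe_mode(True):
    #     out = trt_module(inp)
except ImportError:
    pass  # torch_tensorrt not installed; nothing to configure
```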
Anyway, how did you produce the engines exactly? Streams have nothing to do with devices in this case.
trtexec --fp16 --onnx=./realesr-general-wdn-x4v3_opset16.onnx \
  --minShapes=input:1x3x574x720 --optShapes=input:1x3x574x720 --maxShapes=input:1x3x574x720 \
  --saveEngine=./realesr-general-wdn-x4v3_opset16_<card_name>.engine \
  --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT \
  --skipInference --useCudaGraph --noDataTransfers \
  --builderOptimizationLevel=5 --infStreams=2
Anyway, how did you produce the engines exactly? Streams have nothing to do with devices in this case.
Yes, that's true, but it shows me that the SM "level" is fixed by the first device being used, and it then errors out on the next device with a different SM "level".
Thanks.
For now I set up a new environment with TensorRT 8.6.1 and CUDA 11.8, and it works "again".
I don't know exactly in which version they set MULTI_DEVICE_SAFE_MODE to false by default, so I went back to versions I know work.
That's probably because the Myelin optimiser is not that aggressive in older TensorRT. Also Volta is not supported in TensorRT 10.5 and later.