How to enable MULTI_DEVICE_SAFE_MODE?
In vs-trt, how do I enable multi-device safe mode?
torch_tensorrt.runtime.multi_device_safe_mode
It was enabled by default in earlier versions, but since 10.2(?) it has been disabled by default, which makes it impossible to use my Volta and Turing cards together.
vs-trt does not rely on PyTorch at all, so I don't know what you mean by enabling it. You might be confusing it with HolyWu's plugins?
Speaking of multi-device inference, you should be able to do that in vs-trt using static scheduling.
When using multiple GPUs from different generations, it errors out with
ICudaEngine::createExecutionContext: Error Code 1: Myelin ([version.cpp:operator():80] Compiled assuming that device 0 was SM 70, but device 0 is SM 75
SM 70 is Volta (V100); SM 75 is Turing (2080 Ti).
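For reference, a tiny illustrative lookup (pure Python, names my own) from CUDA compute capability ("SM" level) to the architectures in play here:

```python
# Illustrative only: compute-capability-to-architecture lookup for the
# generations discussed in this thread. A TensorRT engine is built for
# one specific SM level, which is why a Volta engine refuses to run on Turing.
SM_ARCH = {
    70: "Volta (e.g. V100)",
    75: "Turing (e.g. RTX 2080 Ti)",
    80: "Ampere (e.g. A100)",
    86: "Ampere (e.g. RTX 3090)",
    89: "Ada Lovelace (e.g. RTX 4090)",
    90: "Hopper (e.g. H100)",
}

def arch_name(sm: int) -> str:
    """Map an SM level to a human-readable architecture name."""
    return SM_ARCH.get(sm, f"unknown SM {sm}")
```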
I use code like the following:
stream0 = core.std.SelectEvery(
    core.trt.Model(clip, engine_path="/root/realesr-general-wdn-x4v3_opset16_2080ti.engine",
                   num_streams=3, device_id=0),
    cycle=2, offsets=0)
stream1 = core.std.SelectEvery(
    core.trt.Model(clip, engine_path="/root/realesr-general-wdn-x4v3_opset16_V100.engine",
                   num_streams=3, device_id=1),
    cycle=2, offsets=1)
clip = core.std.Interleave([stream0, stream1])
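The pattern above splits frames round-robin across the two engines and then re-merges them in the original order; a pure-Python sketch of the same scheduling (no VapourSynth needed, helper names are my own):

```python
# Sketch of what SelectEvery(cycle=2, offsets=k) + Interleave do to the
# frame order: each GPU handles every second frame, and Interleave
# restores the original sequence.
def select_every(frames, cycle, offset):
    """Keep every `cycle`-th frame starting at `offset` (like std.SelectEvery)."""
    return frames[offset::cycle]

def interleave(streams):
    """Merge equal-length streams back in round-robin order (like std.Interleave)."""
    return [frame for group in zip(*streams) for frame in group]

frames = list(range(8))
stream0 = select_every(frames, 2, 0)  # GPU 0 gets frames 0, 2, 4, 6
stream1 = select_every(frames, 2, 1)  # GPU 1 gets frames 1, 3, 5, 7
assert interleave([stream0, stream1]) == frames  # original order restored
```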
MULTI_DEVICE_SAFE_MODE is a variable in TensorRT; if true, CUDA compatibility is checked every time it is called. Its default is false nowadays, but it was true in the past (or it didn't exist and the check ran every time), when it was no problem to call trt.Model with different SM architectures.
Have a look here: https://github.com/pytorch/TensorRT/blob/main/core/runtime/runtime.cpp
bool MULTI_DEVICE_SAFE_MODE = false;
So my question is: where do I set this for VapourSynth? I guess when building libvstrt.so, but how?
That's a very interesting error. Could you try setting the environment variable CUDA_VISIBLE_DEVICES to 1 before building the engine for the V100 using trtexec (in this case you should not need to set the device id on trtexec's command line), and then try the current code again?
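A sketch of that suggestion: expose only device 1 (the V100) to the build process, so the engine is compiled for SM 70 regardless of enumeration order. The trtexec line mirrors the command quoted later in the thread (options elided) and is left commented since it needs the GPU toolchain:

```shell
# Hide every GPU except device 1 (the V100) from this process; inside
# trtexec the V100 then appears as device 0 and the engine targets SM 70.
export CUDA_VISIBLE_DEVICES=1
# trtexec --fp16 --onnx=./realesr-general-wdn-x4v3_opset16.onnx \
#   --saveEngine=./realesr-general-wdn-x4v3_opset16_V100.engine
echo "visible devices: $CUDA_VISIBLE_DEVICES"
```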
I already tried that. Same error
Even when I swap them (V100 as stream0, 2080 Ti as stream1) it says
Compiled assuming that device 0 was SM 75, but device 0 is SM 70
I believe the flag you mentioned is internal to pytorch-tensorrt only, but let me check it carefully.
EDIT: yep I believe it is only a pytorch-tensorrt warning that has nothing to do with the trt library itself.
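For completeness: in Torch-TensorRT itself the flag is, to the best of my knowledge, exposed through the Python runtime API (`set_multi_device_safe_mode` and a `multi_device_safe_mode` context manager). A guarded sketch, import-wrapped since it is irrelevant to vs-trt and the package may not be installed:

```python
# Guarded sketch: applies to Torch-TensorRT only, not vs-trt. The import
# is wrapped so the snippet degrades gracefully where the package is absent.
safe_mode_enabled = False
try:
    import torch_tensorrt
    # Global switch: check CUDA device compatibility on every call.
    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
    safe_mode_enabled = True
    # Or scoped to a block:
    # with torch_tensorrt.runtime.multi_device_safe_mode(True):
    #     out = trt_module(inp)
except ImportError:
    pass  # torch_tensorrt not installed; nothing to configure
```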
Anyway, how did you produce the engines exactly? Streams have nothing to do with devices in this case.
trtexec --fp16 --onnx=./realesr-general-wdn-x4v3_opset16.onnx \
  --minShapes=input:1x3x574x720 --optShapes=input:1x3x574x720 --maxShapes=input:1x3x574x720 \
  --saveEngine=./realesr-general-wdn-x4v3_opset16_<card_name>.engine \
  --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT \
  --skipInference --useCudaGraph --noDataTransfers \
  --builderOptimizationLevel=5 --infStreams=2
Anyway, how did you produce the engines exactly? Streams have nothing to do with devices in this case.
Yes, that's true, but it shows me that the SM "level" is fixed by the first device being used, and it then errors out on the next device with a different SM "level".
Thanks.
For now I set up a new environment with TensorRT 8.6.1 and CUDA 11.8, and it works "again".
I don't know exactly in which version they set MULTI_DEVICE_SAFE_MODE to false by default, so I went back to versions I know work.
That's probably because the Myelin optimiser is not that aggressive in older TensorRT. Also Volta is not supported in TensorRT 10.5 and later.