✨[Feature] Does Torch-TensorRT plan to support runtime subgraph optimization like TFTRT?
Currently, Torch-TensorRT cannot optimize two-stage models such as Faster R-CNN, which first generate detection boxes and then predict a label for each box. The number of detection boxes is only known at runtime, but Torch-TensorRT needs shape information ahead of time (AOT) to optimize the classification subgraph.
Torch-TensorRT's current fallback mode cannot solve this either, because it also relies on AOT optimization.
So, does Torch-TensorRT plan to support runtime subgraph optimization like TFTRT? That is:
- Extract Torch-TensorRT-supported subgraphs at the AOT stage;
- Create TensorRT engine plans at runtime, once the runtime shape information is available;
- To improve efficiency, cache engines keyed by shape information;
- Support fallback execution as well.
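The shape-keyed engine cache from the list above could be sketched as follows. This is a hypothetical illustration, not Torch-TensorRT code: `ShapeKeyedEngineCache` and `fake_compile` are made-up names, and `fake_compile` stands in for building a TensorRT engine from an AOT-extracted subgraph once a concrete shape is seen.

```python
# Hypothetical sketch: cache compiled engines keyed by runtime input shape.
from typing import Callable, Dict, Tuple


class ShapeKeyedEngineCache:
    """Map concrete input shapes to compiled engines, building lazily."""

    def __init__(self, compile_fn: Callable[[Tuple[int, ...]], Callable]):
        self._compile_fn = compile_fn
        self._engines: Dict[Tuple[int, ...], Callable] = {}

    def get(self, shape: Tuple[int, ...]) -> Callable:
        # Compile once per unseen shape, then reuse the cached engine.
        if shape not in self._engines:
            self._engines[shape] = self._compile_fn(shape)
        return self._engines[shape]


# Stub "compiler": returns a function that reports the shape it was built for.
def fake_compile(shape):
    return lambda: f"engine for {shape}"


cache = ShapeKeyedEngineCache(fake_compile)
e1 = cache.get((8, 3, 224, 224))
e2 = cache.get((8, 3, 224, 224))  # cache hit: same engine object is reused
assert e1 is e2
```

In a real implementation the key would likely cover all dynamic input shapes (and possibly dtypes) of the subgraph, not a single tuple.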
Additional info:
- TorchScript provides custom FusionGroup utilities in torch/csrc/jit/passes/graph_fuser.cpp, so subgraph extraction may not be too hard;
- Fallback execution could be implemented by referring to prim::FusionGroup.
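The fallback idea above amounts to: run the optimized engine when one exists for the inputs, and otherwise execute the original graph (analogous to falling back to the TorchScript interpreter). A minimal sketch, with `run_with_fallback` as a hypothetical helper name:

```python
# Hypothetical sketch of fallback execution: prefer the compiled engine,
# fall back to the original callable if no engine exists or it fails.
def run_with_fallback(engine, original_fn, *args):
    if engine is not None:
        try:
            return engine(*args)
        except RuntimeError:
            pass  # e.g. shape unsupported by the engine at runtime
    return original_fn(*args)


# No engine available: the original function runs.
assert run_with_fallback(None, lambda x: x + 1, 41) == 42
```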
@borisfom Is this related to that multi-engine usecase you were talking about?
This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days
This is a constraint around data-dependent shapes (DDS), currently slated for v1.4 at the end of the year.
DDS without fallback is supported in Torch-TRT v1.3. Please try v1.3 if your model is supported end to end. If not, DDS with fallback is tentatively planned for v1.4 in Q1 '23.