
add runtime_platform_compatibility flag

Open lanluo-nvidia opened this issue 1 year ago • 1 comments

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • [ ] My code follows the style guidelines of this project (You can use the linters)
  • [ ] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have added tests to verify my fix or my feature
  • [ ] New and existing unit tests pass locally with my changes
  • [ ] I have added the relevant labels to my PR so that the relevant reviewers are notified

lanluo-nvidia avatar Aug 15 '24 21:08 lanluo-nvidia

When I tested it locally, I got the following error at https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py#L150:

torch_tensorrt [TensorRT Conversion Context]:logging.py:24 IRuntime::deserializeCudaEngine: Error Code 1: Serialization (Serialization assertion plan->header.pad == expectedPlatformTag failed.Platform specific tag mismatch detected. TensorRT plan files are only supported on the target runtime platform they were created on.)
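The error happens because a TensorRT plan records the platform it was built on, and the runtime refuses to deserialize a plan built for a different platform. A toy sketch of that check, assuming a made-up header layout (`PLATFORM_TAGS`, `serialize_plan`, and `deserialize_plan` are illustrative, not TensorRT's actual format or API):

```python
import struct

# Hypothetical platform tags; a real TensorRT plan embeds an opaque,
# version-specific tag in its serialized header.
PLATFORM_TAGS = {"linux_x86_64": 1, "windows_x86_64": 2}

def serialize_plan(engine_bytes: bytes, build_platform: str) -> bytes:
    # Prefix the payload with the build platform's tag.
    return struct.pack("<I", PLATFORM_TAGS[build_platform]) + engine_bytes

def deserialize_plan(blob: bytes, runtime_platform: str) -> bytes:
    # Reject a plan whose embedded tag does not match the running platform,
    # mirroring TensorRT's "Platform specific tag mismatch detected" error.
    (tag,) = struct.unpack_from("<I", blob)
    if tag != PLATFORM_TAGS[runtime_platform]:
        raise RuntimeError(
            "Platform specific tag mismatch detected. TensorRT plan files are "
            "only supported on the target runtime platform they were created on."
        )
    return blob[4:]
```

So a plan serialized for one platform deserializes fine there, but raises on the other, which is exactly what the log above shows.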

lanluo-nvidia avatar Aug 15 '24 21:08 lanluo-nvidia

For this, make sure to add the cross_compile flag as one of the engine-invariant settings so that the cache doesn't try to pull these engines by accident.
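A minimal sketch of the suggestion above, assuming a hypothetical `engine_cache_key` helper (the name and the `enable_cross_compile_for_windows` key are illustrative): folding the cross-compile flag into the cache key guarantees that builds which differ only in that flag land in different cache entries, so a cross-compiled engine is never pulled for a native run.

```python
import hashlib
import json

def engine_cache_key(graph_hash: str, settings: dict) -> str:
    # Treat the cross-compile target as an engine-invariant setting: two
    # builds that differ only in this flag must hash to different entries.
    invariants = {
        "graph": graph_hash,
        "enable_cross_compile_for_windows": settings.get(
            "enable_cross_compile_for_windows", False
        ),
    }
    return hashlib.sha256(
        json.dumps(invariants, sort_keys=True).encode()
    ).hexdigest()
```

With this, a lookup made without the flag cannot collide with an entry stored with it.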

narendasan avatar Sep 11 '24 17:09 narendasan

  • torch_tensorrt.dynamo.cross_compile(model, inputs, *args, **kwargs) -> saves the EP (exported program) to disk by calling torch_tensorrt.save()
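A rough sketch of the flow the proposed entry point could take. The compile step is stubbed out (`_compile_for_target` and `cross_compile_save` are hypothetical names; the real code would go through torch_tensorrt.dynamo.compile with the cross-compile flag set and persist via torch_tensorrt.save()):

```python
import os
import pickle
import tempfile

def _compile_for_target(model, inputs, **settings):
    # Stub standing in for the real compile path; the actual implementation
    # would lower the model and return an ExportedProgram.
    return {"model": model, "inputs": inputs, "settings": settings}

def cross_compile_save(model, inputs, file_path, **kwargs):
    # Proposed flow: cross-compile for the target platform, then write the
    # exported program to disk in one call.
    ep = _compile_for_target(
        model, inputs, enable_cross_compile_for_windows=True, **kwargs
    )
    with open(file_path, "wb") as f:
        pickle.dump(ep, f)
    return file_path
```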

peri044 avatar Oct 04 '24 01:10 peri044

Options:

  1. setup_engine -> execute_engine (the issue here is that this type is not supported in Torch custom ops; for C++ ops it works, so a POC with custom C++ ops should unblock this workflow)
  2. TRTEngine (check with the cross_compatibility flag)
  3. execute_serialized_engine() (downside: the engine has to be deserialized on every inference)
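The trade-off between these options can be sketched in plain Python (the class and function names are illustrative stand-ins, not the real runtime API): options 1 and 2 deserialize once during setup and reuse the live engine, while option 3 pays the deserialization cost on every call.

```python
import pickle

class TRTEngineSketch:
    """Toy stand-in for a deserialized engine (the real one wraps ICudaEngine)."""
    def __init__(self, weights):
        self.weights = weights

    def run(self, x):
        return x * self.weights

serialized = pickle.dumps(2)  # pretend this is a serialized TensorRT plan

# Options 1/2: set the engine up once, then execute many times.
engine = TRTEngineSketch(pickle.loads(serialized))
outputs = [engine.run(i) for i in range(3)]

# Option 3: deserialize the plan inside every call -- the downside noted above.
def execute_serialized_engine(blob, x):
    return TRTEngineSketch(pickle.loads(blob)).run(x)

outputs_per_call = [execute_serialized_engine(serialized, i) for i in range(3)]
```

Both paths produce the same results; the difference is only where the deserialization cost is paid, which is why option 3 is flagged as a downside for inference latency.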

peri044 avatar Oct 04 '24 01:10 peri044