TensorRT
Flux model: engine_from_bytes(bytes_from_path(self.engine_path)) fails with OutOfMemory
I import engine_from_bytes with "from polygraphy.backend.trt import engine_from_bytes". When I run engine_from_bytes(bytes_from_path(self.engine_path)) for flux-dev on a single L40 GPU, it fails with OutOfMemory. How can I solve this?
Try using trtexec, with TensorRT version >= 8.6.
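One way to follow this suggestion is to load the already-built engine with trtexec, to see whether deserialization also runs out of memory outside the Python demo. The sketch below only wraps the trtexec command line; the helper names and the engine file name are hypothetical, and it assumes trtexec is installed and on PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def build_trtexec_command(engine_path: str) -> List[str]:
    """Return a trtexec command that loads an existing serialized engine."""
    # --loadEngine makes trtexec deserialize the given engine file
    # instead of building a new engine from a model.
    return ["trtexec", f"--loadEngine={engine_path}"]


def run_trtexec(engine_path: str) -> Optional[int]:
    """Run trtexec on the engine.

    Returns the trtexec exit code, or None if trtexec is not on PATH.
    """
    if shutil.which("trtexec") is None:
        return None
    return subprocess.run(build_trtexec_command(engine_path)).returncode
```

For example, run_trtexec("transformer.plan") (a placeholder file name) would invoke "trtexec --loadEngine=transformer.plan". A non-zero exit code suggests the engine does not load on this GPU even without the rest of the pipeline.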
Related issue: https://github.com/NVIDIA/TensorRT/issues/4205
@algorithmconquer the flux demo should now run on L40S as we have added memory optimizations in release/10.6. Can you please try again and update here?