**Description** I am experiencing an issue where the TensorRT `.engine` file is recompiled every time the prompt length changes when using the ONNX Runtime backend with...
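The usual way to avoid per-shape rebuilds is to compile the engine once with a dynamic-shape optimization profile that covers the whole prompt-length range. A minimal sketch with the TensorRT Python API, assuming an ONNX export with a dynamic input named `input_ids` (the file name, tensor name, and shape bounds are placeholders):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Hypothetical model path; replace with the actual ONNX export.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()

# One profile spanning all expected prompt lengths, so the engine
# is built once instead of being recompiled for each new shape.
profile = builder.create_optimization_profile()
profile.set_shape("input_ids",   # dynamic input name (assumed)
                  min=(1, 1),    # shortest prompt
                  opt=(1, 64),   # typical prompt
                  max=(1, 512))  # longest prompt
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

Any prompt length inside the `min`/`max` bounds then runs against the same serialized engine.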
**Is your feature request related to a problem? Please describe.** The ONNX Runtime backend in Triton Inference Server lacks direct support for `minShapes`, `optShapes`, and `maxShapes` in the model configuration...
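For context, recent ONNX Runtime releases expose the equivalent of trtexec's `--minShapes`/`--optShapes`/`--maxShapes` as TensorRT execution provider options; the feature request is essentially to surface these through Triton's model configuration. A sketch of the standalone ONNX Runtime usage, with a placeholder model path and tensor name:

```python
import onnxruntime as ort

# TensorRT EP options corresponding to min/opt/max shape profiles.
# Option names are from recent ONNX Runtime releases; shapes are
# illustrative.
trt_options = {
    "trt_profile_min_shapes": "input_ids:1x1",
    "trt_profile_opt_shapes": "input_ids:1x64",
    "trt_profile_max_shapes": "input_ids:1x512",
    "trt_engine_cache_enable": True,  # reuse the built engine across runs
}

session = ort.InferenceSession(
    "model.onnx",
    providers=[("TensorrtExecutionProvider", trt_options),
               "CUDAExecutionProvider"],
)
```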
## Description

I recently attempted to use INT8 quantization with Stable Diffusion XL to improve inference performance, based on the claims in a recent [TensorRT blog post](https://developer.nvidia.com/blog/tensorrt-accelerates-stable-diffusion-nearly-2x-faster-with-8-bit-post-training-quantization/), which suggested...
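The blog's workflow relies on NVIDIA's post-training quantization tooling, which is not shown here; generically, the engine-side part of an INT8 build with the TensorRT Python API looks like the sketch below, assuming you supply a prebuilt `network`, `builder`, and an `IInt8EntropyCalibrator2` instance:

```python
import tensorrt as trt

def build_int8_engine(builder, network, calibrator):
    """Generic TensorRT INT8 build sketch. The calibrator supplies
    representative inputs for computing quantization scales."""
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.FP16)  # FP16 fallback for layers left unquantized
    config.int8_calibrator = calibrator
    return builder.build_serialized_network(network, config)
```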
### System Info

- GPU name: NVIDIA A100
- GPU memory size: 80 GB
- TensorRT-LLM branch: main
- TensorRT-LLM version: 0.11.0.dev2024052800
- OS: Ubuntu 22.04

### Who can help?

@kaiyux, @byshiue

###...