Paul Hendricks
Results: 19 issues of Paul Hendricks
### System Info
- CPU architecture: x86_64
- Host Memory: 1TB
- GPU: NVIDIA A100 80GB x8
- TensorRT-LLM version: v0.18.2
- Triton container: `nvcr.io/nvidia/tritonserver:25.04-trtllm-python-py3`
- CUDA:...
bug
bug: changing top_logprobs from u32 to u8 to be consistent with chat